MovieStoreGuy commented 3 years ago

Profiling events

There is a shifting concept that performance monitoring and application monitoring (the idea of tracking the time spent in functions and or methods, vs how long it takes to serve a request) are near identical and come under the realm of Observability (understanding how your service is performing).

How is this different from tracing

Conventional tracing looks at showing the user's request flow through the application to show time spent in different operations. However, this can miss any background operations that indirectly impact the user request flow.

ie. If I take a rate limiting service that has a background sync to share state among other nodes:

func ShouldRateLimit(next http.Handler) http.Handler {
   return http.HandlerFunc(w http.ResponseWriter, r *http.Request) {
         span, ctx := otel.SpanFromContext(r.Context())
         defer span.Finish()
         key, err := ratelimit.GetKey(r)

        if limits.Key(key).Exceed() {
             // return 429 status code
        }
        next.ServeHTTP(w,r)
   })
}

func (l *limits) SyncLimits() {
    l.cache.RLock()
    defer l.cache.RUnlock()
    for _, limit := limits.cache {
          // publish data to each node or distributed cache
          // Update internal values with shared updates
    }
}

In the above example, I can clearly see how the function ShouldRateLimit impacts the requests processing time considering the context used as part of the request can be used to link spans together but there is a hidden cost here with SyncLimits that currently can not be exposed due to the fact it runs independently from in bound requests and thus can not / should not share the same context.

Now, the SyncLimits function could implement metrics to help expose runtime performance issues but could be problematic due to:

As a developer, I need to know what to start observing in order to diagnose
The problem may disappear due to the nature of the issue (race conditions, Heisenbug)
Measure performance of my function comparatively to my entire application
Can not easily measure deadlocks / livelocks without elaborate code orchestration

Suggestion

At least within the golang community, https://github.com/google/pprof has been the leading tool in order to facilitate these kinds of questions while also offering first part support within Go. Moreover, AWS also have their own solution https://aws.amazon.com/codeguru/ that offers something similar for JVM based applications.

Desired outcomes of data:

Show cumulative runtime of functions (could also derive percentage from this data)
Map resource usage (CPU, Memory, and I/O) to internal methods / functions

Desired outcomes of orchestration:
Low friction with adding profiling support (As an example, pprof adds a single handler to perform software based profiling)
Should not require major modifications of existing code to work (should not require adding functions that would complicate existing logic)

I understand that software based profiling is not 100% accurate as per the write up here https://go.googlesource.com/proposal/+/refs/changes/08/219508/2/design/36821-perf-counter-pprof.md however, this could give an amazing insight into hidden application performance that could help increase reliability, performance and discover resource issues that were hard to discover with the existing events being emitted.

jkwatson commented 3 years ago

FYI, JFR is probably the top JVM profiling tool, as it's built-in to the JVM these days.

MovieStoreGuy commented 3 years ago

That is awesome to know @jkwatson :D I don't often work with JVM based languages but I will 100% have a look :D

iNikem commented 3 years ago

@jkwatson @MovieStoreGuy The top profiling tool for JVM is async-profiler :)

jkwatson commented 3 years ago

@jkwatson @MovieStoreGuy The top profiling tool for JVM is async-profiler :)

The docs on that are seriously out of date...they still reference JFR as a commercial product. I guess that's true if you're profiling java 7, but I don't wish that on anyone.

iNikem commented 3 years ago

What docs are out of date?

jkwatson commented 3 years ago

well, now I can't find the ones I was just looking at, so /shrug. Also, this probably isn't the place to argue about specific profiling tools. :)

MovieStoreGuy commented 3 years ago

I agree with @jkwatson, I appreciate bring JVM tools to my intention, that is not the focus of this proposal :)

rakyll commented 3 years ago

We're interested in being able to collect CPU, memory, contention and other profiles with OpenTelemetry and have representations of profiles in OTLP and support in the collector. We are currently also looking into existing data model alternatives such as pprof as an option given its wide use in open source and language support.

We want to enable cases where we can use OpenTelemetry attributes to label profiles as well. pprof has support for labelling (an example can be seen at https://rakyll.org/profiler-labels/).

As of today, it's very difficult for our users to enable profiling at a later time, especially in production. They need to add CodeGuru Profiler libraries, rebuild and redeploy. As more and more of them are linking OpenTelemetry for other telemetry collection, we want to enable cases where we can enable profile collection dynamically in runtime. This use case will require the OpenTelemetry client libraries to speak to the collector (or another control plane) to enable/disable collection.

thegreystone commented 3 years ago

Not sure if this will help, but I thought I'd chip in with what we're doing at Datadog. For the continuous profiler (which is integrating with our tracer), we're using our own profiling libraries for most platforms, and our own agent using JFR on the JVM. For the JVM we've added our own profiling events for various different kinds of profiling (e.g. rate limited exception profiling). For non-JVM languages we're partly using pprof as the serialization format (some data doesn't fit well into the model, so it's currently an archive with multiple files in it). For the JVM we're using JFR for the serialization format.

There are a few interesting initiatives for JFR in recent and upcoming versions - such as a new allocation profiler in JDK 16, and much faster stack trace capturing (I believe JDK 17). We (Datadog), are also considering contributing an all new, full process, proper CPU profiler and some neat new capabilities allowing you to, for example, easily implement your own dynamic wall clock profiler.

MovieStoreGuy commented 3 years ago

It has been sometime since I have opened this, but I'd like to know how I could speed up anything that is required to make this part of the default otel offering.

tedsuo commented 3 years ago

Hi @MovieStoreGuy. We're pretty heads down getting metrics and logs completed, as well as expanding and improving library instrumentation. There probably will not be a lot of bandwidth from the current community until these components are stable; apologies in advance, it will probably be slow going. However, profiling is definitely top priority after metrics and logs!

If you, @thegreystone, and others are interested in contributing work towards this project, I would suggest the following steps, which any new signal would need to take:

1) Create a prototype in (ideally) two or three languages. 2) Write an OTEP with the proposed specification, based on those prototypes.

If there is a group willing to put in the time to prototype, we can help by creating a OTel SIG for this work (a repo plus a slack channel for discussion). But again, I'm concerned that the spec reviewers and language maintainers are fully committed, so there may not be a lot of bandwidth for review or assistance until we clear the deck. I hate saying "next year" but six months to complete metrics and the remaining current initiatives is probably realistic. If there are well thought out proposals and prototypes by then, it would definitely give this project a speed boost. :)

aalexand commented 3 years ago

We (owners of https://github.com/google/pprof repo) would be curious what it would take to standardize on the profile.proto as the wire format for profiling data in OTel.

@thegreystone RE "some data doesn't fit well into the model, so it's currently an archive with multiple files in it" - do you mind elaborating on that?

alolita commented 3 years ago

Is pprof being evaluated? It would be great to have a formal issue in the community repo. Ty!

MovieStoreGuy commented 3 years ago

Hey @alolita ,

Which community repo are you referring to?

ymotongpoo commented 3 years ago

@alolita do you mean this repository? https://github.com/open-telemetry/community

If yes, could you point out which SIGs or teams to chime for this topic?

jsuereth commented 3 years ago

Here's the donation process for contributing code.

If pprof itself will be contributed over time, then we need to do the long-form process (similar to other major technologies we've pulled in).
If the proposal (right now) is just the ability to send profiles in OTLP and correlate with other observability signals, perhaps just an OTEP is enough for the initial discussion.

No matter which process is in place, we should have a location where we collect documentation on:

Current state of the art for profiling (what technologies, outside of pprof, are used, across languages, etc.)
The need for profiling as an observability signal (you have some of this written down here)
What would be contributed to OpenTelemetry (AFAIK - this OTEP is just for the protocol)
Why OpenTelemetry is the best fit.

mhansen commented 2 years ago

Current state of the art for profiling (what technologies, outside of pprof, are used, across languages, etc.)

I think I can help with this. I've just researched the ecosystem of profilers, profile data formats, data format converters, and profile analysis UIs: https://www.markhansen.co.nz/profilerpedia/. I'm probably missing a few, but I think I've covered most of the main ones. I hope this can be a useful starting point for the standardisation process.

mhansen commented 2 years ago

FYI, I've now made a website for Profilerpedia (it's not just a Google Sheet any more): https://profilerpedia.markhansen.co.nz/, and the site renders directed graphs of profilers, their data formats, the transitive closure of data formats you can convert to, and UIs that can read those formats.

For example, the transitive set of profilers that are convertable to pprof (warning: huge graph, and some conversions are lossy): https://profilerpedia.markhansen.co.nz/formats/pprof/#converts-from-transitive

thomasdullien commented 2 years ago

For what it's worth: We've been running prodfiler's continuous profiling service for the last 15 months, and have collected extensive experience with the various footguns involved in collecting profiling data & how to make use of it. Would be more than happy to help share what we've learnt and what to watch out for or otherwise assist in the design process.

A few things to keep in mind:

1) Issues when pre-aggregating the data too much 2) Data volume / data efficiency

On (1): For a good user experience, it is often necessary for users to drill-down into fine-grained profiling event data; which means filtering profiling events by things like container, thread, and timeframes. This ends up creating problems when the data is pre-aggregated too early at too coarse granularity. The ideal format for the recipient is actual individual sampling events. This ideal format then needs to be balanced with other requirements.

(2) It's important to be careful about data volume. Given that the ideal format sends individual samples, and given that one wants to sample at anywhere between 20Hz and 200Hz per core, we are looking at 20 2^6 to 200 2^6 events events per second in the worst case on a 64-core server. This means that sending out full stack traces for each event quickly comes prohibitive: A java method name can easily have 32-64 characters, and a deep java stack can be 128+ frames.

So if we look at: 2^7 frames

2^6 characters
2^6 cores times
2^5-2^6 samples per second we are looking at 2^25 bytes per second (32 mb?) just for profiling data.

We ended up solving this by not transmitting full stack traces, and just hashes of traces, which reduces the amount of data dramatically.

Happy to help & provide more input!

jpkrohling commented 2 years ago

Tagging @brancz, which should have an opinion or two about this.

petethepig commented 2 years ago

At Pyroscope we've been building an open source continuous profiling platform for over a year now. We integrate with many different profilers from various languages and other open source projects in our agents:

Go: pprof
ruby: rbspy
python: py-spy
Java: async-profiler
eBPF: profile.py from bcc-tools
php: phpspy
.NET: dotnet trace
Rust: pprof-rs

Since we've had to deal with supporting all these different formats of profiles in order to store them, we are also looking forward to an agreed upon standardized format for profiles — especially as more tooling gets created to analyze and interact with profiles.

For example, we recently created an otelpyroscope package to link traces to profiles. Thanks to label support in pprof, this was really easy to implement.

On the other hand, some agents report profiling data in a format that doesn't support "labels" which makes an integration like this impossible since labels are needed to link profiles to other types of telemetry data.

Another example of where standardization would be useful is that to support Java profiles from async-profiler we had to write a JFR Parser in Go so that we can ingest profiles from async-profiler. Again, if all profilers were using (or at least supported) one output format it would have made this much easier.

All that being said, every profiler on this list also has its own quirks and nuances in output formats which make supporting them all overly complicated compared to if they supported the same standardized format.

Happy to help provide our thoughts and experience as we've gone through supporting many profiling formats across languages and projects and would love to help contribute to this effort.

jhalliday commented 2 years ago

With the metrics effort hitting release candidate stage (yay!) we're hopefully approaching a period when reviewers have a bit more time available. However, that only matters if there is something to review... I have some time available to discuss ideas/requirements and start prototyping on profiling support, mainly with a JVM focus. Anyone else available for contributing seed work, perhaps for go or another language? If we hit critical mass then requesting a new SIG probably makes sense, if not I'll just work in my own space for now.

mtwo commented 2 years ago

This is perfect timing! We discussed the project roadmap during the in-person community meeting at Kubecon last week, and profiling support was the second most popular topic, after logging (which is already in-flight)! The process that you mentioned (contributing seed work, writing requirements, forming a SIG) is what we used for logging, and I think that it makes sense to follow that here as well.

Do people want to discuss this on a call sometime next week? Any objections to 8:00 AM PT on Friday, June 3rd?

Rperry2174 commented 2 years ago

@mtwo I shared the steps you mentioned the other day in a slack channel here with a bunch of profiling developers where several have expressed interest

Anyway would love to chat Friday!

mtwo commented 2 years ago

I've created a meeting in the OpenTelemetry calendar for 8:00 AM PT this Friday for us to meet!

ahaw023 commented 2 years ago

would be good to include eBFP tools like Pixie

brancz commented 2 years ago

Parca, Pixie and prodfiler are all eBPF based and participating in this.

Rperry2174 commented 2 years ago

Hi all, as many of you know there has a been a working group of many people in this thread meeting to come up with a collective vision for profiling. A PR has been submitted detailing that vision and we'd love to get more feedback on it!

Please check it out and comment if you have any feedback or if you are generally in agreement we'd love to get more approvals from various community members who have expressed interest in this (even if you are not part of the OTel org)!

https://github.com/open-telemetry/oteps/pull/212

gillg commented 1 year ago

Is there some experiments and alpha tests around this subject? I would bring an initiative I've found in DotNet ecosystem https://github.com/dotnet/diagnostics/issues/2948#issuecomment-1426930139 the idea is to use pprof API as kind of standard. There is also a very new project (last comment) Wich allows to directly use grafana phlare to store profiles.

Rperry2174 commented 1 year ago

hi @gillg we are actively doing tests around this subject right now. You can follow the progress of our most recent benchmarks here, but yes we are definitely planning on something close to pprof.

The majority of the discussion is happening in the #otel-profiles channel in the cncf slack. Would love to have you hop in and give your thoughts there!

brunobat commented 8 months ago

I guess this proposal: https://github.com/open-telemetry/community/issues/1918 might fix this issue.

ayewo commented 8 months ago

@brunobat Came here to post the same thing.

trask commented 5 months ago

Closed by #239

open-telemetry / oteps

Proposal: Adding profiling as a support event type #139

Profiling events

How is this different from tracing

Suggestion