
POC: export profile metrics at compaction time #3718

Open alsoba13 opened 3 days ago

alsoba13 commented 3 days ago

Prerequisites

Exporting profile metrics at compaction time

This PoC shows how metrics could be exported from profiles at compaction time (in fact, we do this right after compaction, not strictly at compaction time).

Compaction eventually happens to every block in our object storage. This approach offers some benefits over exporting at ingestion time, as described by the Tempo team:

In theory, the first level of compaction (L0 blocks to L1 blocks) happens shortly after data ingestion (~10s). But in practice, I've observed that L0 compaction happens every 30-120s. I don't know the reason for such a delay (maybe data ingestion is low and compaction happens less often? I only ingest data for 1 tenant with 2 services, approximately every 15s).

Generated metrics

Now that we have a prototype running, we can get a picture of what the generated metrics look like.

Dimensions

Every profile type or dimension is exported as a metric with this format:

pyroscope_exported_metrics_<profile_type>{...}

So, for example, if a service writes profile data with 3 different __profile_type__ values, we will export 3 different metrics.
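For illustration, here is a minimal Go sketch of the name mapping this assumes: the profile type `process_cpu:cpu:nanoseconds:cpu:nanoseconds` turns into the metric name used in the query example below. The helper is hypothetical, not Pyroscope's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// exportedMetricName maps a Pyroscope profile type ID to a Prometheus-safe
// metric name by replacing characters that are invalid in metric names.
// Hypothetical helper, shown only to explain the naming used below.
func exportedMetricName(profileType string) string {
	sanitized := strings.NewReplacer(":", "_", "-", "_", ".", "_").Replace(profileType)
	return "pyroscope_exported_metrics_" + sanitized
}

func main() {
	// Prints: pyroscope_exported_metrics_process_cpu_cpu_nanoseconds_cpu_nanoseconds
	fmt.Println(exportedMetricName("process_cpu:cpu:nanoseconds:cpu:nanoseconds"))
}
```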

Labels are preserved, fanning out a new series for each label set. So we can query CPU for a specific pod of a service, together with some other pprof label, like this:

pyroscope_exported_metrics_process_cpu_cpu_nanoseconds_cpu_nanoseconds{service_name="my-service", pod="my-pod", my_pprof_label="some-value"}

Dimension metrics are exported for every tenant and every service_name, but this should be configurable by the user.
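As a rough sketch of what such user-facing configuration could look like, here is one hypothetical shape in Go (none of these types or YAML keys exist in Pyroscope today):

```go
package config

// MetricsExportConfig is a hypothetical per-tenant knob for dimension
// metrics export; Pyroscope has no such config today.
type MetricsExportConfig struct {
	Enabled bool `yaml:"enabled"`
	// ServiceNames restricts export to the listed services. An empty list
	// means "export every service_name", which is the current prototype
	// behaviour.
	ServiceNames []string `yaml:"service_names"`
}

// TenantOverrides maps a tenant ID to its export configuration, so each
// tenant can opt in independently.
type TenantOverrides map[string]MetricsExportConfig
```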

Functions

This prototype also explores the ability to export metrics for specific functions. We can choose an interesting function to export.

For now, it exports data for every dimension of the given function in this format:

pyroscope_exported_metrics_functions_<profile_type>{function="exported-function", ...}
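A minimal sketch of what "exporting a function" could mean, assuming the github.com/google/pprof/profile model rather than Pyroscope's internal representation: sum the values of every sample whose stack contains the target function.

```go
package export

import "github.com/google/pprof/profile"

// functionTotal sums the i-th sample value across all samples whose stack
// contains the named function, e.g. a garbage-collector or HTTP handler
// function. This is the aggregate a function metric would report.
func functionTotal(p *profile.Profile, funcName string, valueIdx int) int64 {
	var total int64
	for _, s := range p.Sample {
		if stackContains(s, funcName) {
			total += s.Value[valueIdx]
		}
	}
	return total
}

// stackContains reports whether any frame in the sample's stack belongs to
// the named function.
func stackContains(s *profile.Sample, funcName string) bool {
	for _, loc := range s.Location {
		for _, line := range loc.Line {
			if line.Function != nil && line.Function.Name == funcName {
				return true
			}
		}
	}
	return false
}
```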

In this prototype I've hardcoded garbage collector and HTTP functions to export, for every service_name. I haven't made any distinction by tenant yet. The functions to export should come from configuration (a UI is a must here).

In the future, we could specify a filter of LabelSets instead of exporting by service_name. For example, "foo": "{}" would export every profile containing the function foo, while "foo": "{service_name=\"my-service\", vehicle=\"bike\"}" would export it only for that service_name and vehicle.
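The matching logic for such a filter could be as simple as the following hypothetical sketch, where an empty filter matches every profile:

```go
package export

// labelFilterMatches reports whether a profile's labels satisfy a filter
// like {service_name="my-service", vehicle="bike"}. An empty filter ("{}")
// matches every profile, mirroring the "export everything" case above.
// Hypothetical sketch; names are not from the Pyroscope codebase.
func labelFilterMatches(filter, profileLabels map[string]string) bool {
	for k, want := range filter {
		if got, ok := profileLabels[k]; !ok || got != want {
			return false
		}
	}
	return true
}
```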

Detected challenges

This naive solution is full of trade-offs and assumptions, and it's far from a final solution. I've detected some challenges.

DEMO

I have Pyroscope with these changes running on my machine, exporting metrics to my Grafana Cloud instance.

Go grant yourself privileges in the admin page:

kolesnikovae commented 3 days ago

Regarding the compaction lag: it's totally possible that L0 compaction is delayed because we only compact data once we've accumulated enough blocks. However, in the PR you mentioned, we added a parameter that controls how long a block may be staged.

We also introduced an indicator: time to compaction. In our dev env, L0 compaction lag does not exceed 1m, and the p99 is around 15-20 seconds.

[screenshot: time-to-compaction indicator in our dev environment]

I think that relying on the "current" time might be dangerous. We could explore an option where we get the timestamp from the blocks (the time they were created). Also, I think that OOO (out-of-order) ingestion is almost inevitable: jobs might run concurrently, and their order is not guaranteed (we don't need it for compaction); usually this is not an issue, but if a job fails and we retry (which we do), we will likely violate the order. Fortunately, both Mimir and Prometheus handle OOO ingestion (with some caveats).
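As a sketch of that suggestion, the exporter could derive a timestamp from the block's own time range instead of time.Now(); the BlockMeta shape below is an assumption for illustration, not Pyroscope's actual type.

```go
package export

import "time"

// BlockMeta is an assumed stand-in for the compacted block's metadata;
// the real Pyroscope type differs, but it does track the block time range.
type BlockMeta struct {
	MinTime time.Time // earliest sample in the block
	MaxTime time.Time // latest sample in the block
}

// sampleTimestamp anchors the exported series point to the block's own
// time range (here, its midpoint) instead of the wall clock. This makes
// retried or reordered compaction jobs reproducible, but the remote-write
// target (e.g. Mimir, or Prometheus with out-of-order ingestion enabled)
// must then accept out-of-order samples.
func sampleTimestamp(meta BlockMeta) time.Time {
	return meta.MinTime.Add(meta.MaxTime.Sub(meta.MinTime) / 2)
}
```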