grafana / pyroscope

Continuous Profiling Platform. Debug performance issues down to a single line of code
https://grafana.com/oss/pyroscope/
GNU Affero General Public License v3.0

Memory allocation profiling in .NET #3295

Closed livegrenier closed 1 month ago

livegrenier commented 1 month ago

Hello,

I have been evaluating Pyroscope, and as part of my POC I have been using the rideshare .NET example in this repo: https://github.com/grafana/pyroscope/tree/main/examples/language-sdk-instrumentation/dotnet/rideshare

Is it expected that I only get stats on CPU?

Screenshot 2024-05-13-15-49-25

I would like to see some memory stats to help identify a memory leak. I was under the impression that this was supported for .NET, based on the documentation on the Grafana site: https://grafana.com/docs/pyroscope/latest/configure-client/language-sdks/dotnet/

Not sure if this is a bug or if I'm doing something wrong, please let me know.

Thanks.

bryanhuhta commented 1 month ago

👋 Can you share more about the issue you're facing? For example:

Without additional information, I would suggest making sure you've followed the steps outlined in the "Getting Started" guide. Additionally, make sure you have this environment variable set to enable memory profiling:

PYROSCOPE_PROFILING_ALLOCATION_ENABLED=true 

Here is a more complete list of the environment variable settings available to you.
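
To confirm that the variable actually reaches the process (for example, inside the container), one option is to read it back at startup. This is a minimal sketch: `ProfilerEnvCheck` and `IsEnabled` are hypothetical names, and treating `true` or `1` as "enabled" is an assumption, not the profiler's documented parsing.

```csharp
#nullable enable
using System;

public static class ProfilerEnvCheck
{
    // Variable name taken from the comment above; the rest is illustrative.
    public const string AllocationFlag = "PYROSCOPE_PROFILING_ALLOCATION_ENABLED";

    // Assumption: "true" (any casing) or "1" means enabled.
    // Surrounding whitespace is trimmed before comparing.
    public static bool IsEnabled(string? value)
    {
        if (value is null)
        {
            return false;
        }

        string trimmed = value.Trim();
        return trimmed.Equals("true", StringComparison.OrdinalIgnoreCase) || trimmed == "1";
    }

    public static void Main()
    {
        // Read the value as the running process sees it.
        string? raw = Environment.GetEnvironmentVariable(AllocationFlag);
        Console.WriteLine($"{AllocationFlag}={(raw ?? "<unset>")} enabled={IsEnabled(raw)}");
    }
}
```

If this prints `<unset>`, the setting never made it into the process environment, so the problem is in the Dockerfile or deployment rather than in the profiler.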

livegrenier commented 1 month ago

Hi, thanks for taking the time to reply. For right now I'm only using the rideshare example; I wanted to show something to our dev team before they start making changes to our apps. I do believe the rideshare example has PYROSCOPE_PROFILING_ALLOCATION_ENABLED=true properly set in the Dockerfile, which is why I'm a bit confused about why it is not working.

bryanhuhta commented 1 month ago

Ah, gotcha. What will best help you is process_cpu:alloc_size. While confusingly grouped under the CPU statistics, it represents the number of bytes a function allocates. You can also use process_cpu:alloc_samples to catch functions that allocate many times without allocating much memory. Unfortunately, our rideshare example doesn't do a great job of illustrating memory allocations, so the resulting flame graph looks pretty barren. This is something we should address.
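
To make the difference between the two allocation shapes concrete, here is a minimal sketch. `AllocationShapes`, `ManySmall`, and `FewLarge` are hypothetical names, not part of the rideshare example. It does not model how the profiler samples allocations; it only contrasts "many small allocations" with "few large allocations" using the runtime's own per-thread byte counter.

```csharp
using System;
using System.Collections.Generic;

public static class AllocationShapes
{
    // Many tiny allocations: high allocation count, comparatively few bytes.
    public static List<byte[]> ManySmall(int count)
    {
        var kept = new List<byte[]>(count);
        for (int i = 0; i < count; i++)
        {
            kept.Add(new byte[16]);
        }
        return kept;
    }

    // A few big allocations: low allocation count, many bytes.
    public static List<byte[]> FewLarge(int count)
    {
        var kept = new List<byte[]>(count);
        for (int i = 0; i < count; i++)
        {
            kept.Add(new byte[8 * 1024 * 1024]);
        }
        return kept;
    }

    // Bytes allocated on the current thread while the action runs.
    public static long MeasureBytes(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        long small = MeasureBytes(() => ManySmall(100_000));
        long large = MeasureBytes(() => FewLarge(4));
        Console.WriteLine($"ManySmall: 100000 allocations, {small} bytes");
        Console.WriteLine($"FewLarge:  4 allocations, {large} bytes");
    }
}
```

A function shaped like `FewLarge` is the kind that should stand out by bytes allocated, while one shaped like `ManySmall` allocates far more often but accounts for far fewer bytes.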

The rideshare example should look something like this:

Screenshot 2024-05-13 at 4 46 24 PM

You can see the bulk of the memory allocations are string manipulations of one sort or another.

Some languages support two additional types of memory profiles:

However, the .NET profiler does not have that support yet. We have a matrix that details which profile types each language supports: https://grafana.com/docs/pyroscope/latest/view-and-analyze-profile-data/profiling-types/#available-profiling-types.

bryanhuhta commented 1 month ago

You're likely evaluating just our OSS offering, but I will plug our Cloud offering which has a UI that gets more regular updates. Here we've made it a little clearer with the profile types in .NET.

Screenshot 2024-05-13 at 4 54 25 PM

Though, even here "alloc_size" should be under the "memory" grouping.

livegrenier commented 1 month ago

Oh great, thanks for explaining this to me. Would you happen to know whether alloc_size would help identify the problem in the case of a memory leak?

bryanhuhta commented 1 month ago

It would absolutely be helpful in identifying most memory leaks. It's especially effective at finding memory leaks that result in OOM crashes. You would see the function that's allocating grow dramatically in width in the flame graph. To illustrate, I modified the rideshare OrderService class to allocate 1 MB every time FindNearestVehicle is called:

private readonly List<int[]> _myLeakyList = new List<int[]>();

public void FindNearestVehicle(long searchRadius, string vehicle)
{
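    // Intended as a 1 MB alloc; int[1 << 20] is 1,048,576 four-byte ints, about 4 MB.
    // The list keeps every array reachable, so none of them can be collected.
    this._myLeakyList.Add(new int[1 << 20]);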

    lock (_lock)
    {
        var labels = Pyroscope.LabelSet.Empty.BuildUpon()
            .Add("vehicle", vehicle)
            .Build();
        Pyroscope.LabelsWrapper.Do(labels, () =>
        {
            // Busy loop that simulates CPU-bound work.
            for (long i = 0; i < searchRadius * 1000000000; i++)
            {
            }

            if (vehicle.Equals("car"))
            {
                CheckDriverAvailability(labels, searchRadius);
            }
        });
    }
}

And here is the corresponding flame graph for alloc_size. When comparing this against the unmodified alloc_size flame graph I showed before, you can see a significant growth in the width of the FindNearestVehicle calls.

Screenshot 2024-05-13 at 5 19 21 PM

Now, if you have a more insidious leak where the memory being leaked isn't an ever-increasing amount but rather a critical resource that leaks once, this can be more difficult to view in a flame graph. In most applications, this is rarely the case, though.

Edit: I'm technically allocating 4 MB at a time (1 << 20 four-byte ints), but you get the idea 😄
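
For reference, the per-call size of `new int[1 << 20]` can be worked out directly. This is a minimal sketch: `LeakMath` and `PayloadBytes` are hypothetical names, and the array's small fixed object header is ignored.

```csharp
using System;

public static class LeakMath
{
    // Element payload of an array, ignoring the fixed object header.
    public static long PayloadBytes(int elementCount, int bytesPerElement) =>
        (long)elementCount * bytesPerElement;

    public static void Main()
    {
        // 1 << 20 = 1,048,576 elements; sizeof(int) is 4 in C#.
        long perCall = PayloadBytes(1 << 20, sizeof(int));
        Console.WriteLine($"{perCall} bytes per call");       // 4194304 bytes per call
        Console.WriteLine($"{perCall / (1024 * 1024)} MiB");  // 4 MiB
    }
}
```

At roughly 4 MiB retained per call, 256 calls already account for 1 GiB of heap, which is why the frame widens so quickly in the flame graph.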

livegrenier commented 1 month ago

Great, thanks again for all the help, extremely appreciated