dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.27k stars 4.73k forks source link

Managed Memory Leak on NET for ManagementClass #89450

Open wzm199309 opened 1 year ago

wzm199309 commented 1 year ago

Description

We are migrating our windows service project from net fx to NET6 and now we have observed a memory leak issue for Management Class which constantly contributed to memory leak (3Gb within 2 days), and issue can be reproduced in a vanilla NET project.

using (ManagementClass volumeClass = new ManagementClass([\\\\.\\Root\\Microsoft\\Windows\\Storage](file://./Root/Microsoft/Windows/Storage) + ":MSFT_Volume"))
                {
                    if (volumeClass != null)
                    {
                        using (ManagementObjectCollection volumes = volumeClass.GetInstances())
                        {
                            if (volumes != null)
                            {
                                foreach (ManagementObject volume in volumes)
                                {
                                     // volumeitems.Add(volume);
                                }
                            }
                        }
                    }
                }

Configuration

Regression?

Yes, issue can not be reproduced in WS2016 for net fx vanilla app.

Data

Analysis

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-management See info in area-owners.md if you want to be subscribed.

Issue Details
### Description We are migrating our windows service project from net fx to NET6 and now we have observed a memory leak issue for Management Class which constantly contributed to memory leak (3Gb within 2 days), and issue can be reproduced in a vanilla NET project. ``` using (ManagementClass volumeClass = new ManagementClass([\\\\.\\Root\\Microsoft\\Windows\\Storage](file://./Root/Microsoft/Windows/Storage) + ":MSFT_Volume")) { if (volumeClass != null) { using (ManagementObjectCollection volumes = volumeClass.GetInstances()) { if (volumes != null) { foreach (ManagementObject volume in volumes) { // volumeitems.Add(volume); } } } } } ``` ### Configuration * NET6 * Windows Server 2016 * X64 * If relevant, what are the specs of the machine? ### Regression? Yes, issue can not be reproduced in WS2016 for net fx vanilla app. ### Data ### Analysis
Author: wzm199309
Assignees: -
Labels: `area-System.Management`, `tenet-performance`, `untriaged`
Milestone: -
danmoseley commented 1 year ago

Can you use the Visual Studio memory profiler to see what is leaking and what is rooting it?

Clockwork-Muse commented 1 year ago

// volumeitems.Add(volume)

volumeitems isn't in your code listing, and is potentially implied to be outside the entire scope. Are you doing something like getting the management data in a loop? Also, the current code sample is implying volume info is outliving what is the probable owner.

Besides the request for a profiling run, we'd also like to see a more developed repro/larger sample of code, because there may be a more local explanation for the leak.

wzm199309 commented 1 year ago

@Clockwork-Muse as simple as it can repro, any new worker service app who call this code can bump memory without a minute:

using Microsoft.Extensions.Logging;
using System.Management;

namespace MemoryLeakApp
{
    public class Worker : BackgroundService
    {
        private readonly ILogger<Worker> _logger;

        public Worker(ILogger<Worker> logger)
        {
            _logger = logger;
        }

        protected override async Task ExecuteAsync(CancellationToken stoppingToken)
        {
            while (!stoppingToken.IsCancellationRequested)
            {
                GetAllDriveStatus();
                _logger.LogInformation("Worker running at: {time}", DateTimeOffset.Now);
                await Task.Delay(1000, stoppingToken);
            }
        }

        /// <summary>
        /// Return all drives status as stated in <see cref="SQLDriveStatus"/>
        /// </summary>
        /// <returns>list of drive status</returns>
        public static void GetAllDriveStatus()
        {
            // List<ManagementObject> volumeitems = new List<ManagementObject>();

            try
            {
                using (ManagementClass volumeClass = new ManagementClass("\\\\.\\Root\\Microsoft\\Windows\\Storage" + ":MSFT_Volume"))
                {
                    if (volumeClass != null)
                    {
                        using (ManagementObjectCollection volumes = volumeClass.GetInstances())
                        {
                            if (volumes != null)
                            {
                                foreach (ManagementObject volume in volumes)
                                {
                                }
                            }
                        }
                    }
                }
            }
            catch (Exception ex)
            {
                // swallow the exception as status will updated every 1hr
            }

            // filter out invalid volumes which do not have drive letter
            // but for cluster env found status, it is needed for UX to consume
        }
    }
}
Clockwork-Muse commented 1 year ago

Without profile info, it's impossible to know whether it's an actual leak, or whether it's just growth allowed because it hasn't hit process limits (ie, GC just hasn't run yet).

That aside, I'm still recommending you transform and filter during the gather, not afterwards, so that you don't have instances outliving their potential owner.

wzm199309 commented 1 year ago

@carlossanlop @Clockwork-Muse
image

gc-dump show unrecycled objects, I can share gc-dump file if anywhere I can upload it.

Clockwork-Muse commented 1 year ago

ManagementObject implements Dispose(). Your code sample is leaking those, assuming iterating through them that way makes you the "owner". If not, they shouldn't outlive their container instance.

wzm199309 commented 1 year ago

@Clockwork-Muse I didnt get it, what should we do in this code sample to avoid leaking ?

danmoseley commented 1 year ago

@wzm199309
to investigate this we'd need a stand-alone minimal repro program.

Also I recommend you profile it yourself to see what is actually leaking and why and share the result.

wzm199309 commented 1 year ago

@danmoseley the code I pasted is a vanilla repro, it will only take 30 seconds for you to set up and repro.

ericstj commented 2 months ago

triage: not a regression from previous .NET version but still an important issue to investigate.