dotnet / msbuild

The Microsoft Build Engine (MSBuild) is the build platform for .NET and Visual Studio.
https://docs.microsoft.com/visualstudio/msbuild/msbuild
MIT License
5.22k stars 1.35k forks source link

Memory leaks when unloading Projects #460

Closed rainersigwald closed 2 years ago

rainersigwald commented 8 years ago

There is an ancient thread about this at http://social.msdn.microsoft.com/Forums/vstudio/en-US/36381eaf-4118-4f10-9101-69021619d3ba/memory-leak-in-solutionloadstandaloneproject. As far as I know this hasn't been fixed.

According to another internal report from long ago,

You can confirm this by creating a simple project that just loads and disposes projects; it will eat up all of your memory.

This seems likely to be related to memory-usage complaints from another internal team that has a similar load-a-bunch-of-projects workflow.

Sarabeth-Jaffe-Microsoft commented 8 years ago

@heXelium did some good analysis on this issue in #474, which presents a couple of options.

OniBait commented 7 years ago

I'd love to make a PR to fix this issue but seeing as how #474 was closed because there was no clear direction on the fix I'm a bit hesitant to. There's several ways I can think of to fix it: 1) Use String.Intern() instead of ProjectStringCache -- the projects aren't getting unloaded properly anyway, so the reference counting going on doesn't happen. 2) Don't try and pool the strings at all. They should already be getting pooled in an XmlNameTable already, so pooling them again seems like a waste. 3) Override the equality/hashcode members on XmlDocumentWithLocation so that the reference cache works properly. 4) Use a different key for ProjectStringCache._documents (the document location, the hashcode of the object, something...) 5) Use something like aWeakReference<>` as the key and a custom key comparer for the dictionary so that the XmlDocument isn't prevented from being GC'd.

This bug in combination with the GC.Collect inside the loop in BuildRequestEngine.CheckMemoryUsage can lead to really horrible performance. A key example would be switching a branch in git and clicking "Reload All Projects" in Visual Studio -- the memory pressure goes up due to the projects not getting unloaded and then you start getting huge > 90 second pauses in Visual Studio because it is trying to make room for more memory (glad to see that https://github.com/Microsoft/msbuild/pull/1351 made it into the VS15-rtw branch)

jeromelaban commented 2 years ago

I've hit this issue when dealing with Roslyn's MSBuildWorkspace (repeatedly creating workspaces in the same process) and here's how I'm currently dealing with it:

if (typeof(Microsoft.Build.Construction.ElementLocation).Assembly.GetType("Microsoft.Build.Construction.ProjectStringCache") is { } stringCacheType)
{
    if (stringCacheType.GetField("_documents", BindingFlags.Instance | BindingFlags.NonPublic) is { } documentsField)
    {
        if (typeof(Microsoft.Build.Construction.ElementLocation).Assembly.GetType("Microsoft.Build.Construction.XmlDocumentWithLocation") is { } xmlDocumentWithLocationType)
        {
            if (xmlDocumentWithLocationType.GetField("s_globalStringCache", BindingFlags.Static | BindingFlags.NonPublic)?.GetValue(null) is { } globalStringCache)
            {
                if (documentsField.GetValue(globalStringCache) is { } documents)
                {
                    if (documents.GetType().GetMethod("Clear") is { } clearMethod)
                    {
                        clearMethod.Invoke(documents, null);
                    }
                }
            }
        }
    }
}

This is extremely ugly, to say the least, but it at least keeps the memory consumption to acceptable levels.

rokonec commented 2 years ago

I am assigning this to myself, as this seems to be related to memory leak I have fixed for MSBuild server. There is slight chance it has been fixed by this. I will try to repro it after we merge MSBuild server.

rokonec commented 2 years ago

The problem is indeed in s_globalStringCache. It caches strings of XML documents but the key is the XMLDocument object, so each time same XML file is parsed new dictionary pair <XmlDocument, HashSet<StringCacheEntry> is created and filled with all used StringCacheEntry(s). It effectively root all XmlDocuments, as keys, which includes lots of memory and is causing significant memory leak.

This is tightly coupled with ProjectRootElementCache which maintains s_globalStringCache when projects load or reload.

Unfortunately neither ProjectCollection nor ProjectRootElementCache is automatically cleared as it is expected that persistent process like VS will maintain lifetime of Projects and its caches itself.

All above is very complex and entangled and I believe this should be replaced by our StringTools interning and carefully measured and analyzed.

Recently I have created new ProjectCollections constructor https://github.com/dotnet/msbuild/blob/827c1bf9c4f400f3aea448bbe7048ab12af0089a/src/Build/Definition/ProjectCollection.cs#L303 with argument reuseProjectRootElementCache. When true if will use reuse static instance of ProjectRootElementCache and thus can eliminates this memory leak providing it creates Project the way which leverages ProjectRootElementCache. However, because shared ProjectRootElementCache is supposed to have longer lifetime than builds it has to autoreload project XMLs which has small performance price of testing XMLs timestamps.

@jeromelaban is this constructor something you can use in Roslyn?

baronfel commented 2 years ago

@rokonec should the GlobalProjectCollection use this constructor? I'm unsure of the guidance for using that collection.

rokonec commented 2 years ago

@baronfel I though about it, but then realized there is no need for it. My hypothesis is that people mostly use either either GlobalProjectCollection or new ProjectColletion() approach. If they use GlobalProjectCollection it servers as singleton and uses dedicated ProjectRootElementCache effectively making ProjectRootElementCache also singleton.

jeromelaban commented 2 years ago

@jeromelaban is this constructor something you can use in Roslyn?

I don't think I'll be able to use this specific constructor yet as, it would require to bump msbuild to a very recent version that would make some older SDKs fail because of backward compatibility constraints.

The poke through hacks I put in place to clear the caches make my scenario viable for now.

Thanks for looking into it!

MichalPavlik commented 2 years ago

We closed this issue and created new ticket with changes described by @rokonec. Feel free to reopen it.