Closed rainersigwald closed 2 years ago
@heXelium did some good analysis on this issue in #474, which presents a couple of options.
I'd love to make a PR to fix this issue but seeing as how #474 was closed because there was no clear direction on the fix I'm a bit hesitant to. There's several ways I can think of to fix it:
1) Use String.Intern()
instead of ProjectStringCache
-- the projects aren't getting unloaded properly anyway, so the reference counting going on doesn't happen.
2) Don't try and pool the strings at all. They should already be getting pooled in an XmlNameTable
already, so pooling them again seems like a waste.
3) Override the equality/hashcode members on XmlDocumentWithLocation
so that the reference cache works properly.
4) Use a different key for ProjectStringCache._documents (the document location, the hashcode of the object, something...) 5) Use something like a
WeakReference<>` as the key and a custom key comparer for the dictionary so that the XmlDocument isn't prevented from being GC'd.
This bug in combination with the GC.Collect
inside the loop in BuildRequestEngine.CheckMemoryUsage
can lead to really horrible performance. A key example would be switching a branch in git and clicking "Reload All Projects" in Visual Studio -- the memory pressure goes up due to the projects not getting unloaded and then you start getting huge > 90 second pauses in Visual Studio because it is trying to make room for more memory (glad to see that https://github.com/Microsoft/msbuild/pull/1351 made it into the VS15-rtw branch)
I've hit this issue when dealing with Roslyn's MSBuildWorkspace
(repeatedly creating workspaces in the same process) and here's how I'm currently dealing with it:
if (typeof(Microsoft.Build.Construction.ElementLocation).Assembly.GetType("Microsoft.Build.Construction.ProjectStringCache") is { } stringCacheType)
{
if (stringCacheType.GetField("_documents", BindingFlags.Instance | BindingFlags.NonPublic) is { } documentsField)
{
if (typeof(Microsoft.Build.Construction.ElementLocation).Assembly.GetType("Microsoft.Build.Construction.XmlDocumentWithLocation") is { } xmlDocumentWithLocationType)
{
if (xmlDocumentWithLocationType.GetField("s_globalStringCache", BindingFlags.Static | BindingFlags.NonPublic)?.GetValue(null) is { } globalStringCache)
{
if (documentsField.GetValue(globalStringCache) is { } documents)
{
if (documents.GetType().GetMethod("Clear") is { } clearMethod)
{
clearMethod.Invoke(documents, null);
}
}
}
}
}
}
This is extremely ugly, to say the least, but it at least keeps the memory consumption to acceptable levels.
I am assigning this to myself, as this seems to be related to memory leak I have fixed for MSBuild server. There is slight chance it has been fixed by this. I will try to repro it after we merge MSBuild server.
The problem is indeed in s_globalStringCache
. It caches strings of XML documents but the key is the XMLDocument object, so each time same XML file is parsed new dictionary pair <XmlDocument, HashSet<StringCacheEntry>
is created and filled with all used StringCacheEntry
(s). It effectively root all XmlDocuments, as keys, which includes lots of memory and is causing significant memory leak.
This is tightly coupled with ProjectRootElementCache
which maintains s_globalStringCache
when projects load or reload.
Unfortunately neither ProjectCollection
nor ProjectRootElementCache
is automatically cleared as it is expected that persistent process like VS will maintain lifetime of Projects and its caches itself.
All above is very complex and entangled and I believe this should be replaced by our StringTools interning and carefully measured and analyzed.
Recently I have created new ProjectCollections
constructor https://github.com/dotnet/msbuild/blob/827c1bf9c4f400f3aea448bbe7048ab12af0089a/src/Build/Definition/ProjectCollection.cs#L303
with argument reuseProjectRootElementCache
.
When true
if will use reuse static instance of ProjectRootElementCache
and thus can eliminates this memory leak providing it creates Project
the way which leverages ProjectRootElementCache
.
However, because shared ProjectRootElementCache
is supposed to have longer lifetime than builds it has to autoreload project XMLs which has small performance price of testing XMLs timestamps.
@jeromelaban is this constructor something you can use in Roslyn?
@rokonec should the GlobalProjectCollection use this constructor? I'm unsure of the guidance for using that collection.
@baronfel I though about it, but then realized there is no need for it. My hypothesis is that people mostly use either either GlobalProjectCollection
or new ProjectColletion()
approach. If they use GlobalProjectCollection
it servers as singleton and uses dedicated ProjectRootElementCache
effectively making ProjectRootElementCache
also singleton.
@jeromelaban is this constructor something you can use in Roslyn?
I don't think I'll be able to use this specific constructor yet as, it would require to bump msbuild to a very recent version that would make some older SDKs fail because of backward compatibility constraints.
The poke through hacks I put in place to clear the caches make my scenario viable for now.
Thanks for looking into it!
We closed this issue and created new ticket with changes described by @rokonec. Feel free to reopen it.
There is an ancient thread about this at http://social.msdn.microsoft.com/Forums/vstudio/en-US/36381eaf-4118-4f10-9101-69021619d3ba/memory-leak-in-solutionloadstandaloneproject. As far as I know this hasn't been fixed.
According to another internal report from long ago,
This seems likely to be related to memory-usage complaints from another internal team that has a similar load-a-bunch-of-projects workflow.