dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

XmlSerializer leaks native memory, leading to OOM. #95369

Open janvorli opened 11 months ago

janvorli commented 11 months ago

While investigating a customer's native memory leak that was causing regular OOMs in their web service, I found that the culprit is the `XmlSerializer`. Every instance of `XmlSerializer` created using the constructor with the `XmlAttributeOverrides` argument causes native memory growth of about 40kB. So when an `XmlSerializer` is created this way per request, it eventually leads to OOM, which can happen quite soon in containerized environments with memory limits set. The problem is that every instance of `XmlSerializer` created via the above-mentioned constructor creates a new dynamic assembly with the `AssemblyBuilderAccess.Run` flag. Dynamic assemblies created this way stay in memory forever and can never be removed, so all the native memory allocated for the assembly, such as the in-memory image with the manifest, some hash tables, and metadata, is never freed. The default constructor with only the `type` argument doesn't suffer from this issue, since it caches the dynamic assembly in a static member.

It seems that creating the dynamic assemblies with the `AssemblyBuilderAccess.RunAndCollect` flag instead would fix the problem, but there might be consequences that I cannot foresee. Alternatively, enabling caching of the dynamic assemblies even for this form of the constructor would probably be good enough for most scenarios. I am no expert on the `XmlSerializer` implementation, but my guess is that caching was not implemented for this case because it would require using the information from the `XmlAttributeOverrides` argument as part of the cache key.
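
For illustration, a minimal sketch of the Reflection.Emit pattern in question (the assembly names are made up; this is not the actual XmlSerializer code): an assembly created with `Run` can never be unloaded, while one created with `RunAndCollect` is collectible once unreferenced.

using System;
using System.Reflection;
using System.Reflection.Emit;

class DynamicAssemblyDemo
{
    static void Main()
    {
        // Run: the assembly lives for the rest of the process; its native
        // in-memory image, metadata and hash tables are never freed.
        AssemblyBuilder permanent = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("PermanentAsm"), AssemblyBuilderAccess.Run);

        // RunAndCollect: the assembly is collectible; once nothing references
        // it (or types emitted into it), the GC can reclaim it, including the
        // associated native memory. Collectible assemblies carry some
        // restrictions, which may be among the unforeseen consequences above.
        AssemblyBuilder collectible = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("CollectibleAsm"), AssemblyBuilderAccess.RunAndCollect);

        Console.WriteLine($"{permanent.FullName} vs {collectible.FullName}");
    }
}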

If the decision is not to fix the issue for whatever reason, then it seems we would at least need to document this in the `XmlSerializer` constructor docs so that developers can work around it by caching the `XmlSerializer` on their side. This issue occurs in .NET 6, 7, and 8.
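
A sketch of such an app-side workaround (the names here are hypothetical, and the key scheme is an assumption: `XmlAttributeOverrides` has no value equality, so the caller has to supply a stable string identifying the override set):

using System;
using System.Collections.Concurrent;
using System.Xml.Serialization;

public static class SerializerCache
{
    // Cache keyed by (type, caller-supplied key). XmlAttributeOverrides has
    // no value equality, so the caller must pass a stable string that
    // identifies the override set it builds (hypothetical scheme).
    private static readonly ConcurrentDictionary<(Type, string), XmlSerializer> s_cache = new();

    public static XmlSerializer Get(Type type, string overridesKey,
                                    Func<XmlAttributeOverrides> overridesFactory)
    {
        // Note: under a race, GetOrAdd can invoke the factory more than once,
        // leaking one extra dynamic assembly; wrap the value in Lazy<T> if
        // even rare duplicates matter.
        return s_cache.GetOrAdd((type, overridesKey),
            _ => new XmlSerializer(type, overridesFactory()));
    }
}

Since `XmlSerializer` is documented to be thread safe, sharing cached instances across requests should be fine.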

ghost commented 11 months ago

Tagging subscribers to this area: @dotnet/area-system-xml. See info in area-owners.md if you want to be subscribed.

Issue Details
Author: janvorli
Assignees: -
Labels: `area-System.Xml`
Milestone: 8.0.x

janvorli commented 11 months ago

cc: @mayankkumar2

janvorli commented 11 months ago

Here is a small repro of the issue. After the loop, about 5GB of native memory is allocated.

using System;
using System.Xml.Serialization;

public class PurchaseOrder
{
    public string OrderDate;
}

public class Test
{
    public static void Main()
    {
        for (int i = 0; i < 100000; i++)
        {
            // The overrides argument is needed to trigger the memory leak:
            // this constructor overload emits a new dynamic assembly per call.
            XmlAttributeOverrides overrides = new XmlAttributeOverrides();
            XmlSerializer serializer = new XmlSerializer(typeof(PurchaseOrder), overrides);
            if (i % 100 == 0)
            {
                // A full GC does not help: the leaked memory is native memory
                // owned by the dynamic assemblies, not managed objects.
                GC.Collect(2);
            }
        }

        Console.WriteLine("Done");
    }
}
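
For contrast, here is a hypothetical non-leaking variant of Main: the serializer is constructed once (or fetched from a cache like the sketch above), so only one dynamic assembly is ever created and native memory stays flat.

    public static void Main()
    {
        // Construct once, outside the loop: only one dynamic assembly is created.
        XmlAttributeOverrides overrides = new XmlAttributeOverrides();
        XmlSerializer serializer = new XmlSerializer(typeof(PurchaseOrder), overrides);

        for (int i = 0; i < 100000; i++)
        {
            // Reuse 'serializer' for each request/iteration here.
        }

        Console.WriteLine("Done");
    }
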
Bykiev commented 11 months ago

It seems this is already well documented here, isn't it?

gverbake commented 8 months ago

This also happens with the `XmlRootAttribute` constructor overload: `public XmlSerializer (Type type, System.Xml.Serialization.XmlRootAttribute? root);`

https://learn.microsoft.com/en-us/dotnet/api/system.xml.serialization.xmlserializer.-ctor?view=net-8.0#system-xml-serialization-xmlserializer-ctor(system-type-system-xml-serialization-xmlrootattribute)

It also happens on .NET Framework 4.8.

janvorli commented 8 months ago

> This also happens with the `XmlRootAttribute` constructor overload

Yes, it happens with all constructors except `XmlSerializer(Type)` and `XmlSerializer(Type, String)`.

> It seems this is already well documented here, isn't it?

It is very difficult to discover that comment: you have to go from the main `XmlSerializer` page to the "Supplemental API remarks for XmlSerializer" page that's linked in the middle of the page, and then scroll to the middle of that document. I believe it should be mentioned in the main doc right before the constructors, as it is a major gotcha; I've seen it trip people up in many native memory leak investigations I've helped with during the past year. But it would be much better if the issue itself were fixed.