dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[8.0] OOM occurs when calling Encoding.UTF8.GetString with a modified LOH threshold #108010

Open nininig opened 1 month ago

nininig commented 1 month ago

Description

In the application I’m developing, byte arrays of around 2 to 4 MB are converted to strings using Encoding.UTF8.GetString. After upgrading from .NET 6 to .NET 8, an OOM exception started occurring.

Previously, the application suffered performance degradation when it frequently allocated objects of 2 to 4 MB, because LOH usage increased and Gen2 GCs were triggered. To mitigate this, the LOH threshold was raised to 6 MB by setting the following environment variable:

DOTNET_GCLOHThreshold="0x600000"

After tracing the issue, it seems that the OOM exception occurs when calling string.FastAllocateString.

Is there a way to avoid the OOM exception in .NET 8 while keeping the LOH threshold setting? Any insights or suggestions would be greatly appreciated!

Thank you in advance for your help.

Reproduction Steps

Running the following sample code with the LOH threshold set to 6 MB (DOTNET_GCLOHThreshold="0x600000") reproduces the issue at the Encoding.UTF8.GetString step.

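using System;
using System.Linq;
using System.Text;

// 2 MB of ASCII 'A' bytes
byte[] bytes = Enumerable.Repeat<byte>(Convert.ToByte('A'), 2 * 1024 * 1024).ToArray();
Console.WriteLine($"Allocate: {bytes.Length}");

// The OOM is reported at this call; the resulting string holds 2M UTF-16 chars (about 4 MB)
string str = Encoding.UTF8.GetString(bytes);
Console.WriteLine($"ToString: {str.Length}");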

Expected behavior

No OOM exception occurs under the following conditions:

Actual behavior

An OOM exception occurs under the following conditions:

Regression?

It was working without any issues in .NET 6.

Known Workarounds

This issue can be mitigated by the following:

Configuration

Other information

No response

huoyaoyuan commented 1 month ago

Duplicate of #95219. LOH threshold can't exceed SOH segment size, which is 4MB by default.

In the application I’m developing, byte arrays of around 2 to 4 MB are being converted to strings using Encoding.UTF8.GetString.

Are you forced to use string? Consider using a char[] rented from the array pool and accessing the content as ReadOnlySpan<char>.
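A minimal sketch of that suggestion, assuming the consumers can work with a ReadOnlySpan<char> instead of a string. The buffer size and variable names are illustrative only. Note that the rented char[] is itself a multi-megabyte array; the benefit is that the pool can reuse it across calls instead of allocating a new large object each time.

```csharp
using System;
using System.Buffers;
using System.Text;

// Illustrative input: 2 MB of ASCII 'A' bytes, as in the repro above.
byte[] bytes = new byte[2 * 1024 * 1024];
Array.Fill(bytes, (byte)'A');

// Exact number of chars the decoder will produce (costs one extra pass over the input).
int charCount = Encoding.UTF8.GetCharCount(bytes);

// Rent may return a larger array than requested, so track the decoded length separately.
char[] rented = ArrayPool<char>.Shared.Rent(charCount);
int decodedLength;
try
{
    decodedLength = Encoding.UTF8.GetChars(bytes, 0, bytes.Length, rented, 0);

    // View over the decoded characters; no string is allocated.
    ReadOnlySpan<char> text = rented.AsSpan(0, decodedLength);
    Console.WriteLine($"Decoded chars: {text.Length}");
}
finally
{
    // The span must not be used after the buffer goes back to the pool.
    ArrayPool<char>.Shared.Return(rented);
}
```

The span is only valid until the buffer is returned, so this pattern fits best where the text is consumed within one well-defined scope.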

nininig commented 1 month ago

Thank you for your comment. I now understand that the following constraint was introduced with the GC region feature in .NET 7:

LOH threshold can't exceed SOH segment size

The reason I have to use strings is that the converted string is passed across multiple classes, and although I would like to improve this, the changes required are too extensive to address at the moment. (I agree that using array pools or reducing object size would be a smarter approach, though...)

After further investigation, I found the following environment variable that allows adjusting the GC region size. By setting it to 8 MB, I found that the OOM issue no longer occurs, even when the LOH threshold is set to 6 MB:

DOTNET_GCRegionSize=0x800000

I understand that increasing the GC region size might reduce memory usage efficiency, but if there are any other critical concerns, I would appreciate it if you could let me know.
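To check what the process actually sees, something like the following could be run with both variables set. It is only a sketch: it decodes the hexadecimal values of the two environment variables mentioned in this thread and then dumps whatever GC.GetConfigurationVariables() (available since .NET 7) reports. Which entries that dictionary contains, and under what names, is not assumed here.

```csharp
using System;

// The GC settings in this thread are written as hexadecimal values.
foreach (string name in new[] { "DOTNET_GCLOHThreshold", "DOTNET_GCRegionSize" })
{
    string? raw = Environment.GetEnvironmentVariable(name);
    if (string.IsNullOrEmpty(raw))
    {
        Console.WriteLine($"{name}: not set");
        continue;
    }

    // Convert.ToInt64 with base 16 accepts an optional "0x" prefix.
    long byteCount = Convert.ToInt64(raw, 16);
    Console.WriteLine($"{name}: {raw} = {byteCount} bytes ({byteCount / (1024.0 * 1024.0):F1} MiB)");
}

// Dump the configuration the GC reports; the key names are whatever the runtime provides.
foreach (var (key, value) in GC.GetConfigurationVariables())
{
    Console.WriteLine($"{key} = {value}");
}
```

For reference, 0x600000 is 6,291,456 bytes (6 MiB) and 0x800000 is 8,388,608 bytes (8 MiB).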

markples commented 1 month ago

Hi @nininig, thank you for the report. Increasing the region size means that the GC has a bit less flexibility in how it manages memory at the large scale, but this is not automatically a critical concern.

Increasing the LOH threshold is giving you generational behavior for larger objects, and of course specifically allowing you to collect these 2MB and 4MB objects without performing gen2 collections. This could have performance implications because large objects are going through all of the SOH mechanisms. It seems worth the tradeoff for you right now because you have so many temporary large objects.
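One way to observe this, as a rough sketch rather than a definitive test: allocate a 2 MB array and ask the GC which generation it reports. With default settings, objects on the LOH are reported as generation 2. With a raised LOH threshold (and a region size large enough to hold the object), the same allocation would be expected to start in generation 0, but that depends on the effective settings.

```csharp
using System;

// 2 MB array, the same size as the temporary objects discussed above.
byte[] large = new byte[2 * 1024 * 1024];

// LOH objects are reported as generation 2; SOH objects start in generation 0.
int generation = GC.GetGeneration(large);
Console.WriteLine($"Generation right after allocation: {generation}");
Console.WriteLine($"Max generation: {GC.MaxGeneration}");

// Keep the array alive until after the generation has been queried.
GC.KeepAlive(large);
```

Running it once without and once with DOTNET_GCLOHThreshold set shows whether the setting is actually taking effect.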

Note that the workaround of using .NET 9 (with large LOH threshold but default region size) is probably limiting the LOH threshold, so it is avoiding the OOM but probably not getting the behavior that you want. You'd still want to set the threshold.

As a side note, the symptom of OOM under heavy LOH allocation is something that we also see due to a separate issue that we have with LOH region management. It sounds like the OOM here is entirely due to the settings and -not- related to that.