microsoft / dotnet


32bit .NET memory. 4.8 vs 4.5-4.7 #1101

Open Peperud opened 5 years ago

Peperud commented 5 years ago

Setup:

  1. Two identical Windows Server 2016 VMs with 8 GB RAM
  2. Both with latest Windows Updates applied
  3. One has .NET 4.7.2, the other 4.8
  4. Same test app, with the same configuration, executed with the same allocation pattern.
  5. The background load on the servers looked comparable.

Results (give or take):

  1. On the .NET 4.8 VM, the test app could grow to about 3.7 GB (before throwing an OOM exception)
  2. On the .NET 4.7.2 VM that number was only 2.76 GB

BTW - the same ~2.7 GB limit was observed on several Windows 2008 R2 servers running .NET 4.5.2 too.

This is a difference of almost 1 GB, which is huge, given the 32-bit cap to start with!

Is there something in the .NET 4.8 runtime itself that has improved (so much) to account for this, or did installing 4.8 change some settings, and that is the only reason? Settings that would otherwise have let .NET 4.5-4.7 claim that extra (almost) 1 GB too?
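
For anyone who wants to compare machines, here is a minimal sketch of a probe of this kind. It is not the test app used above: the class name `AddressSpaceProbe`, the 16 MB chunk size and the "hold everything in a list" allocation pattern are all assumptions made for illustration. It has to be built as x86 to say anything about the 32-bit limit, and it only counts managed array allocations, not the whole virtual address space of the process.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical probe: allocates fixed-size chunks until the runtime throws
// OutOfMemoryException, then reports how much was allocated.
static class AddressSpaceProbe
{
    static void Main()
    {
        if (Environment.Is64BitProcess)
        {
            Console.WriteLine("Running as 64-bit; rebuild as x86 to probe the 32-bit limit.");
            return;
        }

        const int chunkBytes = 16 * 1024 * 1024; // assumed chunk size, lands in the LOH
        var chunks = new List<byte[]>();         // keeps every chunk reachable
        long totalBytes = 0;

        try
        {
            while (true)
            {
                chunks.Add(new byte[chunkBytes]);
                totalBytes += chunkBytes;
            }
        }
        catch (OutOfMemoryException)
        {
            // Drop the references first so that reporting has room to allocate.
            chunks.Clear();
            Console.WriteLine("OOM after {0:N0} MB of managed arrays",
                totalBytes / (1024 * 1024));
        }
    }
}
```

Running the same binary on both VMs should at least show whether the gap is reproducible with a trivial allocation pattern or depends on what the real application does.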

Peperud commented 5 years ago

Switching to server GC (with or without concurrent collection enabled) results in the application not throwing OOM, but not completing either. Memory usage stops growing at some point, the CPU is high and stays high. The application just hangs there. This behavior is the same for .NET 4.8 and 4.7. My guess is that the GC kicks in, suspends the main thread, gets into some kind of loop and never gets out of it.
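
For reference, in a .NET Framework application these GC modes are normally selected in the application's config file. A minimal fragment, assuming an otherwise empty `app.config` (the actual config used in the test above is not shown in this thread):

```xml
<configuration>
  <runtime>
    <!-- true = server GC, false = workstation GC (the default) -->
    <gcServer enabled="true" />
    <!-- toggles concurrent (background) collection -->
    <gcConcurrent enabled="false" />
  </runtime>
</configuration>
```

Flipping `gcConcurrent` between `true` and `false` covers the "with or without concurrent collection" variants mentioned above.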

Alois-xx commented 5 years ago

You can check the virtual address space with VMMap. It looks like some large memory holes are not present there. Once you know whether it is the loaded DLLs or the managed heap, you can attribute it either to assemblies (not) being loaded or to improvements in how the GC deals with pinned objects and memory fragmentation. Normally you would start the analysis with WinDbg, SOS and !EEHeap -gc to see how many managed heaps you have and how large they are. Then you need to look into your allocation pattern and sizes and where pinned memory resides.
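
As a rough sketch, a first WinDbg session along those lines could look like the following, assuming the debugger is attached to the 32-bit process (or has a dump of it open) and the .NET Framework 4.x runtime is loaded. The exact command set is an illustration, not a prescribed procedure.

```text
.loadby sos clr
!EEHeap -gc
!address -summary
!GCHandles
```

`.loadby sos clr` loads the SOS extension that matches the loaded CLR, `!EEHeap -gc` lists the GC heaps and segments with their sizes, `!address -summary` summarizes the address space by usage (including free regions), and `!GCHandles` gives handle statistics, including pinned handles.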

Peperud commented 5 years ago

Microsoft support acknowledged this to be a bug. Will keep the issue open to update on the resolution when available.

MvRoo commented 4 years ago

Hi @Peperud, do you have a link to the bug you mentioned, and/or do you know whether Microsoft is fixing it? We're seeing comparable behavior in our processes after an upgrade from .NET 4.5.2 to 4.8.

coderb commented 4 years ago

upgraded to 4.8 and transient memory usage is out of control and winds up causing other processes on the machine to be killed. this is on x64. no bueno.

coderb commented 4 years ago

i've tracked down the memory issue i experienced. my application was loading static data of around 10 million strings into a dictionary. the dictionary object was 1-2GB in the LOH and the strings sat in Gen2. after some heavy activity the heap experienced massive fragmentation and my memory footprint ballooned to 16GB when the actual memory in use was only a few gigs.

this does not happen under .net47 and below.
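
a rough way to see a gap like this from inside the process is to compare what the GC reports as allocated with what the process has committed. a minimal sketch follows; the class name `HeapVsCommit` is made up, and the comparison is only an approximation, since private bytes also include native allocations, loaded images and the runtime's own structures.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: prints managed heap usage next to process-level counters.
static class HeapVsCommit
{
    static void Main()
    {
        // Bytes the GC currently believes are allocated (no collection forced).
        long managedBytes = GC.GetTotalMemory(false);

        using (Process self = Process.GetCurrentProcess())
        {
            Console.WriteLine("managed heap (in use): {0:N0} bytes", managedBytes);
            Console.WriteLine("private bytes:         {0:N0} bytes", self.PrivateMemorySize64);
            Console.WriteLine("working set:           {0:N0} bytes", self.WorkingSet64);
        }
    }
}
```

if private bytes stay many times larger than the managed figure after a full collection, that is at least consistent with fragmentation or retained free segments, though it does not prove it.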

Alois-xx commented 4 years ago

@coderb: Do you have a repro? Do you have pinned objects around? Or would

GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();  

solve your issue?

coderb commented 4 years ago

repro is going to be hard. it's a large production app. i've already taken steps to use custom memory handling that replaces the massive number of objects with byte array tables. my guess is this: my machine has 32GB memory with swap set up as "system managed" on a spinning disk with say 1-10 MB/sec write speed. for some odd reason when the dotnet process decides to grow its commit size to 16GB windows decides it wants to start killing other processes rather than swapping. swap only grows to around 6GB with plenty of disk space and no system limit set up.

so, not really sure what's different. i used to run just fine under 4.7 and decided to install the 4.8 targeting pack in visual studio. i was unaware that this would in-place upgrade my runtime to 4.8 which seems to have a much different garbage collector than 4.7. also it seems there is no option to downgrade the runtime to 4.7.
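
to confirm which 4.x runtime a machine actually ended up with after such an in-place upgrade, the documented approach is to read the `Release` value from the registry. a minimal sketch, with the class name `FrameworkVersionCheck` made up and only the 4.7.2 and 4.8 thresholds checked:

```csharp
using System;
using Microsoft.Win32;

// Hypothetical helper: reports whether .NET Framework 4.8 (or later) is installed.
static class FrameworkVersionCheck
{
    static void Main()
    {
        const string subkey = @"SOFTWARE\Microsoft\NET Framework Setup\NDP\v4\Full";

        using (RegistryKey baseKey = RegistryKey.OpenBaseKey(RegistryHive.LocalMachine, RegistryView.Registry32))
        using (RegistryKey ndpKey = baseKey.OpenSubKey(subkey))
        {
            object release = ndpKey == null ? null : ndpKey.GetValue("Release");
            if (release == null)
            {
                Console.WriteLine(".NET Framework 4.5 or later not detected");
                return;
            }

            int value = (int)release;
            if (value >= 528040)
                Console.WriteLine("Release {0}: .NET Framework 4.8 or later", value);
            else if (value >= 461808)
                Console.WriteLine("Release {0}: .NET Framework 4.7.2", value);
            else
                Console.WriteLine("Release {0}: earlier than 4.7.2", value);
        }
    }
}
```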

i think that i'm good now with some very tricky code changes to avoid having lots of objects. however, i would recommend that you try to create some stress test cases that load 10-50 million strings into a dictionary and then try to do operations that generate lots of garbage. the runtime really should be able to handle this scenario in a way that seems more graceful than what i've observed. hope this helps.
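
a very rough sketch of what such a stress test could look like. everything here is an assumption made for illustration (the class name `DictionaryStress`, the key and value shapes, the garbage sizes, the number of rounds), and it is not known to reproduce the behavior described above. with the default of 10 million entries it needs a 64-bit process and several hundred MB of memory.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stress test: a large long-lived dictionary of strings,
// followed by churn that mixes small and large short-lived allocations.
static class DictionaryStress
{
    static void Main(string[] args)
    {
        int count = args.Length > 0 ? int.Parse(args[0]) : 10000000;

        // Long-lived data: the dictionary's internal arrays end up in the LOH,
        // the strings eventually get promoted to Gen2.
        var table = new Dictionary<int, string>();
        for (int i = 0; i < count; i++)
        {
            table[i] = "value-" + i;
        }

        var rng = new Random(42); // fixed seed so runs are comparable
        for (int round = 0; round < 1000; round++)
        {
            // Short-lived buffers; sizes above 85,000 bytes go to the LOH.
            var garbage = new byte[rng.Next(1024, 256 * 1024)];
            garbage[0] = 1;

            // Replace a long-lived string so older generations get holes.
            table[rng.Next(count)] = "replaced-" + round;
        }

        Console.WriteLine("entries: {0:N0}, managed bytes: {1:N0}",
            table.Count, GC.GetTotalMemory(false));
    }
}
```

running the same build on a 4.7.x machine and a 4.8 machine and comparing private bytes at the end would be one way to check whether the two runtimes really behave differently for this pattern.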

Alois-xx commented 4 years ago

Windows will not kill your processes. It is not Linux with an OOM Killer feature. Your application's memory is never swapped out because the GC keeps your working set alive with each full GC, which prevents swapping almost entirely for an allocation-heavy .NET application. It could be that you suffer from increased fragmentation, but without a repro it is hard to tell what is different.