dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Runtime doesn't utilize available RAM and crashes #102332

Open talweiss1982 opened 3 months ago

talweiss1982 commented 3 months ago

Description

Hi, I'm trying to load a huge amount of data in a single .NET process on a machine with 4 TB of RAM and ~128 CPUs, running Ubuntu 22.04, using the latest .NET 8 SDK (updated today; I think it's 8.0.5). The code reads 464 CSV files, each about 3 GB, line by line, parsing each row and generating an object from it. Each object is then placed into one of 5 concurrent dictionaries according to its type; each dictionary is initialized with a capacity of 500 million, which is enough that it never needs to grow. The dictionaries are there to remove duplicates and merge their data; the deduplicated objects are then written out to CSV files. I don't go over the ~2 billion object limit of a dictionary. I don't read whole files into memory or keep any file in memory, only their stream buffers (default size of 4 KB), and I don't process more than 10 files in parallel.

Yet the machine never utilizes more than 1.5 TB of memory. I investigated the Ubuntu configuration and it doesn't seem to have any per-process limits configured. Moreover, I tried running 4 of those processes and still didn't get above 1.8 TB total. I checked dotnet-counters and the GC heap size doesn't seem to grow above 300 GB. I'm using server GC; I tried setting the heap count, but it seems to be capped at 128 — I can only reduce it, and setting a higher value is ignored. I configured 1 TB of huge pages (2 MB size) and set the .NET flag to use huge pages, since I read they will be used when the GC reaches its hard limit; as a result .NET crashes with a heap initialization error. I can't even build my project, because dotnet build crashes with the same error. I also tried running with the GC's low-latency-for-long-duration mode, but after processing ~90% of the data a gen2 GC is invoked and afterwards the process crashes. This means the runtime thinks it has reached its limit, even though 2.5 TB of RAM are free. I also printed the GC's available memory and it shows 4 TB when the process starts.

I also tried asking the GC to retain VM memory; it didn't help. Please advise what voodoo I need to practice to make my .NET process utilize its God-given RAM pool?
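For context, here is a minimal sketch of the workload described above. This is not the reporter's actual code: the `Row` record, its `Parse`/`Merge` helpers, the `/data` path, and the CSV layout are all hypothetical, and only the overall shape (streamed reads, 10-way file parallelism, pre-sized sharded dictionaries) follows the description.

```csharp
// Hypothetical sketch of the described workload (not the reporter's code):
// stream CSV files line by line and dedup/merge into pre-sized shards.
using System.Collections.Concurrent;

const int ShardCount = 5;
var shards = new ConcurrentDictionary<long, Row>[ShardCount];
for (int i = 0; i < ShardCount; i++)
    shards[i] = new ConcurrentDictionary<long, Row>(
        concurrencyLevel: Environment.ProcessorCount,
        capacity: 500_000_000); // pre-sized so the dictionary never resizes

var files = Directory.EnumerateFiles("/data", "*.csv"); // hypothetical path
Parallel.ForEach(
    files,
    new ParallelOptions { MaxDegreeOfParallelism = 10 }, // at most 10 files at once
    file =>
    {
        using var reader = new StreamReader(file); // only the stream buffer stays resident
        while (reader.ReadLine() is { } line)
        {
            var row = Row.Parse(line);
            var shard = shards[row.TypeId % ShardCount];      // route by object type
            shard.AddOrUpdate(row.Key, row,
                (_, existing) => existing.Merge(row));        // dedup + merge
        }
    });

record Row(long Key, int TypeId, string Payload)
{
    public static Row Parse(string line)
    {
        var cols = line.Split(','); // toy parser, assumes key,typeId,payload
        return new Row(long.Parse(cols[0]), int.Parse(cols[1]), cols[2]);
    }

    public Row Merge(Row other) => this; // illustrative: keep first occurrence
}
```

Note that even this sketch commits very large dictionary backing arrays up front because of the pre-set capacity, which is part of why the managed heap grows so quickly.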

Reproduction Steps

Use an Ubuntu 22.04 machine with 4 TB of RAM and .NET SDK 8.0.5. Read 464 CSV files, each 3 GB in size, parse them line by line, and shove the parsed objects into concurrent dictionaries — make sure to split them across enough dictionaries that no single dictionary reaches 2 billion objects, and set each dictionary's capacity in advance. Watch as the GC heap size fails to utilize the whole RAM and the process crashes.

Expected behavior

I expect the GC heap to grow in size to utilize all the RAM; I don't mind GC pauses or slowness.

Actual behavior

The GC heap size doesn't seem to grow above 300 GB, causing GC pressure and random crashes.

Regression?

Didn't try on other versions.

Known Workarounds

I wish

Configuration

.NET 8.0.5 running on Ubuntu 22.04, x64 architecture

Other information

There are two issues: (1) the GC heap count seems to be capped at 128, and I believe each heap's size is capped as well, so I can't get to 4 TB; (2) trying to use huge pages seems to crash the runtime.

mangod9 commented 3 months ago

Thanks @talweiss1982 for reporting the issue. Do you have any specific GC config set for your application? I assume this isn't a regression from a previous release?

talweiss1982 commented 3 months ago

I have tried a few settings. Currently I'm trying to set the heap count to 512; I have server GC mode configured; I have GC affinitization disabled (NoAffinitize); I tried setting the maximum allowed memory to 3 TB; I tried configuring the low-latency sustained mode (cancelling gen2 and LOH compaction); and I tried configuring huge-page mode, which crashes the runtime. I have tried every combination of those settings, and the GC heap size doesn't grow, causing OOM.
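For reference, a hedged sketch of how a combination like that is typically expressed via the documented DOTNET_* GC environment variables. The exact values here are illustrative, and note that numeric GC config values are parsed as hex:

```shell
export DOTNET_gcServer=1                  # server GC
export DOTNET_GCHeapCount=200             # hex 0x200 = 512 heaps
export DOTNET_GCNoAffinitize=1            # don't affinitize GC threads to cores
export DOTNET_GCHeapHardLimit=30000000000 # hex 0x30000000000 = 3 TiB hard limit
dotnet run -c Release                     # then launch the app
```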


talweiss1982 commented 3 months ago

According to dotnet-counters my heap fragmentation is below 1%, so it's not LOH fragmentation. I have a fixed number of pinned objects due to NLog and the .NET configuration objects; I don't produce any myself. I don't allocate unmanaged memory and I don't pin objects.


mangod9 commented 3 months ago

Can you try setting the DOTNET_GCRegionRange=10700000000 env var to check if that makes any difference?

mangod9 commented 3 months ago

That sets the initial GC reservation size to ~1 TB. The config should be looking at the available physical memory to make that determination, but I'm wondering if there is something off with that logic.

cshung commented 3 months ago

Is it possible to do some live debugging together? A crash dump will likely be too big, and traces are unlikely to be useful for solving this crash.

neon-sunset commented 3 months ago

/offtopic I know this is unrelated, but if you are not using Sep and NonBlocking.ConcurrentDictionary, you might want to try them, as they will reduce the memory footprint per item (NonBlocking has a different but not worse performance profile compared to the out-of-the-box ConcurrentDictionary, and consumes less memory per entry).

NonBlocking.CDict: https://github.com/VSadov/NonBlocking Sep: https://github.com/nietras/Sep/
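For what it's worth, NonBlocking's dictionary mirrors the System.Collections.Concurrent surface, so it is close to a drop-in swap. A hedged sketch, assuming the NonBlocking NuGet package is referenced:

```csharp
// Hedged sketch: NonBlocking.ConcurrentDictionary exposes the same API shape
// as System.Collections.Concurrent.ConcurrentDictionary, so the swap is
// mostly a using-directive change.
using NonBlocking;

var dedup = new ConcurrentDictionary<long, string>();
dedup.TryAdd(42, "first");
// On a duplicate key, merge rather than insert:
dedup.AddOrUpdate(42, _ => "first", (_, existing) => existing + "+merged");
Console.WriteLine(dedup[42]); // the merged value for the duplicate key
```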

talweiss1982 commented 3 months ago

I'll try it on Sunday, as my work days are Sunday through Thursday, and will update with the results.


talweiss1982 commented 3 months ago

Sure, I believe we can set up a time on Monday, as I'm in the GMT+3 time zone. My work email is @.*** phone: +972-54-8024849


talweiss1982 commented 3 months ago

I'll give it a try. I was also thinking of using a Bloom filter and not putting every object into the dictionary — only the possible duplicates, handling them separately. This would avoid having to keep a lot of memory, but it introduces a second processing phase, which my VP of R&D doesn't like.
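The Bloom-filter idea can be sketched roughly like this (a hedged, illustrative implementation with toy hash mixing, not production code): the filter answers "possibly seen before" with no false negatives, so only keys it flags need to go through the dictionary-based dedup pass.

```csharp
// Illustrative Bloom filter sketch: false positives possible, false negatives not.
// Keys flagged by TestAndAdd would be routed to the (much smaller) dedup phase.
using System.Collections;

class BloomFilter
{
    private readonly BitArray _bits;
    private readonly int _hashes;

    public BloomFilter(int sizeBits, int hashes = 3)
    {
        _bits = new BitArray(sizeBits);
        _hashes = hashes;
    }

    // Records the key and returns true if it was *possibly* seen before.
    public bool TestAndAdd(long key)
    {
        bool seen = true;
        for (int i = 0; i < _hashes; i++)
        {
            int idx = (int)(Hash(key, i) % (ulong)_bits.Length);
            if (!_bits[idx]) { seen = false; _bits[idx] = true; }
        }
        return seen;
    }

    private static ulong Hash(long key, int seed)
    {
        // Simple FNV-style mix; a real implementation would use stronger hashes.
        ulong h = 14695981039346656037UL ^ (ulong)seed;
        h = (h ^ (ulong)key) * 1099511628211UL;
        return h ^ (h >> 33);
    }
}
```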


talweiss1982 commented 3 months ago

After setting DOTNET_GCRegionRange=10700000000, dotnet-counters shows:

[System.Runtime]
    % Time in GC since last GC (%)                         15
    Allocation Rate (B / 1 sec)                            8,200
    CPU Usage (%)                                          0.78
    Exception Count (Count / 1 sec)                        0
    GC Committed Bytes (MB)                                292,488.401
    GC Fragmentation (%)                                   5.099
    GC Heap Size (MB)                                      277,783.096
    Gen 0 GC Budget (MB)                                   11,502
    Gen 0 GC Count (Count / 1 sec)                         0
    Gen 0 Size (B)                                         0
    Gen 1 GC Count (Count / 1 sec)                         0
    Gen 1 Size (B)                                         1.457e+10
    Gen 2 GC Count (Count / 1 sec)                         0
    Gen 2 Size (B)                                         2.4346e+11
    IL Bytes Jitted (B)                                    875,351
    LOH Size (B)                                           2.8e+10
    Monitor Lock Contention Count (Count / 1 sec)          0
    Number of Active Timers                                2
    Number of Assemblies Loaded                            64
    Number of Methods Jitted                               10,115
    POH (Pinned Object Heap) Size (B)                      167,360
    ThreadPool Completed Work Item Count (Count / 1 sec)   0
    ThreadPool Queue Length                                0
    ThreadPool Thread Count                                12
    Time paused by GC (ms / 1 sec)                         0
    Time spent in JIT (ms / 1 sec)                         0
    Working Set (MB)                                       288,691.38

The error: Fatal error. Failed to create RW mapping for RX memory

talweiss1982 commented 3 months ago

The SDK is 8.0.105 (I remembered something with a 5).

ayende commented 3 months ago

I think I know what the problem is. You are running into the vm.max_map_count limit. Try: sysctl -w vm.max_map_count=262144 or something like that (you may need sudo) and then run your code.

Basically, what I think is happening is that you are using a lot of memory mappings, not just a lot of memory, and there is a default limit of 64K of those per process.

At some point, the JIT needs to emit some code, or the runtime needs to allocate more memory, and there aren't enough memory mappings left for that.

See also: https://ravendb.net/articles/production-postmortem-the-big-server-that-couldnt-handle-the-load https://ayende.com/blog/197635-A/production-postmortem-out-of-memory-on-a-clear-sky
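For anyone hitting this later, a hedged sketch of inspecting and persisting that limit; the conf file name and `<pid>` are placeholders:

```shell
# Check the current per-process mapping limit (the default is 65530 on many distros)
cat /proc/sys/vm/max_map_count

# Count how many mappings a running process is actually using
# (replace <pid> with the dotnet process id)
wc -l /proc/<pid>/maps

# Raise the limit for the current boot...
sudo sysctl -w vm.max_map_count=262144

# ...and persist it across reboots (file name is illustrative)
echo 'vm.max_map_count=262144' | sudo tee /etc/sysctl.d/99-max-map-count.conf
```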

talweiss1982 commented 3 months ago

sysctl -w vm.max_map_count=262144

Will try, thanks Oren.

talweiss1982 commented 3 months ago

@ayende it seems to have helped; I'm able to get past a 288 GB working set:

[System.Runtime]
    % Time in GC since last GC (%)                         5
    Allocation Rate (B / 1 sec)                            5.0745e+09
    CPU Usage (%)                                          16.122
    Exception Count (Count / 1 sec)                        0
    GC Committed Bytes (MB)                                337,193.353
    GC Fragmentation (%)                                   7.262
    GC Heap Size (MB)                                      308,255.841
    Gen 0 GC Budget (MB)                                   4,529
    Gen 0 GC Count (Count / 1 sec)                         2
    Gen 0 Size (B)                                         0
    Gen 1 GC Count (Count / 1 sec)                         0
    Gen 1 Size (B)                                         1.7257e+08
    Gen 2 GC Count (Count / 1 sec)                         0
    Gen 2 Size (B)                                         3.0788e+11
    IL Bytes Jitted (B)                                    818,897
    LOH Size (B)                                           2.4e+10
    Monitor Lock Contention Count (Count / 1 sec)          0
    Number of Active Timers                                2
    Number of Assemblies Loaded                            67
    Number of Methods Jitted                               9,668
    POH (Pinned Object Heap) Size (B)                      167,360
    ThreadPool Completed Work Item Count (Count / 1 sec)   2
    ThreadPool Queue Length                                0
    ThreadPool Thread Count                                12
    Time paused by GC (ms / 1 sec)                         82.403
    Time spent in JIT (ms / 1 sec)                         0
    Working Set (MB)                                       336,538.276

ayende commented 3 months ago

That being said, I would say that this is a runtime bug, in the error reporting.

talweiss1982 commented 3 months ago

> can you try setting DOTNET_GCRegionRange=10700000000 env. var to check if that makes any difference?

Now it crashed at 1.164 TB. What should the value be if I want to utilize 3 TB?

mangod9 commented 3 months ago

The config setting is the hex value in bytes, so you should be able to change it accordingly.
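To spell out that arithmetic (hedged; this assumes the value is parsed as hex bytes, as stated above):

```shell
# Interpreted as hex bytes, the earlier suggested value is about 1 TiB:
printf '%d\n' 0x10700000000                          # 1129576398848 bytes, ~1.03 TiB

# A 3 TiB range would be 3 * 2^40 bytes, written in hex:
printf '0x%X\n' $((3 * 1024 * 1024 * 1024 * 1024))   # 0x30000000000
```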

talweiss1982 commented 3 months ago

Thanks, I thought it might be a bit flag.


janvorli commented 3 months ago

> That being said, I would say that this is a runtime bug, in the error reporting.

Do you mean that the error message is not pointing to the source of the problem (which I am going to fix) or something else?

ayende commented 3 months ago

Yes, it should give a clearer error message.

talweiss1982 commented 3 months ago

I'm actually still getting OOM using these settings: DOTNET_GCRegionRange=32000000000 and sysctl -w vm.max_map_count=524240

I didn't get to 3 TB; I think it's the GC allocating a large amount of unmanaged memory. When profiling locally on a single file I saw the GC allocate 400 MB of native memory, and the number was growing rapidly. My numbers before the crash:

[System.Runtime]
    % Time in GC since last GC (%)                         25
    Allocation Rate (B / 1 sec)                            8,200
    CPU Usage (%)                                          17.025
    Exception Count (Count / 1 sec)                        0
    GC Committed Bytes (MB)                                2,278,842.536
    GC Fragmentation (%)                                   1.611
    GC Heap Size (MB)                                      2,242,855.335
    Gen 0 GC Budget (MB)                                   26,843
    Gen 0 GC Count (Count / 1 sec)                         0
    Gen 0 Size (B)                                         0
    Gen 1 GC Count (Count / 1 sec)                         0
    Gen 1 Size (B)                                         4.1449e+09
    Gen 2 GC Count (Count / 1 sec)                         0
    Gen 2 Size (B)                                         2.1622e+12
    IL Bytes Jitted (B)                                    961,600
    LOH Size (B)                                           8.918e+10
    Monitor Lock Contention Count (Count / 1 sec)          0
    Number of Active Timers                                2
    Number of Assemblies Loaded                            67
    Number of Methods Jitted                               10,877
    POH (Pinned Object Heap) Size (B)                      130,680
    ThreadPool Completed Work Item Count (Count / 1 sec)   0
    ThreadPool Queue Length                                0
    ThreadPool Thread Count                                12
    Time paused by GC (ms / 1 sec)                         0
    Time spent in JIT (ms / 1 sec)                         0
    Working Set (MB)                                       2,318,343.016

janvorli commented 3 months ago

@talweiss1982 is it still crashing with "Failed to create RW mapping for RX memory." or in some other way? Maybe 524240 is still too low for your case.

talweiss1982 commented 3 months ago

It crashes with Out of memory. I think the GC is allocating quite a bit of native memory; how can I monitor the amount of memory the GC itself is allocating?

cshung commented 3 months ago

You can use the CommittedUsage event here. The event will give you a breakdown of the large chunks of native memory the GC has acquired from the OS (which may or may not be used by managed objects yet).

If you collect a GCCollectOnly trace, you should see this show up in the Events view of PerfView.

If you get the OOM trapped in a debugger, you can also use the MAddress command to see a breakdown of all the native memory used.
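A hedged sketch of collecting such a trace with the dotnet-trace tool (assuming it is installed as a global tool; `<pid>` and the output file name are placeholders):

```shell
# Collect a lightweight GC-only trace from the running process
dotnet-trace collect --process-id <pid> --profile gc-collect --output gcinfo.nettrace

# Then open gcinfo.nettrace in PerfView and look for the GC committed-usage
# events in the Events view.
```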

janvorli commented 3 months ago

@cshung, I assume the CommittedUsage doesn't include the card tables and similar GC bookkeeping stuff, right? I guess those might be quite large for such a large heap size.

cshung commented 3 months ago

> @cshung, I assume the CommittedUsage doesn't include the card tables and similar GC bookkeeping stuff, right? I guess those might be quite large for such a large heap size.

It does: the last field is total_bookkeeping_committed, which includes everything listed here.

talweiss1982 commented 3 months ago

This time the OOM was due to me exceeding the max capacity of the dictionary. I think I'm good; I was able to load 2.4 TB of data.

mangod9 commented 3 months ago

Ok, good to know that you have been able to get things to work. Can you please confirm the config settings you had to apply to get a ~3 TB working set? We will need to look into why the GC isn't defaulting to the higher region range. Thx!

cshung commented 3 months ago

@mangod9 is right, you shouldn't have to set the DOTNET_GCRegionRange manually.

Can we take a look at what happened around here - this code should be run exactly once during process startup.

https://github.com/dotnet/runtime/blob/362a95d53d7266ee9b7452e351d07845dcc1edd2/src/coreclr/gc/gc.cpp#L48493-L48517

It should be set automatically by the formula there. Something must be wrong in that logic if a manual configuration helped with the situation.

cshung commented 1 month ago

@talweiss1982, it has been a long while, is this issue ready to be closed?

mangod9 commented 1 month ago

I believe we need to investigate why, on machines with large available memory, setting the RegionRange is required.

mangod9 commented 3 weeks ago

@talweiss1982, we are trying to investigate why you need to set an extra config. Can you please describe how you set up hugepages, as mentioned in your original post?

dotnet-policy-service[bot] commented 3 weeks ago

This issue has been marked needs-author-action and may be missing some important information.

dotnet-policy-service[bot] commented 2 days ago

This issue has been automatically marked no-recent-activity because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no-recent-activity.