Closed adamashton closed 8 months ago
Tagging subscribers to this area: @dotnet/gc See info in area-owners.md if you want to be subscribed.
Author: adamashton
Assignees: -
Labels: `tenet-performance`, `area-GC-coreclr`
Milestone: -
cc @javiercn
The dump being large could in part be https://github.com/dotnet/runtime/issues/71472. I don't expect it to be overly dramatic, but it does increase the size by tickling pages. RES seems to match up with what dotnet-counters says: your resident set is ~6 GB, about the same size as the dump. Only about 100 MB of that is the reserved memory, so there's memory somewhere outside of the GC. What's the memory dotnet-counters reports before the dump is collected?
I performed these commands after a server restart, after simulation of some users using the app but before any dotnet dumps were collected.
root@c59603927169:~# dotnet-counters monitor --refresh-interval 10 --process-id 72
[System.Runtime]
% Time in GC since last GC (%) 0
Allocation Rate (B / 10 sec) 206,992
CPU Usage (%) 0
Exception Count (Count / 10 sec) 0
GC Committed Bytes (MB) 118
GC Fragmentation (%) 3.086
GC Heap Size (MB) 96
Gen 0 GC Count (Count / 10 sec) 0
Gen 0 Size (B) 24
Gen 1 GC Count (Count / 10 sec) 0
Gen 1 Size (B) 660,216
Gen 2 GC Count (Count / 10 sec) 0
Gen 2 Size (B) 84,586,528
IL Bytes Jitted (B) 3,603,218
LOH Size (B) 7,242,400
Monitor Lock Contention Count (Count / 10 sec) 0
Number of Active Timers 14
Number of Assemblies Loaded 1,205
Number of Methods Jitted 51,118
POH (Pinned Object Heap) Size (B) 579,976
ThreadPool Completed Work Item Count (Count / 10 sec) 10
ThreadPool Queue Length 0
ThreadPool Thread Count 5
Time spent in JIT (ms / 10 sec) 0
Working Set (MB) 1,333
root@c59603927169:~# top
Tasks: 10 total, 1 running, 9 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.7 us, 2.2 sy, 0.5 ni, 91.6 id, 2.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 7817.4 total, 3784.4 free, 2611.4 used, 1421.6 buff/cache
MiB Swap: 4096.0 total, 3662.4 free, 433.6 used. 4995.6 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 30 10 5.4m 3.1m 2.8m S 0.0 0.0 0:00.29 startup.sh
36 root 30 10 13.5m 2.9m 2.1m S 0.0 0.0 0:00.00 sshd
40 root 30 10 5.4m 2.2m 1.9m S 0.0 0.0 0:00.00 bash
41 root 30 10 12.0g 66.4m 31.1m S 0.0 0.8 0:02.02 DiagServer
44 root 30 10 3535.1m 120.8m 58.1m S 0.0 1.5 0:04.29 dotnet-monitor
63 root 30 10 7.1m 2.3m 2.0m S 0.0 0.0 0:00.00 cron
72 root 30 10 4716.9m 1.2g 97.4m S 0.0 16.3 7:09.93 dotnet
392 root 30 10 13.5m 6.9m 6.0m S 0.0 0.1 0:00.07 sshd
394 root 30 10 5.6m 3.5m 3.0m S 0.0 0.0 0:00.02 bash
634 root 30 10 9.6m 3.3m 2.9m R 0.0 0.0 0:00.01 top
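To put the two outputs above side by side, here is a rough sketch that parses a few `dotnet-counters` lines and works out how much of the working set the GC's committed bytes do not explain. The helper name `parse_counters` is made up for illustration, and the sketch ignores any MB vs MiB difference between tools.

```python
import re

# Three lines copied from the dotnet-counters output in this comment.
SAMPLE = """\
GC Committed Bytes (MB) 118
GC Heap Size (MB) 96
Working Set (MB) 1,333
"""

def parse_counters(text):
    """Map each counter name to its numeric value (hypothetical helper)."""
    counters = {}
    for line in text.splitlines():
        # The counter name is everything before the trailing number.
        m = re.match(r"^(.*\S)\s+([\d,.]+)$", line.strip())
        if m:
            counters[m.group(1)] = float(m.group(2).replace(",", ""))
    return counters

counters = parse_counters(SAMPLE)
working_set_mb = counters["Working Set (MB)"]
outside_gc_mb = working_set_mb - counters["GC Committed Bytes (MB)"]
share = outside_gc_mb / working_set_mb

print(f"{outside_gc_mb:.0f} MB ({share:.0%}) of the working set is outside the GC heap")
# prints: 1215 MB (91%) of the working set is outside the GC heap
```

Even at this early point, most of the resident memory is not something the GC counters can explain.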
In addition, I ran `dotnet-counters collect` for a duration of time while simulating typical user load.
dotnet-counters collect --output dotnetcollection.csv --refresh-interval 10 --process-id 80
Results available here: dotnetcollection.csv
E.g. GC graphed over the duration. Most memory is released after use by the GC:
but dotnet holds on to the memory forever (via reserved) even though the app only needs a much smaller amount:
In addition dotnet will increase memory usage very easily until nearly all available memory is used (as seen in original comment).
I've just been reading about performance optimizations in .net 7 and I'm curious if this one might have any bearing on this.
The first interesting (I think) thing the linked blog post mentions is that 'segments' (the default in .NET 6) are large (1 GB+) areas of heap memory in server mode, whereas in workstation mode you get smaller ones by default (256 MB).
“Regions” is a feature of the garbage collector (GC) that’s been in the works for multiple years. It’s enabled by default in 64-bit processes in .NET 7 as of dotnet/runtime#64688, but as with other multi-year features, a multitude of PRs went into making it a reality. At a 30,000 foot level, “regions” replaces the current “segments” approach to managing memory on the GC heap; rather than having a few gigantic segments of memory (e.g. each 1GB), often associated 1:1 with a generation, the GC instead maintains many, many smaller regions (e.g. each 4MB) as their own entity. This enables the GC to be more agile with regards to operations like repurposing regions of memory from one generation to another. For more information on regions, the blog post Put a DPAD on that GC! from the primary developer on the GC is still the best resource.
So what are the key differences between segments and regions? Segments are large units or memory – on Server GC 64-bit if the segment sizes are 1GB, 2GB or 4GB each (for Workstation it's much smaller – 256MB) on SOH. Regions are much smaller units, they are by default 4MB each. So you might ask, "so they are smaller, why is that significant?". To answer that, let's first review how segments work.
The second interesting thing is this bit about how it's hard in practice to reclaim committed memory:
We do decommit on a segment but only the end of the segment which is after the very last live object on that segment (denoted by the light gray space at the end of each segment). And if you have pinning that prevents the GC from retracting the end of the segment, then we can only form free spaces and free spaces are always committed memory.
So I would speculate that what happens in your app, is that your GC has some large segments, it allocates a whole lot of objects in them, and eventually something gets pinned while you have a lot of garbage in the heap.
After a while you reclaim all that garbage, but the GC can't uncommit your memory.
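To make that speculation concrete, here is a toy model of a single segment. It is only a sketch of the rule quoted above (decommit happens only past the last object that cannot move), not the real GC algorithm; the `Obj` type, the sizes and the function name are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Obj:
    offset_mb: int        # where the object starts inside the segment
    size_mb: int
    live: bool
    pinned: bool = False  # pinned objects cannot be moved by compaction

def committed_after_gc(objects):
    """Committed size (MB) of the segment after a compacting GC, toy model.

    Unpinned survivors are assumed to slide down to the start of the
    segment; a pinned survivor stays where it is, so the segment end
    cannot retract below it and the gap in between stays committed.
    """
    live = [o for o in objects if o.live]
    live_total = sum(o.size_mb for o in live)
    last_pinned_end = max(
        (o.offset_mb + o.size_mb for o in live if o.pinned), default=0
    )
    return max(live_total, last_pinned_end)

def segment(pin_last):
    return [
        Obj(0, 50, live=True),                      # data the app still needs
        Obj(50, 850, live=False),                   # garbage from a big report
        Obj(900, 1, live=True, pinned=pin_last),    # small object near the end
    ]

print(committed_after_gc(segment(pin_last=False)))  # prints 51
print(committed_after_gc(segment(pin_last=True)))   # prints 901
```

In this model a single 1 MB pinned object near the end keeps about 900 MB committed, even though only 51 MB is live.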
This suggests to me a couple relatively simple experiments you could try, to see if they make a difference:
1) Use workstation GC mode. OK you've tried that one, too bad it didn't help.
2) Update to .NET 7
PS, one likely pertinent difference between Linux and Windows is that Linux doesn't reclaim idle pages the way Windows does. I.e. resident committed memory stays resident.
I have tried on .NET 8 and can confirm that the problem still exists :(
Memory Dump is 3.6 GB
C:\code\tmp\net8mem> ls
Directory: C:\code\tmp\net8mem
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 29/02/2024 14:13 3676049408 ln0sdlwk00027F_dotnet_102_core_20240229_135120_495
Which correlates to the memory being used on the Linux App Service Plan
When analysing using `dotnet-dump analyze` and running `eeheap`, I see that
Total bytes consumed by CLR: 0xac3c000 (180600832)
is only 180 MB.
And using `dumpheap` we see the summary as
Total 477,318 objects, 46,188,135 bytes
which is only 46 MB.
How do I analyze the other 3,420 MB?
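For reference, the same comparison as a small script, using only the numbers quoted in this comment. This is a sketch: the regular expressions are written against these two summary lines only, and the dump file size is treated as a rough upper bound on process memory.

```python
import re

dump_bytes = 3_676_049_408  # file size from the directory listing
eeheap_line = "Total bytes consumed by CLR: 0xac3c000 (180600832)"
dumpheap_line = "Total 477,318 objects, 46,188,135 bytes"

# eeheap prints the total both as hex and as decimal; check they agree.
m = re.search(r"(0x[0-9a-fA-F]+) \((\d+)\)", eeheap_line)
clr_bytes = int(m.group(1), 16)
assert clr_bytes == int(m.group(2))

# dumpheap counts managed objects, which live inside the heaps that
# eeheap already totals, so the two figures must not be added together.
objects_bytes = int(
    re.search(r"([\d,]+) bytes", dumpheap_line).group(1).replace(",", "")
)

unaccounted = dump_bytes - clr_bytes
print(f"CLR total:        {clr_bytes / 1e6:8.0f} MB")
print(f"Managed objects:  {objects_bytes / 1e6:8.0f} MB")
print(f"Not seen by CLR:  {unaccounted / 1e6:8.0f} MB")
```

The unexplained remainder is memory that SOS commands such as `eeheap` and `dumpheap` do not report, which points at native allocations.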
<GarbageCollectionAdaptationMode>1</GarbageCollectionAdaptationMode>
) but that hasn't changed anything for me.
The root cause of my issue had nothing to do with my app, or even dotnet, at all: it was glibc `malloc`. There is some dynamic sizing of memory allocation blocks going on in `malloc`, and when an app uses lots of memory, fragmentation occurs and the memory is only reluctantly released. A better write-up can be found here: https://github.com/dotnet/runtime/issues/13301#issuecomment-535641506
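As I understand the `mallopt(3)` man page, the "dynamic sizing" works roughly like this: when a large mmapped block is freed, glibc raises its mmap threshold to that block's size and sets the trim threshold to twice that, and explicitly setting a threshold switches the adjustment off. The following is a simplified model of that description, not glibc's actual code; the class name is invented and the 32 MiB cap is the value I believe applies to 64-bit builds.

```python
KIB = 1024
MIB = 1024 * KIB

DEFAULT_THRESHOLD = 128 * KIB   # initial mmap and trim thresholds
MMAP_THRESHOLD_MAX = 32 * MIB   # assumed cap on 64-bit builds

class MallocThresholds:
    """Toy model of glibc's dynamic mmap/trim threshold adjustment."""

    def __init__(self, dynamic=True):
        self.dynamic = dynamic
        self.mmap_threshold = DEFAULT_THRESHOLD
        self.trim_threshold = DEFAULT_THRESHOLD

    def free_mmapped_block(self, size):
        # Freeing a block bigger than the current threshold (but within
        # the cap) raises both thresholds when adjustment is enabled.
        if self.dynamic and self.mmap_threshold < size <= MMAP_THRESHOLD_MAX:
            self.mmap_threshold = size
            self.trim_threshold = 2 * size

default = MallocThresholds(dynamic=True)
pinned = MallocThresholds(dynamic=False)  # a threshold was set explicitly

for model in (default, pinned):
    model.free_mmapped_block(16 * MIB)    # e.g. a large report buffer

print(default.trim_threshold // MIB, "MiB")  # prints: 32 MiB
print(pinned.trim_threshold // KIB, "KiB")   # prints: 128 KiB
```

In the dynamic case, the top of the heap has to hold far more free memory before `malloc` hands any of it back to the OS.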
This is not just affecting me but other programs are affected outside of dotnet. E.g.,
There appear to be 2 workarounds:
1) Set `glibc.malloc.trim_threshold` to stop it being dynamic.
2) Set `glibc.malloc.arena_max`, as the default number is `8 * NumOfCores`.
More information can be found here: https://www.gnu.org/software/libc/manual/html_node/Memory-Allocation-Tunables.html
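Both tunables can be passed through the `GLIBC_TUNABLES` environment variable, which takes colon-separated `name=value` pairs and has to be set before the process starts. A minimal sketch of building that value follows; the helper name is made up, and `arena_max=2` is only an illustrative number, not a recommendation from this thread.

```python
import os

def glibc_malloc_tunables(**settings):
    """Build a GLIBC_TUNABLES value from glibc.malloc.* settings."""
    return ":".join(
        f"glibc.malloc.{name}={value}" for name, value in settings.items()
    )

# Copy the current environment and add the tunables for a child process.
env = dict(os.environ)
env["GLIBC_TUNABLES"] = glibc_malloc_tunables(trim_threshold=131072, arena_max=2)

print(env["GLIBC_TUNABLES"])
# prints: glibc.malloc.trim_threshold=131072:glibc.malloc.arena_max=2

# The environment would then be handed to the process at launch, e.g.
# subprocess.Popen(["dotnet", "MyApp.dll"], env=env)  # MyApp.dll is hypothetical
```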
Ultimately I chose to set the environment variable `MALLOC_TRIM_THRESHOLD_=131072`, which fixed the problem for me.
In Azure you can conveniently use the Configuration to do this,
Which I can confirm is set via bash,
kudu_ssh_user@ff3d088f400d:/$ env | grep MALLOC
APPSETTING_MALLOC_TRIM_THRESHOLD_=131072
MALLOC_TRIM_THRESHOLD_=131072
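The shell having the variable does not by itself prove the dotnet process was started with it. On Linux, `/proc/<pid>/environ` holds the environment a process was launched with, as NUL-separated `KEY=VALUE` records. A small sketch of reading it, with made-up function names and a sample buffer standing in for the real file:

```python
from pathlib import Path

def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE records of /proc/<pid>/environ."""
    env = {}
    for record in raw.split(b"\0"):
        key, sep, value = record.partition(b"=")
        if sep:  # skip empty or malformed records
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env

def process_environ(pid: int) -> dict:
    """Read the launch environment of a running process (Linux only)."""
    return parse_environ(Path(f"/proc/{pid}/environ").read_bytes())

# Sample data mirroring the two variables shown above.
sample = (
    b"APPSETTING_MALLOC_TRIM_THRESHOLD_=131072\0"
    b"MALLOC_TRIM_THRESHOLD_=131072\0"
)
malloc_vars = {k: v for k, v in parse_environ(sample).items() if "MALLOC" in k}
print(malloc_vars)
```

Calling `process_environ` with the PID of the dotnet process (and enough permissions to read its `/proc` entry) would show whether the setting reached the app itself.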
And now I can see memory being released in my Web App 😅,
I am seeing extremely high memory usage from my dotnet application when running under Linux. The memory is also never released. When users use my web app they do require large amounts of memory (1-2 GB to run a report) but when they close the tab resources are released and I would expect dotnet to do the same to some extent.
What I'm experiencing is near 90-95% memory usage from my dotnet app.
Problems
Background Info
Memory Diagnostics from the server
The dotnet process is using about 100 MB of memory whereas the top command shows it is reserving about 5.5 GB. I understand it reserves memory when there is no memory pressure but reserving 50x the amount needed seems a bit much?
Server and Runtime Information
Azure App Service p1v3 running Linux
The app is published before being deployed to Azure.
What have I done already
`dotnet-dump collect --type Full` and searched for any memory pressure from within my app, details below.
Memory Analysis of my App
The dotnet memory dump was 6 GB in size
but upon opening it in dotMemory I see no real memory pressure,