dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Expected behavior of gen 2 disposed finalizable objects #9630

Closed joshfree closed 3 years ago

joshfree commented 6 years ago

Opening on behalf of @ayende - issue moved from https://github.com/dotnet/corefx/issues/26659 to the coreclr repo:


I'm investigating high memory utilization in one of our systems, and I'm seeing very strange results:

Heap 0
generation 0 has 100 finalizable objects (000002809db65198->000002809db654b8)
generation 1 has 234 finalizable objects (000002809db64a48->000002809db65198)
generation 2 has 36105 finalizable objects (000002809db1e200->000002809db64a48)
Ready for finalization 0 objects (000002809db654b8->000002809db654b8)
------------------------------
Heap 1
generation 0 has 282 finalizable objects (00000280463019f0->00000280463022c0)
generation 1 has 777 finalizable objects (00000280463001a8->00000280463019f0)
generation 2 has 39371 finalizable objects (00000280462b3350->00000280463001a8)
Ready for finalization 0 objects (00000280463022c0->00000280463022c0)

The final tally is 76,869 objects in the queue. I am doing a lot to try to reuse instances, so it is not surprising that a lot of them are actually in gen 2. What is surprising to me is that in many cases, I have memory hanging around and the only reference to it is in the finalization queue.
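The per-heap tally can be double-checked with a quick script, using the counts straight from the `!fq` output above:

```python
# Sum the "finalizable objects" counts from the !fq output above.
heap0 = {"gen0": 100, "gen1": 234, "gen2": 36105}
heap1 = {"gen0": 282, "gen1": 777, "gen2": 39371}

total = sum(heap0.values()) + sum(heap1.values())
print(total)  # 76869 -- matches the tally in the text
```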

I checked the finalizer thread, and it is not blocked. In fact, it appears to be idle.

0:004> k
 # Child-SP          RetAddr           Call Site
00 000000e0`1efff2d8 00007ffc`4074796f ntdll!NtWaitForMultipleObjects+0x14
01 000000e0`1efff2e0 00007ffc`308a6361 KERNELBASE!WaitForMultipleObjectsEx+0xef
02 000000e0`1efff5e0 00007ffc`308a5de2 coreclr!FinalizerThread::WaitForFinalizerEvent+0x85 [e:\a\_work\1791\s\src\vm\finalizerthread.cpp @ 469] 
03 000000e0`1efff620 00007ffc`307fd66b coreclr!FinalizerThread::FinalizerThreadWorker+0x62 [e:\a\_work\1791\s\src\vm\finalizerthread.cpp @ 587] 
04 000000e0`1efff680 00007ffc`307fd586 coreclr!ManagedThreadBase_DispatchInner+0x43 [e:\a\_work\1791\s\src\vm\threads.cpp @ 9204] 
05 000000e0`1efff6c0 00007ffc`307fd498 coreclr!ManagedThreadBase_DispatchMiddle+0x82 [e:\a\_work\1791\s\src\vm\threads.cpp @ 9253] 
06 000000e0`1efff820 00007ffc`308a587c coreclr!ManagedThreadBase_DispatchOuter+0xb4 [e:\a\_work\1791\s\src\vm\threads.cpp @ 9492] 
07 (Inline Function) --------`-------- coreclr!ManagedThreadBase_NoADTransition+0x41 [e:\a\_work\1791\s\src\vm\threads.cpp @ 9575] 
08 (Inline Function) --------`-------- coreclr!ManagedThreadBase::FinalizerBase+0x41 [e:\a\_work\1791\s\src\vm\threads.cpp @ 9601] 
09 000000e0`1efff8d0 00007ffc`308a73fb coreclr!FinalizerThread::FinalizerThreadStart+0x9c [e:\a\_work\1791\s\src\vm\finalizerthread.cpp @ 774] 
0a 000000e0`1efff970 00007ffc`40e98364 coreclr!Thread::intermediateThreadProc+0x8b [e:\a\_work\1791\s\src\vm\threads.cpp @ 2594] 
0b 000000e0`1efffa30 00007ffc`43787091 kernel32!BaseThreadInitThunk+0x14
0c 000000e0`1efffa60 00000000`00000000 ntdll!RtlUserThreadStart+0x21

It looks like it is here: https://github.com/dotnet/coreclr/blob/master/src/vm/finalizerthread.cpp#L462

What is really strange is that I have a large number of objects that are hanging in the finalizer queue that I'm quite sure are properly disposed. And my Dispose for them includes GC.SuppressFinalize(this).

I don't have any calls to GC.ReRegisterForFinalization and I'm not patching references back to my objects from beyond the grave. I dumped the state of some of the objects, and the plot thickens, because their state shows that they have been disposed properly and that SuppressFinalize was called.

0:004> !DumpObj /d 0000027c88389c38
Name:        Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Internal.Networking.UvTcpHandle
MethodTable: 00007ffbd1c8da08
EEClass:     00007ffbd1c98d80
Size:        120(0x78) bytes
File:        D:\RavenDB\Server\Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ffc224c64e0  4001741        8        System.IntPtr  1 instance                0 handle
00007ffc224c6298  4001742       10         System.Int32  1 instance                3 _state
00007ffc224a8388  4001743       14       System.Boolean  1 instance                1 _ownsHandle
00007ffc224a8388  4001744       15       System.Boolean  1 instance                1 _fullyInitialized
00007ffbd1c8b260  400009a       18 ...ng.LibuvFunctions  0 instance 0000027b7d03da00 _uv
00007ffc224c6298  400009b       28         System.Int32  1 instance               21 _threadId
00007ffbd1bdecc0  400009c       20 ...ernal.ILibuvTrace  0 instance 0000027b7d0267b8 _log
00007ffc22457218  400009d       2c         System.Int32  1 instance                0 _handleType
00007ffbd1ca1df8  4000099       30 ...Private.CoreLib]]  0 instance 0000027b7d04da08 _queueCloseHandle
00007ffbd1c8cef0  4000098       80 ...tions+uv_close_cb  0   static 0000027b7d074468 _destroyMemory
00007ffbd1cba8b8  40000a1       38 ...Private.CoreLib]]  0 instance 0000000000000000 _listenCallback
00007ffc224aaf30  40000a2       40        System.Object  0 instance 0000000000000000 _listenState
00007ffc224b08f8  40000a3       60 ...Services.GCHandle  1 instance 0000027c88389c98 _listenVitality
00007ffbd1d2edc8  40000a4       48 ...Transport.Libuv]]  0 instance 0000000000000000 _allocCallback
00007ffbd1d2eeb8  40000a5       50 ...Private.CoreLib]]  0 instance 0000000000000000 _readCallback
00007ffc224aaf30  40000a6       58        System.Object  0 instance 0000000000000000 _readState
00007ffc224b08f8  40000a7       68 ...Services.GCHandle  1 instance 0000027c88389ca0 _readVitality
00007ffbd1c8e5c8  400009e       88 ...+uv_connection_cb  0   static 0000027b7d0560d0 _uv_connection_cb
00007ffbd1c8f070  400009f       90 ...tions+uv_alloc_cb  0   static 0000027b7d056128 _uv_alloc_cb
00007ffbd1c8f208  40000a0       98 ...ctions+uv_read_cb  0   static 0000027b7d056168 _uv_read_cb
0:004> !gcroot 0000027c88389c38
Finalizer Queue:
    0000027c88389c38
    -> 0000027c88389c38 Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Internal.Networking.UvTcpHandle

I'm using UvTcpHandle because it inherits from SafeHandle, and I assume that this one is properly used :-).

Tracing the code, this instance is disposed (its state is 3, which means that it has been closed), and when it is disposed it is marked to skip the finalizer: https://github.com/dotnet/coreclr/blob/60222780cde77346bff1eb8979846769c223f833/src/vm/safehandle.cpp#L240
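For what it's worth, the `_state` value of 3 in the dump above can be decoded against the `StateBits` layout in CoreCLR's SafeHandle (Closed = 1, Disposed = 2, reference count in the remaining bits). These flag values come from the runtime source of that era and are an implementation detail that could change:

```python
# Decode a SafeHandle _state value using the StateBits layout from
# CoreCLR's SafeHandle (Closed = 1, Disposed = 2, refcount in bits 2+).
STATE_CLOSED = 0x1
STATE_DISPOSED = 0x2

def decode_state(state):
    return {
        "closed": bool(state & STATE_CLOSED),
        "disposed": bool(state & STATE_DISPOSED),
        "refcount": state >> 2,  # remaining bits hold the reference count
    }

print(decode_state(3))  # state 3 from the dump: closed, disposed, refcount 0
```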

So the question is, what is it still doing in the finalizer queue?

I expected that when this is called (or the more usual GC.SuppressFinalize), the object would be removed from this queue.

In particular, I think I'm asking about the expected behavior of the system. At what point will the GC collect such instances? And what is the expected state of the system in such a scenario?


Another note here. Some of the objects that I would expect to be removed from the finalizer queue are holding a reference to a 256KB byte array, so it lives on the LOH; might that be related? OTOH, the other instances are not holding LOH objects...


Okay, I think I had a major misunderstanding on my part. I expected SuppressFinalize to remove the object from the finalization queue, but it doesn't do that; it only sets a bit saying that the Finalize method shouldn't be called. Given that, the rest of the behavior is quite clear.

One thing that still remains a question for me is the expected behavior of the rest of the system. We have a lot of items in the finalization queue; many of them are only reachable via the queue, and many are marked as not needing their finalizers run. The question is: when will the finalizer queue start removing these entries?

I assume that this happens as part of a GC run? In that case, given that we try to write code that pools and doesn't allocate too much, this is going to be deferred. So we essentially created a lot of garbage, then avoided creating more, so the garbage man never comes to clean it up?


Maoni0 commented 6 years ago

the output for !fq has 2 parts - "finalizable objects" and "Ready for finalization objects". all "finalizable objects" says is that it's a finalizable object; since GC has not looked at those objects, it doesn't know whether they are dead or not, so it hasn't marked them as "Ready for finalization". if you haven't done a gen2 GC for a while, then your finalizable objects in gen2 are not going away because GC hasn't looked at them; when you do do a gen2 GC, if there are indeed no user references to them, they will be "ready for finalization", which means as soon as the GC is over, the finalizer thread will pick them off the queue and run their finalizers (and then next time a gen2 GC happens, they will be collected).
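The lifecycle described above can be sketched as a toy model (hypothetical names, not the actual CLR data structures): SuppressFinalize only flips a bit, and an object stays registered until a GC of its generation finds it dead and promotes it to the "ready" segment, where the finalizer thread skips suppressed entries.

```python
# Toy model of the finalization queue semantics described above.
# Not the real CLR implementation -- just the observable behavior:
# SuppressFinalize sets a bit; the entry stays registered until a GC
# of the object's generation finds it dead and promotes it to "ready".

class FinalizationQueue:
    def __init__(self):
        self.registered = []  # entries: name, gen, suppressed, reachable
        self.ready = []

    def register(self, name, gen=0):
        self.registered.append({"name": name, "gen": gen,
                                "suppressed": False, "reachable": True})

    def suppress_finalize(self, name):
        # Only flips a bit -- the entry is NOT removed from the queue.
        for e in self.registered:
            if e["name"] == name:
                e["suppressed"] = True

    def drop_user_references(self, name):
        for e in self.registered:
            if e["name"] == name:
                e["reachable"] = False

    def gc(self, max_gen):
        # A GC of generation N only examines finalizable objects in
        # gens <= N; dead ones move to "ready for finalization".
        survivors = []
        for e in self.registered:
            if e["gen"] <= max_gen and not e["reachable"]:
                self.ready.append(e)
            else:
                survivors.append(e)
        self.registered = survivors

    def run_finalizer_thread(self):
        # Drains "ready"; suppressed entries are skipped, not finalized.
        finalized = [e["name"] for e in self.ready if not e["suppressed"]]
        self.ready.clear()
        return finalized

fq = FinalizationQueue()
fq.register("handle", gen=2)
fq.suppress_finalize("handle")     # Dispose() ran
fq.drop_user_references("handle")  # only the queue references it now

fq.gc(max_gen=1)                   # gen0/gen1 GCs never look at it
print(len(fq.registered))          # 1 -- still "in the queue", as observed

fq.gc(max_gen=2)                   # a gen2 GC finally examines it
print(fq.run_finalizer_thread())   # [] -- suppressed, so no finalizer runs
```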

albigi commented 6 years ago

We are seeing a very similar situation in a Service Fabric cluster. Each node runs several apps (ASP.NET Core APIs) relying on Kestrel. When the cluster nodes are brought live with several thousands of users hitting them, we observe a constant memory increase within all the APIs. The node eventually becomes almost unresponsive as all the physical memory is used and the paging file almost doubles the RAM (20GB x 2). The APIs are completely stateless and data is disposed after use (so it is surprising to see single APIs consuming up to 8-9GB each).

Data we managed to collect so far in the dev environment does not show the application under such a stressful workload; however, we could observe a similar memory usage pattern. What concerns us most right now is that Kestrel Libuv handles seem to keep alive tons of other objects (via Connection and HttpContext refs), which is causing the high memory.

A very remarkable difference is obtained by running the apps with the WKS vs SRV GC. Indeed, SRV GC seems to be way too conservative even when memory pressure rises. WKS GC, on the other hand, runs much more often (as expected) but definitely helps prevent lots of garbage from sticking around.

Some numbers

============================================================= SRV mode

MEM_COMMIT  2693  0`349e9000 ( 841.910 MB)
MEM_RESERVE  719  2`13a7b000 (   8.307 GB)
Total 1,198,950 Object(s), Total Size: 157.19 MB, Free Objects 28,381 (157.12 MB)

Finalizable objs:
Heap 0
generation 0 has 6367 finalizable objects (000000002d7ae118->000000002d7ba810)
generation 1 has 25 finalizable objects (000000002d7ae050->000000002d7ae118)
generation 2 has 1538 finalizable objects (000000002d7ab040->000000002d7ae050)
Ready for finalization 0 objects (000000002d7ba810->000000002d7ba810)
Heap 1
generation 0 has 12616 finalizable objects (000000002df20fa8->000000002df399e8)
generation 1 has 19 finalizable objects (000000002df20f10->000000002df20fa8)
generation 2 has 2010 finalizable objects (000000002df1d040->000000002df20f10)
Ready for finalization 0 objects (000000002df399e8->000000002df399e8)
Statistics for all finalizable objects (including all objects ready for finalization): Total 22575 objects

Most consuming allocs:
Total : 157 MB (100.00%)
System.Byte[] : 42 MB (26.69%)
System.String : 41 MB (26.03%)
System.Char[] : 21 MB (13.55%)
System.Diagnostics.Tracing.EventSource+EventMetadata[] : 4.0 MB (2.54%)

Counter                  Value
======================   ===========
Bytes in All Heaps       350,097,000
GCHandles                20,147
**GEN 0 Collections      462**
**GEN 1 Collections      367**
**GEN 2 Collections      149**
no of Induced GCs        0
no of Pinned Objects     135
Sync Blocks in use       813
**Finalization Survivors 155**
**Total Committed Bytes  401,125,376**
**Total Reserved Bytes   9,126,789,120**
GEN 0 Heap Size          107,369,520
GEN 1 Heap Size          6,858,864
GEN 2 Heap Size          43,178,448
LOH Size                 192,690,168
% Time in GC             0.84%

============================================================= WKS mode

MEM_COMMIT  2713  0`22b52000 ( 555.320 MB)
MEM_RESERVE  724  0`1f906000 ( 505.023 MB)
Total 574,746 Object(s), Total Size: 77.09 MB, Free Objects 26,665 (103.91 MB)

Finalizable objs:
generation 0 has 3 finalizable objects (000000002479fac0->000000002479fad8)
generation 1 has 0 finalizable objects (000000002479fac0->000000002479fac0)
generation 2 has 4004 finalizable objects (0000000024797da0->000000002479fac0)
Ready for finalization 0 objects (000000002479fad8->000000002479fad8)
Statistics for all finalizable objects (including all objects ready for finalization): Total 4007 objects

Most consuming allocs:
Total : 77 MB (100.00%)
System.Byte[] : 35 MB (45.85%)
System.String : 5.1 MB (6.66%)
System.Diagnostics.Tracing.EventSource+EventMetadata[] : 4.0 MB (5.17%)
System.RuntimeType : 2.5 MB (3.20%)
System.Reflection.RuntimeMethodInfo : 2.3 MB (2.96%)

Counter                  Value
======================   ===========
Bytes in All Heaps       196,083,200
GCHandles                20,559
**GEN 0 Collections      5,819**
**GEN 1 Collections      3,650**
**GEN 2 Collections      676**
no of Induced GCs        0
no of Pinned Objects     46
Sync Blocks in use       251
**Finalization Survivors 3**
**Total Committed Bytes  195,694,592**
**Total Reserved Bytes   536,858,624**
GEN 0 Heap Size          6,291,456
GEN 1 Heap Size          33,776
GEN 2 Heap Size          55,698,512
LOH Size                 134,059,456
% Time in GC             2.58%

=============================================================

What we are trying next is to run the app with **gcTrimCommitOnLowMemory** set. Expected behavior? Is it just that the GC does not think a gen2 collection is needed, or are we possibly doing something wrong?
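For reference, these are the knobs in question. The sketch below is the documented .NET Framework app.config form; on .NET Core of that era the equivalent settings were typically supplied through COMPlus_* environment variables (e.g. COMPlus_gcServer=0), and the exact names should be checked against the runtime docs for your version:

```xml
<!-- .NET Framework app.config sketch; on .NET Core the equivalents are
     COMPlus_* environment variables (e.g. COMPlus_gcServer=0). -->
<configuration>
  <runtime>
    <!-- workstation GC instead of server GC -->
    <gcServer enabled="false" />
    <!-- release committed memory back to the OS under memory pressure -->
    <gcTrimCommitOnLowMemory enabled="true" />
  </runtime>
</configuration>
```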
benaadams commented 6 years ago

Related "Way to increase x64 GC aggressiveness? (as per x86)" https://github.com/dotnet/coreclr/issues/11338

Maoni0 commented 6 years ago

you are doing gen2s based on the perf counter value. of course the perf counter gen2 collection value doesn't tell you whether it's a blocking gen2 or a background one. you can tell by looking at the total committed bytes right after a gen2 - if it drops a lot, it means you are doing blocking gen2s, and if you are, it simply means you just have that much memory that needs to be kept alive - unless your finalization survivors actually hold onto the majority of the memory (which would be very bad to begin with). the # of finalization survivors does not tell us how much memory actually survived due to finalization. an ETW trace would be a lot more useful.
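The "look at committed bytes right after a gen2" heuristic can be expressed as a trivial check. The 20% threshold is an arbitrary illustration, not a runtime value, and the sample inputs are just the committed-bytes figures from the earlier comment:

```python
# Illustrative sketch of the diagnostic above: if total committed bytes
# drop sharply right after a gen2 GC, it was likely a blocking gen2.
# The 20% threshold is an arbitrary choice for this example.
def looks_like_blocking_gen2(committed_before, committed_after, drop_ratio=0.2):
    dropped = committed_before - committed_after
    return dropped / committed_before >= drop_ratio

# Sample inputs only, taken from the SRV-mode numbers above:
print(looks_like_blocking_gen2(401_125_376, 195_694_592))  # True: ~51% drop
print(looks_like_blocking_gen2(401_125_376, 390_000_000))  # False: ~3% drop
```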

albigi commented 6 years ago

thanks @Maoni0. we eventually ran a PerfView trace and tracked this down to a leak of custom objects hooked up to System.ServiceModel and never released after performing WCF client calls. bottom line: the memory consumption was actually legitimate, and gen2 survivors were starving the whole heap, eventually causing very bad performance due to frequent GCing.

mangod9 commented 3 years ago

It appears this has been answered hence closing this issue. Please reopen if further clarification is required.