Closed christianscheuer closed 6 years ago
I added manual tracing code to my code equal to the following:
Console.WriteLine("Step 1: " + (Thread.CurrentThread.ManagedThreadId == Environment.CurrentManagedThreadId));
and was able to verify that this changes from True to False. This happens consistently in my code on every run, but at different places. In one instance it changes after a new Dictionary<...>
, which leads me to suspect GC is involved.
In that repro, no other threading or task related code is run.
It consistently works on CoreCLR (stays True).
I added a while(true) in my Main so that I could sample the process in Activity Monitor.
This is how it looks after the Environment.CurrentManagedThreadId change. It looks like my Main method (after returning from the code that because of GC or something else makes the Environment.CurrentManagedThreadId change) is still actually running on the main thread, indicating that Thread.CurrentThread.ManagedThreadId
is right by saying it's still 1
, whereas Environment.CurrentManagedThreadId
is wrong (now saying 2
not 6
as mentioned in the OP because I simplified the project to make a cleaner repro).
Call graph:
2465 Thread_1605348 DispatchQueue_1: com.apple.main-thread (serial)
+ 2465 start (in libdyld.dylib) + 1 [0x7fffcd7c0235]
+ 2465 main (in sfbackend) + 169 [0x100071c69] main.cpp:376
+ 2465 __managed__Main (in sfbackend) + 268 [0x1009faaa0]
+ 2465 sfbackend_SoundFlow_Shortcuts_Program__Main (in sfbackend) + 404 [0x10060b8f8]
2465 Thread_1605351
+ 2465 start_wqthread (in libsystem_pthread.dylib) + 13 [0x7fffcd9d907d]
+ 2465 _pthread_wqthread (in libsystem_pthread.dylib) + 1023 [0x7fffcd9d948e]
+ 2465 __workq_kernreturn (in libsystem_kernel.dylib) + 10 [0x7fffcd8ef44e]
2465 Thread_1605355
2465 thread_start (in libsystem_pthread.dylib) + 13 [0x7fffcd9d908d]
2465 _pthread_start (in libsystem_pthread.dylib) + 286 [0x7fffcd9d9887]
2465 _pthread_body (in libsystem_pthread.dylib) + 180 [0x7fffcd9d993b]
2465 FinalizerStart(void*) (in sfbackend) + 69 [0x100073885] FinalizerHelpers.cpp:65
2465 ProcessFinalizers (in sfbackend) + 42 [0x100739f7e]
2465 RhpWaitForFinalizerRequest (in sfbackend) + 55 [0x1000739c7] PalRedhawkFunctions.h:172
2465 UnixEvent::Wait(unsigned int) (in sfbackend) + 331 [0x1000bd74b] PalRedhawkUnix.cpp:329
2465 _pthread_cond_wait (in libsystem_pthread.dylib) + 712 [0x7fffcd9da7fa]
2465 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fffcd8eebf2]
Environment.CurrentManagedThreadId
is simple method - it just calls ManagedThreadId.GetCurrentThreadId()
that just reads value from static. It compiles into a series of pointer dereferences. You can try single stepping through it in disassembly when it is returning good value, and then single stepping through again when it is returning wrong value. Comparing the two traces may give you clues what's wrong.
It is likely a problem with GC reporting again. Same as https://github.com/dotnet/corert/issues/5922#issuecomment-396008232 . It may be even the same underlying problem.
Thanks, @jkotas .
Stepping through the good and the bad case, it seems to take the same code paths. (I'm not hitting the Environment.CurrentManagedThreadId the first time it's changed, so not sure how much it's gonna help). But the following thread static call in ManagedThreadId simply returns 2
in the bad case , 1
in the good case (after derefencing eax+0x10, ie. getting access to the ):
__GetThreadStaticBase_S_P_CoreLib_System_Threading_ManagedThreadId
I'm thinking that maybe the finalizer is called on ManagedThreadId so the Id gets recycled.. How can
I add a breakpoint to RecycleId or the finalizer? If I do breakpoint set -r ManagedThreadId I don't seem to catch the finalizer nor the RecycleId (I'm guessing it's inlined). This is with
sfbackend`S_P_CoreLib_System_Threading_ManagedThreadId__get_Current:
0x1001698d8 <+0>: pushq %rbp
0x1001698d9 <+1>: subq $0x10, %rsp
0x1001698dd <+5>: leaq 0x10(%rsp), %rbp
0x1001698e2 <+10>: xorl %eax, %eax
0x1001698e4 <+12>: movl %eax, -0x4(%rbp)
0x1001698e7 <+15>: callq 0x10000a3e9 ; __GetThreadStaticBase_S_P_CoreLib_System_Threading_ManagedThreadId
0x1001698ec <+20>: movl 0x10(%rax), %eax
-> 0x1001698ef <+23>: movl %eax, -0x4(%rbp)
0x1001698f2 <+26>: cmpl $0x0, -0x4(%rbp)
0x1001698f6 <+30>: jne 0x100169904 ; <+44>
0x1001698f8 <+32>: callq 0x100169930 ; S_P_CoreLib_System_Threading_ManagedThreadId__MakeForCurrentThread
0x1001698fd <+37>: nop
0x1001698fe <+38>: leaq (%rbp), %rsp
0x100169902 <+42>: popq %rbp
0x100169903 <+43>: retq
0x100169904 <+44>: movl -0x4(%rbp), %eax
0x100169907 <+47>: leaq (%rbp), %rsp
0x10016990b <+51>: popq %rbp
0x10016990c <+52>: retq
0x10016990d <+53>: nop
0x10016990e <+54>: nop
0x10016990f <+55>: nop
I was able to set a breakpoint on the ManagedThreadId constructor (not sure why many of these SPCorelib function names are not found in lldb, had to use external disassembler).
So we can see MakeForCurrentThread gets called again, which leads to the wrong id assignment. Debugging it, I found out that the call to: __GetThreadStaticBase_S_P_CoreLib_System_Threading_ManagedThreadId
returns a new address for the Thread Static Base when the error is occuring. The thread static base + 0x10 (I'm guessing this is t_currentManagedThreadId) is 0 in the new thread static base, which then leeds to the call to MakeForCurrentThread.
Not sure how internals of tls are implemented in CoreRT. Please let me know if I can help further.
This is my backtrace when the ManagedThreadId gets recreated:
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 5.1
* frame #0: 0x00000001001699ac sfbackend`S_P_CoreLib_System_Threading_ManagedThreadId___ctor
frame #1: 0x0000000100169952 sfbackend`S_P_CoreLib_System_Threading_ManagedThreadId__MakeForCurrentThread + 34
frame #2: 0x00000001001698fd sfbackend`S_P_CoreLib_System_Threading_ManagedThreadId__get_Current + 37
frame #3: 0x00000001001f6d2d sfbackend`S_P_CoreLib_Internal_Runtime_Augments_EnvironmentAugments__get_CurrentManagedThreadId + 9
frame #4: 0x000000010008ee0d sfbackend`System_Runtime_Extensions_System_Environment__get_CurrentManagedThreadId + 9
I tested with a debug build of CoreRT (same issue as in release build) and then tried calling RhpEnableConservativeStackReporting()
where CPPCODEGEN
would've done it.
That unfortunately does not fix the issue, gives exact same behavior. So I don't know if that rules out GC as a root cause, or if even conservative stack reporting can have an influence on the thread static base being wrong.
In the GetThreadStaticBaseForType
helper, I found out that the code always takes the path as shown (eg. the early return storage[typeTlsIndex]
), so the issue does not come from the TSS being re allocated in this method.
The RhPrintInt
is just a helper method I put in to be able to log stuff to the console.
int moduleIndex = pModuleData->ModuleIndex;
object[] storage = (object[])RuntimeImports.RhGetThreadStaticStorageForModule(moduleIndex);
// Check whether thread static storage has already been allocated for this module and type.
if ((storage != null) && (typeTlsIndex < storage.Length) && (storage[typeTlsIndex] != null))
{
if (typeTlsIndex == 7) //managedthreadid
{
object st = storage[typeTlsIndex];
RuntimeImports.RhPrintInt(typeTlsIndex);
}
return storage[typeTlsIndex];
}
Was able to set a watchpoint on the address in memory where the original value of t_currentManagedThreadId was stored. Since that is stored in what looks like a managed object (the object[] ), I think it's likely that this object gets moved during GC but something goes wrong in that process - since subsequent reads of the t_currentManagedThreadId
member in ManagedThreadId's ThreadStaticBase returns 0. Really out of my comfort zone here though.
Here's the stacktrace from the watchpoint write.
thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint 1
* frame #0: 0x000000010003668c sfbackend`WKS::memcopy(dmem="\x04", smem="\x04", size=96) at gc.cpp:1846
frame #1: 0x00000001000c81fb sfbackend`WKS::gc_heap::compact_plug(unsigned char*, unsigned long, unsigned int, WKS::gc_heap::compact_args*) [inlined] WKS::gc_heap::gcmemcopy(dest="\x10\x94\x87", src="\x10\x94\x87", len=128, copy_cards_p=YES) at gc.cpp:24464
frame #2: 0x00000001000c8056 sfbackend`WKS::gc_heap::compact_plug(plug="\x10\x94\x87", size=128, check_last_object_p=NO, args=0x00007fff5fbfbf50) at gc.cpp:24561
frame #3: 0x00000001000c9b67 sfbackend`WKS::gc_heap::compact_in_brick(tree="\x90Ԁ", args=0x00007fff5fbfbf50) at gc.cpp:24691
frame #4: 0x00000001000c981c sfbackend`WKS::gc_heap::compact_in_brick(tree="@/\x80", args=0x00007fff5fbfbf50) at gc.cpp:24662
frame #5: 0x00000001000c9e27 sfbackend`WKS::gc_heap::compact_in_brick(tre frame #5: 0x00000001000c9e27 sfbackend`WKS::gc_heap::compact_in_brick(tre frame #5: 0x00000001000c9e27 sfbackend`WKS::gc_heap::compact_in_brick(tre frame #5: 0x00000001000c9e27 sfbackend`WKS::gc_heap::compact_in_brick(tree="\x10\x94\x87", args=0x00007fff5fbfbf50) at gc.cpp:24706
frame #6: 0x00000001000c9e27 sfbackend`WKS::gc_heap::compact_in_brick(tree="?\x9c\x87", args=0x00007fff5fbfbf50) at gc.cpp:24706
frame #7: 0x00000001000c981c sfbackend`WKS::gc_heap::compact_in_brick(tree="\x10\x94\x87", args=0x00007fff5fbfbf50) at gc.cpp:24662
frame #8: 0x00000001000b9131 sfbackend`WKS::gc_heap::compact_phase(condemned_gen_number=1, first_condemned_address="\x90b\x8a", clear_cards=NO) at gc.cpp:24848
frame #9: 0x000000010008174f sfbackend`WKS::gc_heap::plan_phase(condemned_gen_number=1) at gc.cpp:22631
frame #10: 0x000000010006fbe5 sfbackend`WKS::gc_heap::gc1() at gc.cpp:15273
frame #10: 0x000000010006fbe5 sfbackend`WKS::gc_heap::gc1() at gc.cpp:15273
frame #11: 0x000000010008e3bf sfbackend`WKS::gc_heap::garbage_collect(n=0) at gc.cpp:16861
frame #12: 0x000000010005ff40 sfbackend`WKS::GCHeap::GarbageCollectGeneration(this=0x0000000101704440, gen=0, reason=reason_alloc_soh) at gc.cpp:35196
frame #13: 0x0000000100065380 sfbackend`WKS::gc_heap::try_allocate_more_space(acontext=0x0000000101707150, size=24, gen_number=0) at gc.cpp:13074
frame #14: 0x0000000100065543 sfbackend`WKS::gc_heap::allocate_more_space(acontext=0x0000000101707150, size=24, alloc_generation_number=0) at gc.cpp:13370
frame #15: 0x00000001000e8434 sfbackend`WKS::GCHeap::Alloc(gc_alloc_context*, unsigned long, unsigned int) at gc.cpp:13401
frame #16: 0x00000001000e8300 sfbackend`WKS::GCHeap::Alloc(this=0x0000000101704440, context=0x0000000101707150, size=24, flags=0) at gc.cpp:34542
frame #17: 0x0000000100016be3 sfbackend`::RhpGcAlloc(pEEType=0x0000000100807f30, uFlags=0, cbSize=24, pTransitionFrame=0x00007fff5fbfe180) at gcrhenv.cpp:273
frame #18: 0x00000001001134dd sfbackend`RhpNewObject at AllocFast.S:85
frame #19: 0x00000001003ed68f sfbackend`S_P_CoreLib_System_Span_1<Char>___ctor + 91
This is with a call to RhpEnableConservativeStackReporting();
in static int InitializeRuntime
in main.cpp
.
I think you are getting closer to find the root cause of the problem!
The storage for thread statics is created by Thread::SetThreadStaticStorageForModule
. This method allocates the GC handle using RhpHandleAlloc
that is supposed to keep the Object[]
alive and that is supposed to be update by the GC when the Object[]
moves.
It looks like that there is something wrong with this handle. Could you please check whether it got updated by the GC when the Object[]
moved?
One possible explanation of the problem is that the handle is freed prematurely somehow. You can try adding tracing to Thread::SetThreadStaticStorageForModule
and to RhHandleFree
to check for this possibility.
@jkotas thanks, great that I'm not completely off path at least :) I got to the same conclusion. It definitely seems like the RhpHandleAlloc call has no effect. I tried commenting out the RhHandleFree calls, and tried changing the handle type being allocated to a pinned reference. I even tried pinning it via a second call in the ThreadStatics.cs. I also tried calling RhHandleGet in favor of the Object** dereference, but all turned out the same way. Any further ideas? I'll definitely try to add some tracing as well in the C++ code instead of having to do it all in lldb, might reveal more info quicker that way.
For the handle being updated, I am pretty sure it gets updated, since one of my initial findings was that the address had changed. But maybe the handle is updated to point to a wrong place. Should I add tracing around the place in gc/objecthandle where the handle is supposed to be updated to point to the moved object? Also I was wondering, is it possible that it's the objects being pointed to by the object[] that get wrongly moved / not updated?
It is possible.
All object reference updates when the GC moves things around go through GCHeap::Relocate
so you can try adding a logging there.
It would be best to have the stresslog for this that I have mentioned on the other issue. All interesting places in GC where the objects are scanned and updated are instrumented using stresslog. GCHeap::Relocate
is instrumented using it as well. If you prefer debugging using printf, you may be able replace the STRESS_LOG
macros with printfs in your local build.
Wonderful, thanks @jkotas. I wasn't against using the stress log, just never got around to wrapping my head around it. The instructions for the stress log mention sos on Windows. How would I go about dumping it from lldb on macOS? Or would you say it would be easier to simply alter the macros to printf?
Never mind, printf was easy! I'll attach a log as soon as possible.
dumping it from lldb on macOS
lldb that comes with XCode cannot load the sos plugin. It is necessary to build lldb from sources: https://github.com/dotnet/coreclr/blob/master/Documentation/building/debugging-instructions.md#debugging-coreclr-on-os-x
Never mind, printf was easy! I'll
👍
I switched out some STRESSLOG macros with printf, added some instrumentation here and there switched on conservative stack reporting and added a few pinning handles that never get freed as part of my tests. See complete delta here: https://github.com/christianscheuer/corert/tree/stresslog
The repro code itself is a recursive text analyzing algorithm. It is instrumented with calls that check if Console.WriteLine("PEM.AddNodeItemEntry 1: " + (Thread.CurrentThread.ManagedThreadId == Environment.CurrentManagedThreadId)); The first static part is just an identifier that tracks in which method the trace is from. So the log contains a lot of "...: True" until it suddenly becomes "...: False". That is where it's interesting.
Furthermore I have added some tracing to S/GetThreadStaticStorageForModule
so we can track the handle.
The last good state is right before a generation 2 GC at line 834681 in the log.
GetThreadStaticStorageForModule 421363 threadStaticsStorageHandle = 0x102c32848
GetThreadStaticStorageForModule 421363 *threadStaticsStorageHandle = 0x10f01e1f0
the 421363
is the native thread id.
Under the GcScanHandles (Promotion Phase = 1)
phase the following is printed:
GCHeap::Promote: Promote GC Root *0x102c32848 = 0x10f01e1f0 MT = 0x100876f00T
so seems like the handle does get promoted (it also seems to be present in earlier GC scans).
But then a little later in line 835765 it seems like my code is still running while the GC is running (root issue maybe), and that it transitions from True to False within that (ie. maybe the thread is not correctly hijacked/paused). Not sure if there's some contention about getting access to the console out and if that plays in.
GCHeap::Promote: Promote GC Root *0x102c328f0 = 0x10efc8908 MT = 0x10089b720T
GCHeap::Promote: Promote GC Root *0x102c328f8 = 0x10efc88e8 MT = 0x10089b720T
GCHeap::Promote: Promote GC Root *0x102c32900 = 0x10eIndexItemCharacter 1: True
IndexItemCharacter 2: False
fc88d0 MT = 0x10089b698T
GCHeap::Promote: Promote GC Root *0x102c32908 = 0x10efc88b8 MT = 0x10089b698T
GCHeap::Promote: Promote GC Root *0x102c32910 = 0x10efc88a0 MT = 0x10089b698T
GCHeap::Promote: Promote GC Root *0x102c32918 = 0x10efc8888 MT = 0x10089b698T
GCHeap::Promote: Promote GC Root *0x102c32920 = 0x10efc8870 MT = 0x10089b698T
The end of this GC reads:
GC Heap 0x0
Generation 2 [0x10efc3000, 0x0] cur = 0x0
Segment mem 0x100720008 alloc = 0x100763f78 used 0x100763f78 committed 0x100763f78
Segment mem 0x10efc3000 alloc = 0x10f177380 used 0x10fd3d5b0 committed 0x10fd44000
Generation 1 [0x10f149108, 0x0] cur = 0x0
Segment mem 0x10efc3000 alloc = 0x10f177380 used 0x10fd3d5b0 committed 0x10fd44000
Generation 0 [0x10f177368, 0x0] cur = 0x0
Segment mem 0x10efc3000 alloc = 0x10f177380 used 0x10fd3d5b0 committed 0x10fd44000
========== ENDGC 102 (gen = 2, collect_classes = 0) ===========}
GetThreadStaticStorageForModule 421363 threadStaticsStorageHandle = 0x102c32848
GetThreadStaticStorageForModule 421363 *threadStaticsStorageHandle = 0x10f01e1f0
GetThreadStaticStorageForModule 421363 threadStaticsStorageHandle = 0x102c32848
GetThreadStaticStorageForModule 421363 *threadStaticsStorageHandle = 0x10f01e1f0
GetThreadStaticStorageForModule 421363 threadStaticsStorageHandle = 0x102c32848
GetThreadStaticStorageForModule 421363 *threadStaticsStorageHandle = 0x10f01e1f0
GetThreadStaticStorageForModule 421363 threadStaticsStorageHandle = 0x102c32848
GetThreadStaticStorageForModule 421363 *threadStaticsStorageHandle = 0x10f01e1f0
GetThreadStaticStorageForModule 421363 threadStaticsStorageHAddNodeItem start: False
AddNodeItem 1: False
AddNodeItem 2: False
OnItemWordIndexed start: False
OnItemWordIndexed 2: False
What we can also see here is that the handle's address is not updated (!)
That handle was originally alloc'ed in line 4027, and no calls to RhHandleFree or DestroyHandle for that handle occur in the log. Initial alloc line 4026:
CreateHandle: 0x102c32848, type=2
SetThreadStaticStorageForModule 421363: calls RhpHandleAlloc. Handle = 0x102c32848, pStorage = 0x10f01e1f0
SetThreadStaticStorageForModule 421363: calls RhHandleFree on 0x102c32850
Here's a link to the complete, raw log https://drive.google.com/open?id=1rjjA0JHWWSs3J1fn4LQTnEWeveb0XtPe
Please let me know if I should make any adjustments to this in order to reveal more information.
it seems like my code is still running while the GC is running
I think it is just a problem with output buffering. You can try to add fflush(stdout)
at the end of the GC and see whether it goes away.
So far, the simplest explanation for the symptoms is that some part of the thread static object graph is not getting marked by the GC and the GC considers it garbage. Here are some ideas what to try:
Turn on heap verification by export RH_HeapVerify=1
. This will turn on GC heap integrity checks before and after the GC. I expect that this integrity check should catch this problem even faster than your tracing. It would be useful to confirm it.
Add extra tracing to GC to verify that the whole graph from the GCHandle is getting marked correctly. Add a line like printf ("*%p*, pMT=%p, newly marked: %d\n", (size_t)o, *(void**)o, marked);
to BOOL gc_heap::gc_mark1 (uint8_t* o)
method in gc.cpp. When I have done this in a simple program that just prints Environment.CurrentManagedThreadId
, I see that the thread static object graph is getting marked correctly:
*00000218D2746B00*, pMT=00007FF62C0E6491, newly marked: 1 // pMT is __Array<Object>::`vftable'
*00000218D2746B60*, pMT=00007FF62C0EEEC9, newly marked: 1 // pMT is _GCStaticEEType_01::`vftable'
*00000218D2746BA8*, pMT=00007FF62C09D051, newly marked: 1 // pMT is S_P_CoreLib_System_Threading_ManagedThreadId::`vftable':
GCHeap::Promote: Promote GC Root *00000218EABC2218 = 00000218D2746B00 MT = 00007FF62C0E6490T <- this is the threadStaticsStorageHandle GCHandle
Could you please find out which link of the chain from the GCHandle to the ManagedThreadId object is broken after the GC? The chain should be: GCHandle
-> object[]
-> GCStaticEEType
-> System.Threading.ManagedThreadId
. You can do this by setting a breakpoint at RhGetThreadStaticStorageForModule and stepping through the code. When the GCHandle is fetched, you should see get object with object[]
EEType. When the element of this array is fetched, you should get object with GCStaticEEType. When the field of this object is fetched, you should get object with ManagedThreadId.
Revert the change that adds RhpEnableConservativeStackReporting. We know that it is not helping, and it just adds a lot of clutter to the stack scanning.
It is interesting that the offending GC is the first GC with "requested generation = 2". You may be able to use this condition to turn on verbose tracing to reduce the size of the log if it becomes too hard to manage.
Thanks for helping us to trace this problem down!
Thanks for keeping with me, @jkotas ! :)
problem with output buffering. You can try to add
fflush(stdout)
You're right, it was just the output buffering. fflush
helped confirm that. I introduced flushing after most of the logging statements now, makes many things clearer now.
Revert the change that adds RhpEnableConservativeStackReporting
Done
t is interesting that the offending GC is the first GC with "requested generation = 2".
I thought this too, but with the changes and new runs, it now happens first time in a gen = 1 GC.
export RH_HeapVerify=1
I'm not sure I'm seeing the effects of this. Tried both running this in Terminal before building ILC and before invoking dotnet publish -r osx-x64
, doesn't seem to make much of a difference in the output I'm getting.
I haven't gotten around to stepping through and adding the more low level tracing yet.
But did another log with these new parameters. Log file here:
https://drive.google.com/file/d/1ukeZfhQYdWRt4SkVTpbTebudziC_ZCDE/view?usp=sharing
What strikes me now is this:
First we get a handle at 0x101732850 pointing to 0x1114fffe0
CreateHandle: 0x101732850, type=2
SetThreadStaticStorageForModule 100384: calls RhpHandleAlloc. Handle = 0x101732850, pStorage = 0x1114fffe0
Then it gets destroyed and a new one is allocated 0x101732848->0x111534d70:
SetThreadStaticStorageForModule 100384: calls RhpHandleAlloc. Handle = 0x101732848, pStorage = 0x111534d70
SetThreadStaticStorageForModule 100384: calls RhHandleFree on 0x101732850
DestroyHandle: *0x101732850->0x1114fffe0
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x111534d70
Walking through the new log I now see this handle gets promoted in BEGINGC 2, line 16916:
GCHeap::Promote: Promote GC Root *0x101732848 = 0x111534d70 MT = 0x100875cd0T
It is also relocated:
GC Root 0x101732848 RELOCATED 0x111534d70 -> 0x111509720 MT = 0x100875cd0T
And taking the handle address after the GC2 is correctly updated (lines 17281):
========== ENDGC 2 (gen = 0, collect_classes = 0) ===========}
IndexItemCharacter 4: : True
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x111509720
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x111509720
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x111509720
GetOrCreateChildNode 1: : True
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x111509720
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x111509720
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x111509720
In the BEGINGC 3 block (line 182811):
GCHeap::Promote: Promote GC Root *0x101732848 = 0x111509720 MT = 0x100875cd0T
and relocated (line 184051):
GC Root 0x101732848 RELOCATED 0x111509720 -> 0x1114fa570 MT = 0x100875cd0T
One thing that I'm noticing is that by the end of the 3rd GC, even now with a fflush(stdout) at the end of GC, our tracing of True/False seems to report before the GetThreadStaticStorageForModule calls.
========== ENDGC 3 (gen = 1, collect_classes = 0) ===========}
DestroyHandle: *0x101732b68->0x0
DestroyHandle: *0x101732b78->0x0
PEM.InsertEntry 1: : True
DestroyHandle: *0x101732b80->0x0
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
DestroyHandle: *0x101732b88->0x0
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x1114fa570
DestroyHandle: *0x101732b90->0x0
DestroyHandle: *0x101732ba0->0x0
DestroyHandle: *0x101732ba8->0x0
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
DestroyHandle: *0x101732bb0->0x0
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x1114fa570
DestroyHandle: *0x101732bb8->0x0
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x1114fa570
DestroyHandle: *0x101732bc0->0x0
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x1114fa570
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x1114fa570
PEM.InsertEntry 7: : False
DestroyHandle: *0x101732bf8->0x0
GetThreadStaticStorageForModule 100384 threadStaticsStorageHandle = 0x101732848
DestroyHandle: *0x101732b70->0x0
GetThreadStaticStorageForModule 100384 *threadStaticsStorageHandle = 0x1114fa570
DestroyHandle: *0x101732db8->0x0
Could the presence of our console.writeline call AFTER the GC indicate that a 'buffered' (and by this time, old) value of the thread static storage is present in the stack somehow, and that it's this value that doesn't get updated, leading to the error?
The PEM.AddNodeItemEntry 4: : True
calls are just stuff like
Console.WriteLine("PEM.AddNodeItemEntry 4: : " + (Thread.CurrentThread.ManagedThreadId == Environment.ManagedThreadId));
Console.Out.Flush();
My last idea there can be safely disregarded. I added a breakpoint for this build at
WKS::GCHeap::GarbageCollectGeneration
and the 3rd GC's stacktrace shows it's simply triggering GC during the flush:
* frame #0: 0x000000010005eb80 sfbackend`WKS::GCHeap::GarbageCollectGeneration(this=0x0000000000000000, gen=0, reason=reason_alloc_soh) at gc.cpp:35071
frame #1: 0x00000001000647d0 sfbackend`WKS::gc_heap::try_allocate_more_space(acontext=0x0000000102001050, size=152, gen_number=0) at gc.cpp:13074
frame #2: 0x0000000100064973 sfbackend`WKS::gc_heap::allocate_more_space(acontext=0x0000000102001050, size=152, alloc_generation_number=0) at gc.cpp:13370
frame #3: 0x00000001000e6ef4 sfbackend`WKS::GCHeap::Alloc(gc_alloc_context*, unsigned long, unsigned int) at gc.cpp:13401
frame #4: 0x00000001000e6dc0 sfbackend`WKS::GCHeap::Alloc(this=0x0000000102000140, context=0x0000000102001050, size=152, flags=0) at gc.cpp:34542
frame #5: 0x00000001000167b3 sfbackend`::RhpGcAlloc(pEEType=0x00000001007fd968, uFlags=0, cbSize=152, pTransitionFrame=0x00007fff5fbfdd20) at gcrhenv.cpp:273
frame #6: 0x0000000100111f83 sfbackend`RhpNewArrayRare at AllocFast.S:308
frame #7: 0x00000001001ad77d sfbackend`String__FastAllocateString + 69
frame #8: 0x00000001001b7c88 sfbackend`String__Concat_5 + 232
frame #9: 0x00000001003205f1 sfbackend`S_P_CoreLib_System_Text_UTF8Encoding__GetBytes_3 + 5065
frame #10: 0x0000000100307578 sfbackend`S_P_CoreLib_System_Text_EncoderNLS__GetBytes_0 + 404
frame #11: 0x00000001003073c3 sfbackend`S_P_CoreLib_System_Text_EncoderNLS__GetBytes + 863
frame #12: 0x00000001002afe44 sfbackend`S_P_CoreLib_System_IO_StreamWriter__Flush_0 + 468
frame #13: 0x00000001002afc68 sfbackend`S_P_CoreLib_System_IO_StreamWriter__Flush + 44
frame #14: 0x000000010017eac8 sfbackend`System_Console_System_IO_SyncTextWriter__Flush + 88
@jkotas Ugh I'm sorry for not being more helpful.
*00000218D2746B00*, pMT=00007FF62C0E6491, newly marked: 1 // pMT is __Array<Object>::'vftable'
Printing the pMT works fine with your code - but In my lldb session, how would I resolve the pMT to __Array
Maybe I'm missing some symbols or lacking some basic knowledge of lldb debugging.
Try image lookup --address 0x00007FF62C0E6491
Cool. So here are some results based on that.
1st GC:
*0x10300e000*, pMT=0x100875cd1, newly marked: 1 //pMT is vtable for __Array<Object> + 1
*0x1030206d0*, pMT=0x10089a4d9, newly marked: 1 //pMT is __GCStaticEEType_001 + 1
*0x103020720*, pMT=0x10081ea19, newly marked: 1 //pMT is vtable for S_P_TypeLoader_System_Collections_Generic_LowLevelDictionary_2<S_P_CoreLib_System_Runtime_TypeManagerHandle__S_P_TypeLoader_Internal_NativeFormat_NativeReader> + 1
*0x103020740*, pMT=0x10087cb49, newly marked: 1 //pMT is vtable for __Array<S_P_TypeLoader_System_Collections_Generic_LowLevelDictionary_2_Entry<S_P_CoreLib_System_Runtime_TypeManagerHandle__S_P_TypeLoader_Internal_NativeFormat_NativeReader>> + 1
*0x103020830*, pMT=0x100833979, newly marked: 1 //pMT is S_P_TypeLoader_System_Collections_Generic_LowLevelDictionary_2_Entry<S_P_CoreLib_System_Runtime_TypeManagerHandle__S_P_TypeLoader_Internal_NativeFormat_NativeReader> + 1
*0x1030207f8*, pMT=0x10080af69, newly marked: 0 //pMT is vtable for S_P_TypeLoader_Internal_NativeFormat_NativeReader + 1
*0x103007f38*, pMT=0x10089a479, newly marked: 1 //pMT is __GCStaticEEType_01 + 1
*0x103007f80*, pMT=0x10080d369, newly marked: 1 //pMT is vtable for S_P_CoreLib_System_Threading_ManagedThreadId + 1
*0x10300fd28*, pMT=0x10089a501, newly marked: 1 //pMT is __GCStaticEEType_011 + 1
*0x10300fd78*, pMT=0x10089a479, newly marked: 1 //pMT is __GCStaticEEType_01 + 1
*0x10300fd90*, pMT=0x100813161, newly marked: 1 //pMT is vtable for S_P_CoreLib_System_Text_StringBuilder + 1
*0x103019d00*, pMT=0x1008772a1, newly marked: 1 //pMT is vtable for __Array<Char> + 1
*0x10300e0c8*, pMT=0x10089a441, newly marked: 1 //pMT is __GCStaticEEType_0 + 1
Between 1st and 2nd GC, the handle gets destroyed and recreated:
SetThreadStaticStorageForModule 142287: calls RhpHandleAlloc. Handle = 0x1015a2850, pStorage = 0x10302a008
SetThreadStaticStorageForModule 142287: calls RhHandleFree on 0x1015a2858
DestroyHandle: *0x1015a2858->0x10300e000
GetThreadStaticStorageForModule 142287 threadStaticsStorageHandle = 0x1015a2850
GetThreadStaticStorageForModule 142287 *threadStaticsStorageHandle = 0x10302a008
And recreated again:
CreateHandle: 0x1015a2848, type=2
SetThreadStaticStorageForModule 142287: calls RhpHandleAlloc. Handle = 0x1015a2848, pStorage = 0x10305ed98
SetThreadStaticStorageForModule 142287: calls RhHandleFree on 0x1015a2850
DestroyHandle: *0x1015a2850->0x10302a008
GetThreadStaticStorageForModule 142287 threadStaticsStorageHandle = 0x1015a2848
GetThreadStaticStorageForModule 142287 *threadStaticsStorageHandle = 0x10305ed98
Growing heap_segment: Ix high address: Ix
New commit: IxGetThreadStaticStorageForModule 142287 threadStaticsStorageHandle = 0x1015a2848
GetThreadStaticStorageForModule 142287 *threadStaticsStorageHandle = 0x10305ed98
2nd GC:
*0x10305ed98*, pMT=0x100875cd1, newly marked: 1 //pMT is vtable for __Array<Object> + 1
*0x103056348*, pMT=0x10089a479, newly marked: 1 //pMT is __GCStaticEEType_01 + 1
*0x103056360*, pMT=0x1007f9501, newly marked: 0 //pMT is vtable for System_Threading_Thread_System_Threading_Thread + 1
*0x1030507a0*, pMT=0x10089a479, newly marked: 1 //pMT is __GCStaticEEType_01 + 1
*0x103053550*, pMT=0x10089a441, newly marked: 1 //pMT is __GCStaticEEType_0 + 1
*0x103044698*, pMT=0x10089a489, newly marked: 1 //pMT is __GCStaticEEType_000 + 1
*0x10302a0d8*, pMT=0x10089a479, newly marked: 1 //pMT is __GCStaticEEType_01 + 1
*0x10302a118*, pMT=0x100819909, newly marked: 0 //pMT is vtable for S_P_CoreLib_Internal_Runtime_Augments_RuntimeThread + 1
*0x1030611a0*, pMT=0x10089a479, newly marked: 1 //pMT is __GCStaticEEType_01 + 1
*0x1030622b8*, pMT=0x1008802c1, newly marked: 1 //pMT is vtable for __Array<__Array<UInt8>> + 1
*0x10312bae8*, pMT=0x100877161, newly marked: 1 //pMT is vtable for __Array<UInt8> + 1
*0x103062088*, pMT=0x100877161, newly marked: 1 //pMT is vtable for __Array<UInt8> + 1
*0x103061338*, pMT=0x100877161, newly marked: 1 //pMT is vtable for __Array<UInt8> + 1
*0x10305ee78*, pMT=0x10089a479, newly marked: 1 //pMT is __GCStaticEEType_01 + 1
*0x103126148*, pMT=0x100880381, newly marked: 1 //pMT is vtable for __Array<__Array<Char>> + 1
*0x10305f010*, pMT=0x1008772a1, newly marked: 1 //pMT is vtable for __Array<Char> + 1
GCHeap::Promote: Promote GC Root *0x1015a2848 = 0x10305ed98 MT = 0x100875cd0T
So it looks like, even though our problem first comes in 3rd GC (because that's when it gets relocated), already here the ManagedThreadId
does not get marked, if I'm reading this correctly.
already here the ManagedThreadId does not get marked,
I think that is because of the 2nd GC is gen0 GC. ManagedThreadId was promoted from gen0 to gen1 during the 1st GC. The 2nd GC is gen0 GC and so it does not scan it. The 3rd GC is gen1 GC - does it scan it?
Oops. Sent out the previous too soon. Here it is with all 3 GC's... Looks like it sets newly marked = 0 for the 3rd GC for ManagedThreadId.
CreateHandle: 0x1015a2860, type=2
SetThreadStaticStorageForModule 177574: calls RhpHandleAlloc. Handle = 0x1015a2860, pStorage = 0x102007ee0
AllocateThreadStaticStorageForType: typeTlsIndex=0x7MT=0x10089a478
ManagedThreadId.MakeForCurrentThread called
GetThreadStaticStorageForModule 177574 threadStaticsStorageHandle = 0x1015a2860
GetThreadStaticStorageForModule 177574 *threadStaticsStorageHandle = 0x102007ee0
CreateHandle: 0x1015a2858, type=2
SetThreadStaticStorageForModule 177574: calls RhpHandleAlloc. Handle = 0x1015a2858, pStorage = 0x10200e000
SetThreadStaticStorageForModule 177574: calls RhHandleFree on 0x1015a2860
DestroyHandle: *0x1015a2860->0x102007ee0
AllocateThreadStaticStorageForType: typeTlsIndex=0x15MT=0x10089a440
GetThreadStaticStorageForModule 177574 threadStaticsStorageHandle = 0x1015a2858
GetThreadStaticStorageForModule 177574 *threadStaticsStorageHandle = 0x10200e000
==
GC1:
*0x10200e000*, pMT=0x100875cd1, newly marked: 1 //vtable for __Array<Object> + 1
*0x1020206d0*, pMT=0x10089a4d9, newly marked: 1 // __GCStaticEEType_001 + 1
*0x102020720*, pMT=0x10081ea19, newly marked: 1 //vtable for S_P_TypeLoader_System_Collections_Generic_LowLevelDictionary_2<S_P_CoreLib_System_Runtime_TypeManagerHandle__S_P_TypeLoader_Internal_NativeFormat_NativeReader> + 1
*0x102020740*, pMT=0x10087cb49, newly marked: 1
*0x102020830*, pMT=0x100833979, newly marked: 1
*0x1020207f8*, pMT=0x10080af69, newly marked: 0
*0x102007f38*, pMT=0x10089a479, newly marked: 1 //__GCStaticEEType_01 + 1
*0x102007f80*, pMT=0x10080d369, newly marked: 1 //vtable for S_P_CoreLib_System_Threading_ManagedThreadId + 1
*0x10200fd28*, pMT=0x10089a501, newly marked: 1
*0x10200fd78*, pMT=0x10089a479, newly marked: 1
*0x10200fd90*, pMT=0x100813161, newly marked: 1
*0x102019d00*, pMT=0x1008772a1, newly marked: 1
*0x10200e0c8*, pMT=0x10089a441, newly marked: 1
GCHeap::Promote: Promote GC Root *0x1015a2858 = 0x10200e000 MT = 0x100875cd0T
==
CreateHandle: 0x1015a2850, type=2
SetThreadStaticStorageForModule 177574: calls RhpHandleAlloc. Handle = 0x1015a2850, pStorage = 0x10202a008
SetThreadStaticStorageForModule 177574: calls RhHandleFree on 0x1015a2858
DestroyHandle: *0x1015a2858->0x10200e000
AllocateThreadStaticStorageForType: typeTlsIndex=0x16MT=0x10089a478
GetThreadStaticStorageForModule 177574 threadStaticsStorageHandle = 0x1015a2850
GetThreadStaticStorageForModule 177574 *threadStaticsStorageHandle = 0x10202a008
CreateHandle: 0x1015a2848, type=2
SetThreadStaticStorageForModule 177574: calls RhpHandleAlloc. Handle = 0x1015a2848, pStorage = 0x10205ed98
SetThreadStaticStorageForModule 177574: calls RhHandleFree on 0x1015a2850
DestroyHandle: *0x1015a2850->0x10202a008
AllocateThreadStaticStorageForType: typeTlsIndex=0x18MT=0x10089a478
GetThreadStaticStorageForModule 177574 threadStaticsStorageHandle = 0x1015a2848
GetThreadStaticStorageForModule 177574 *threadStaticsStorageHandle = 0x10205ed98
==
GC2:
GcScanHandles (Promotion Phase = 1)
*0x10205ed98*, pMT=0x100875cd1, newly marked: 1 // vtable for __Array<Object> + 1
*0x102056348*, pMT=0x10089a479, newly marked: 1 // __GCStaticEEType_01 + 1
*0x102056360*, pMT=0x1007f94c1, newly marked: 0 // vtable for System_Threading_Thread_System_Threading_Thread + 1
*0x1020507a0*, pMT=0x10089a479, newly marked: 1 //__GCStaticEEType_01 + 1
*0x102053550*, pMT=0x10089a441, newly marked: 1 //__GCStaticEEType_0 + 1
*0x102044698*, pMT=0x10089a489, newly marked: 1 //__GCStaticEEType_000 + 1
*0x10202a0d8*, pMT=0x10089a479, newly marked: 1 //__GCStaticEEType_01 + 1
*0x10202a118*, pMT=0x100819909, newly marked: 0 //vtable for S_P_CoreLib_Internal_Runtime_Augments_RuntimeThread + 1
*0x1020611a0*, pMT=0x10089a479, newly marked: 1 //__GCStaticEEType_01 + 1
*0x1020622b8*, pMT=0x1008802c1, newly marked: 1 //vtable for __Array<__Array<UInt8>> + 1
*0x10212bae8*, pMT=0x100877201, newly marked: 1 //vtable for __Array<UInt8> + 1
*0x102062088*, pMT=0x100877201, newly marked: 1 //vtable for __Array<UInt8> + 1
*0x102061338*, pMT=0x100877201, newly marked: 1 //vtable for __Array<UInt8> + 1
*0x10205ee78*, pMT=0x10089a479, newly marked: 1 //__GCStaticEEType_01 + 1
*0x102126148*, pMT=0x100880381, newly marked: 1 //vtable for __Array<__Array<Char>> + 1
*0x10205f010*, pMT=0x1008772a1, newly marked: 1 //vtable for __Array<Char> + 1
GCHeap::Promote: Promote GC Root *0x1015a2848 = 0x10205ed98 MT = 0x100875cd0T
===
GC3:
*0x1020336b0*, pMT=0x100875cd1, newly marked: 1 //vtable for __Array<Object> + 1
*0x102032198*, pMT=0x10089a479, newly marked: 1 //__GCStaticEEType_01 + 1
*0x1020321b0*, pMT=0x1007f94c1, newly marked: 0 //vtable for System_Threading_Thread_System_Threading_Thread + 1
*0x1020206d0*, pMT=0x10089a4d9, newly marked: 1 //__GCStaticEEType_001 + 1
*0x102020720*, pMT=0x10081ea19, newly marked: 1
*0x102020740*, pMT=0x10087cb49, newly marked: 1
*0x102020830*, pMT=0x100833979, newly marked: 1
*0x1020207f8*, pMT=0x10080af69, newly marked: 1
*0x102007f38*, pMT=0x10089a479, newly marked: 1 //__GCStaticEEType_01 + 1
*0x102007f80*, pMT=0x10080d369, newly marked: 0 //vtable for S_P_CoreLib_System_Threading_ManagedThreadId + 1
*0x1020302a8*, pMT=0x10089a479, newly marked: 1
*0x10200fd28*, pMT=0x10089a501, newly marked: 1
*0x102031e88*, pMT=0x10089a441, newly marked: 1
*0x10200fd78*, pMT=0x10089a479, newly marked: 1
*0x10200fd90*, pMT=0x100813161, newly marked: 1
*0x1020bf2e0*, pMT=0x1008772a1, newly marked: 1
*0x10202efc8*, pMT=0x10089a489, newly marked: 1
*0x10200e0c8*, pMT=0x10089a441, newly marked: 1
*0x102008d10*, pMT=0x10089a479, newly marked: 1
*0x102008ff0*, pMT=0x100819909, newly marked: 0
*0x102035a58*, pMT=0x10089a479, newly marked: 1
*0x1020366b0*, pMT=0x1008802c1, newly marked: 1
*0x102033790*, pMT=0x10089a479, newly marked: 1
*0x1020bc6c8*, pMT=0x100880381, newly marked: 1
GCHeap::Promote: Promote GC Root *0x1015a2848 = 0x1020336b0 MT = 0x100875cd0T
the handle gets destroyed and recreated:
I have fixed this inefficiency in #6085. (I do not expect to fix it this issue though.)
Looks like it sets newly marked = 0 for the 3rd GC for ManagedThreadId.
Is there something in log for the 3rd GC that has newly marked = 1
for this ManagedThreadId?
I rebased my branch off of newest tip including #6085 It doesn't fix the issue, but nice that it's not needed to re alloc. https://github.com/dotnet/corert/compare/master...christianscheuer:stresslog
You're right, there is a newly marked: 1
for the ManagedThreadId once in the 3rd GC :
It's during scanning of a simple method in the algorithm at 0x1003caed2
, that just calls some other virtual/nonvirtual instance methods on the same class, but then has been instrumented with code like:
Log.WriteLine("PM.SavePage 1: ", Thread.CurrentThread.ManagedThreadId == Environment.CurrentManagedThreadId);
This is the output from the 3rd GC scan of that method. This precedes the newly marked: 0
that occurs for the thread statics handle.
Scanning method 0x1003caed2K
*0x1030321b0*, pMT=0x1007f9551, newly marked: 1 // vtable for System_Threading_Thread_System_Threading_Thread + 1
*0x103008ff0*, pMT=0x100819959, newly marked: 1 // vtable for S_P_CoreLib_Internal_Runtime_Augments_RuntimeThread + 1
*0x103007f80*, pMT=0x10080d3b9, newly marked: 1 // vtable for S_P_CoreLib_System_Threading_ManagedThreadId + 1
*0x103009098*, pMT=0x10080d471, newly marked: 1 // vtable for S_P_CoreLib_System_Threading_Lock + 1
*0x103009060*, pMT=0x1008750d1, newly marked: 1 // vtable for __Array<S_P_CoreLib_Microsoft_Win32_SafeHandles_SafeWaitHandle> + 1
*0x103009348*, pMT=0x10080ed89, newly marked: 1 // vtable for S_P_CoreLib_System_Threading_ManualResetEvent + 1
*0x1030090c0*, pMT=0x10081a5e9, newly marked: 1 // vtable for S_P_CoreLib_System_Threading_WaitSubsystem_ThreadWaitInfo + 1
*0x103008ff0*, pMT=0x100819959, newly marked: 0 // vtable for S_P_CoreLib_Internal_Runtime_Augments_RuntimeThread + 1
*0x103009110*, pMT=0x10080d871, newly marked: 1 // vtable for S_P_CoreLib_System_Threading_LowLevelMonitor + 1
*0x103009130*, pMT=0x100879d89, newly marked: 1 // vtable for __Array<S_P_CoreLib_System_Threading_WaitSubsystem_WaitableObject> + 1
*0x103009168*, pMT=0x10087a419, newly marked: 1 // vtable for __Array<S_P_CoreLib_System_Threading_WaitSubsystem_ThreadWaitInfo_WaitedListNode> + 1
*0x1030091a0*, pMT=0x10081ca69, newly marked: 1 // vtable for S_P_CoreLib_System_Threading_WaitSubsystem_ThreadWaitInfo_WaitedListNode + 1
*0x1030091d0*, pMT=0x10081ca69, newly marked: 1 // vtable for S_P_CoreLib_System_Threading_WaitSubsystem_ThreadWaitInfo_WaitedListNode + 1
*0x103009200*, pMT=0x10081ca69, newly marked: 1 // vtable for S_P_CoreLib_System_Threading_WaitSubsystem_ThreadWaitInfo_WaitedListNode + 1
*0x103009230*, pMT=0x10081ca69, newly marked: 1 // vtable for S_P_CoreLib_System_Threading_WaitSubsystem_ThreadWaitInfo_WaitedListNode + 1
*0x1030090c0*, pMT=0x10081a5e9, newly marked: 0 // vtable for S_P_CoreLib_System_Threading_WaitSubsystem_ThreadWaitInfo + 1
*0x1030090c0*, pMT=0x10081a5e9, newly marked: 0 // vtable for S_P_CoreLib_System_Threading_WaitSubsystem_ThreadWaitInfo + 1
*0x1030090c0*, pMT=0x10081a5e9, newly marked: 0 // vtable for S_P_CoreLib_System_Threading_WaitSubsystem_ThreadWaitInfo + 1
*0x1030090c0*, pMT=0x10081a5e9, newly marked: 0 // vtable for S_P_CoreLib_System_Threading_WaitSubsystem_ThreadWaitInfo + 1
*0x103009398*, pMT=0x1007f9ad9, newly marked: 1 // vtable for S_P_CoreLib_Microsoft_Win32_SafeHandles_SafeWaitHandle + 1
Disassembly for that method: https://gist.github.com/christianscheuer/91e6a664d6cd93b3e66a6c0f6c503b29
Complete GC3 log: https://gist.github.com/christianscheuer/b88a1f74eddbe7a9f771259889c6c1fe
Everything seems to be marked correctly. I do not see any problem in the log.
export RH_HeapVerify=1
I'm not sure I'm seeing the effects of this.
This env variable is checked at runtime. It turns on extra validation in the runtime/GC (look for GetHeapVerifyLevel
to see details).
Could you please try to run the compiled program with this env variable set under debugger, and set a breakpoint at WKS::gc_heap::verify_heap
to validate that the heap verification is kicking in? It would be useful to know whether the heap verification is identifying any issues.
Also, if you are not doing it already - it is better to use Debug flavor of libRuntime for this. The Debug flavor of libRuntime has extra asserts that may help. (The other parts can be Release.)
Thanks, @jkotas, I'll try to verify that.
Am I using debug flavor of libRuntime if I build CoreRT with ./build.sh debug skiptests
and then point my IlcPath to the resulting ...corert/bin/OSX.x64.Debug/
?
Interestingly, can't seem to set a breakpoint at WKS::gc_heap::verify_heap
without externally looking upnm
and getting the address. Could indicate I don't have all debug symbols. However, I can confirm that my breakpoint at __ZN3WKS7gc_heap11verify_heapEj
does not execute.
Maybe you can see in my steps what I'm doing wrong:
./build.sh clean debug skiptests
export IlcPath=corertdir/bin/OSX.x64.Debug/
dotnet publish -r osx-x64
in my project dirpublish
folderexport RH_HeapVerify=1
lldb myexecutable
nm
)These steps look fine to me. Not sure why it does not work.
I managed to get heap verification working by overriding int EEConfig::GetHeapVerifyLevel()
to always return 1 and manually redefining the dprintf macros like this in gc.cpp
:
#ifdef dprintf
#undef dprintf
#endif
#define dprintf(loglevel, printargs) printf printargs ; printf("\n");
Also did a quick find/replace to replace all %I
(capital i) with %l
(lowercase L) since the %I
was not recognized on macOS.
This produces a host of new data that I have no idea how to parse. Where do I look for heap verification errors?
This is the complete log: https://drive.google.com/open?id=1zCIUfqZpMLQEwStpIndemGaOLp1gQgAH
FWIW, reading the memory at $rax right after the return of __GetThreadStaticBase...ManagedThreadId
gives these two different outputs, the first one is read before any GC, the last one is read right after the 3rd GC.
Before GC: mem read $rax
0x103807f38: c8 64 89 00 01 00 00 00 80 7f 80 03 01 00 00 00 ?d..............
0x103807f48: 01 00 00 00 00 00 00 00 b8 8a 81 00 01 00 00 00 ........?.......
After 3rd GC: mem read $rax
0x1038078e8: c8 64 89 00 01 00 00 00 00 79 80 03 01 00 00 00 ?d.......y......
0x1038078f8: 00 00 00 00 00 00 00 00 b8 93 80 00 01 00 00 00 ........?.......
Point in disassembly:
sfbackend`S_P_CoreLib_System_Threading_ManagedThreadId__get_Current:
0x1002858d8 <+0>: pushq %rbp
0x1002858d9 <+1>: subq $0x10, %rsp
0x1002858dd <+5>: leaq 0x10(%rsp), %rbp
0x1002858e2 <+10>: leaq -0x10(%rbp), %rdi
0x1002858e6 <+14>: movl $0x4, %ecx
0x1002858eb <+19>: xorl %eax, %eax
0x1002858ed <+21>: rep
0x1002858ee <+22>: stosl %eax, %es:(%rdi)
0x1002858ef <+23>: nop
0x1002858f0 <+24>: callq 0x10000a578 ; __GetThreadStaticBase_S_P_CoreLib_System_Threading_ManagedThreadId
-> 0x1002858f5 <+29>: movl 0x10(%rax), %eax
0x1002858f8 <+32>: movl %eax, -0x4(%rbp)
0x1002858fb <+35>: cmpl $0x0, -0x4(%rbp)
0x1002858ff <+39>: sete %al
0x100285902 <+42>: movzbl %al, %eax
0x100285905 <+45>: movl %eax, -0x8(%rbp)
0x100285908 <+48>: cmpl $0x0, -0x8(%rbp)
0x10028590c <+52>: je 0x10028591f ; <+71>
0x10028590e <+54>: callq 0x1002859c4 ; S_P_CoreLib_System_Threading_ManagedThreadId__MakeForCurrentThread
0x100285913 <+59>: movl %eax, -0x10(%rbp)
0x100285916 <+62>: movl -0x10(%rbp), %eax
0x100285919 <+65>: movl %eax, -0xc(%rbp)
0x10028591c <+68>: nop
0x10028591d <+69>: jmp 0x100285928 ; <+80>
0x10028591f <+71>: movl -0x4(%rbp), %eax
0x100285922 <+74>: movl %eax, -0xc(%rbp)
0x100285925 <+77>: nop
0x100285926 <+78>: jmp 0x100285928 ; <+80>
0x100285928 <+80>: movl -0xc(%rbp), %eax
0x10028592b <+83>: leaq (%rbp), %rsp
0x10028592f <+87>: popq %rbp
0x100285930 <+88>: retq
0x100285931 <+89>: nop
0x100285932 <+90>: nop
0x100285933 <+91>: nop
I'm starting to think it's not GC relocations, but the hijacking/restoring of stack and registers that's the problem.
I started to look back at the log. After we made sure everything's flushed, I noticed there's a single True evaluation after each GC, and only the first one after that returns False. see here: https://gist.github.com/christianscheuer/a8e8123d454dccb1c24766054b44f13a
I also noticed some weird debugging behavior that seems important. When after the 3rd GC in the first call to S_P_CoreLib_System_Threading_ManagedThreadId__get_Current
stepping into the call at +24 and keep going instruction by instruction, all of a sudden I find myself returning at +54, which is when it then calls MakeForCurrentThread
.
If, however, I manually set a breakpoint at +29 and just let it continue instead of stepping into the +24 call, I can sometimes bypass the problem, keeping all values intact, even after the 3rd GC and after several calls. It can also happen that I don't return at all into the function, even after putting a breakpoint at +29 and hitting continue.
S_P_CoreLib_System_Threading_ManagedThreadId__get_Current:
0x1002858d8 <+0>: pushq %rbp
0x1002858d9 <+1>: subq $0x10, %rsp
0x1002858dd <+5>: leaq 0x10(%rsp), %rbp
0x1002858e2 <+10>: leaq -0x10(%rbp), %rdi
0x1002858e6 <+14>: movl $0x4, %ecx
0x1002858eb <+19>: xorl %eax, %eax
0x1002858ed <+21>: rep
0x1002858ee <+22>: stosl %eax, %es:(%rdi)
0x1002858ef <+23>: nop
0x1002858f0 <+24>: callq 0x10000a578 ; __GetThreadStaticBase_S_P_CoreLib_System_Threading_ManagedThreadId
0x1002858f5 <+29>: movl 0x10(%rax), %eax
0x1002858f8 <+32>: movl %eax, -0x4(%rbp)
0x1002858fb <+35>: cmpl $0x0, -0x4(%rbp)
0x1002858ff <+39>: sete %al
0x100285902 <+42>: movzbl %al, %eax
0x100285905 <+45>: movl %eax, -0x8(%rbp)
0x100285908 <+48>: cmpl $0x0, -0x8(%rbp)
0x10028590c <+52>: je 0x10028591f ; <+71>
0x10028590e <+54>: callq 0x1002859c4 ; S_P_CoreLib_System_Threading_ManagedThreadId__MakeForCurrentThread
0x100285913 <+59>: movl %eax, -0x10(%rbp)
0x100285916 <+62>: movl -0x10(%rbp), %eax
0x100285919 <+65>: movl %eax, -0xc(%rbp)
0x10028591c <+68>: nop
0x10028591d <+69>: jmp 0x100285928 ; <+80>
0x10028591f <+71>: movl -0x4(%rbp), %eax
0x100285922 <+74>: movl %eax, -0xc(%rbp)
0x100285925 <+77>: nop
0x100285926 <+78>: jmp 0x100285928 ; <+80>
0x100285928 <+80>: movl -0xc(%rbp), %eax
0x10028592b <+83>: leaq (%rbp), %rsp
0x10028592f <+87>: popq %rbp
0x100285930 <+88>: retq
0x100285931 <+89>: nop
0x100285932 <+90>: nop
0x100285933 <+91>: nop
Well what I just described I am now unable to reproduce. Must have been a side effect of something else. Close to giving up here :/
Okay, since our handle is now never changed, I was able to debug further.
By assigning the handle to $handle
I could add display hooks on the following two expressions:
(void**)(((int64_t*) (*(void**)$handle) )[9]+0x10)
*(void**)(((int64_t*) (*(void**)$handle) )[9]+0x10)
This gives me the address of and the value of the thread id as stored inside the static base. The [9] is accessing the object storage[7] (it starts with MT and count), 7 is the typeTlsIndex
for ManagedThreadId
and then I add the 0x10 to get to the managed thread id Int64 value.
So at all times when stopping in the code I can now verify if the handle->object[]->int works.
By then setting a watchpoint on the handle, eg. every time somebody updates the handle's value, I can check to see if after updating the handle, the two results will still evaluate correctly.
Using this approach I'm able to completely verify that during a succesful GC, inside of the plan_phase it gets changed (in Relocate), and it works again after. During an unsuccesful GC, after the plan_phase it will point to an object with 0x0 in its value, eg. the expressions will evaluate to (where normally the second would be a 0x1):
- Hook 5 (expr -- (void**)(((int64_t*) (*(void**)0x11a932860) )[9]+0x10))
(void **) $111 = 0x00000001028078f8
- Hook 6 (expr -- *(void**)(((int64_t*) (*(void**)0x11a932860) )[9]+0x10))
(void *) $112 = 0x0000000000000000
The value at +0x0 instead of +0x10, which normally is __GCStaticEEType_01, is still __GCStaticEEType_01 when it fails. It seems it's just nulled out the actual int value. So could it be that it has an incorrect size for the static EE type and hence doesn't get the int copied with it?
It seems this is actually the answer. By artificially increasing the size of the GCStaticEETypeNode by one pointer, the problem goes completely away.
In https://github.com/dotnet/corert/blob/master/src/ILCompiler.Compiler/src/Compiler/DependencyAnalysis/GCStaticEETypeNode.cs#L73 changing + 1 to + 2 solves it:
int totalSize = (_gcMap.Size + 2) * _target.PointerSize;
https://github.com/dotnet/corert/compare/master...christianscheuer:issue-6041
I don't know if that's the right solution of course, but it does get rid of the issue.
Yes, this is the bug! I have submitted proper fix. Thanks a lot for tracing this down! Well done.
Yay, and I can confirm your fix works here! :) Thanks for all the help, was a nice way to get to know the project a little deeper.
I'm getting a
The write lock is being released without being held
exception when exiting write mode on aReaderWriterLockSlim
on macOS.Debugging this issue has led me to conclude, that apparently my call to
ExitWriteLock
happens on a different thread than where I started, albeit in a weird way.When I debug it,
Thread.CurrentThread.ManagedThreadId
will indicate I'm still on thread 1 where I acquired the lock, butEnvironment.ManagedThreadId
which is whatReaderWriterLockSlim
uses, indicates I'm on thread 6. This does not seem to happen on CoreCLR.The code paths involved do not include any calls to Thread.Start or asynchronous Task code. It really all should be happening on the same thread (which
Thread.CurrentThread.ManagedThreadId
also seems to indicate is true). The rest of the app does have asynchronous Task code though.Unfortunately, I haven't yet been able to make a repro case. It does seem to make a difference though if GC is invoked between the calls to EnterWriteLock and ExitWriteLock. Any ideas as how to better debug this?
This is basically what I'm doing, but I don't think this helps much.