Closed RussKeldorph closed 4 years ago
Based on debugging it seems to be a VM issue that VM being not able to properly unwind when there is an AV in VSD_ResolveStub. The repro is flip-flopping in invoking an interface method on two different class objects that implement the same interface. Without gc-stress, catch handler of DoWork() method gets executed and under gc-stress it doesn't get executed.
Further debugged this with @rahku and as per him this could be an issue in EH under gc-stress. Assigning it to @rahku for further debugging.
CC @gkhanna79
EH for x86 has not changed in a long time. Is this something that repros in Desktop JIT32 runs as well?
I don't think this is related to EH. It looks like a gcstress infra issue to me. I am investigating.
@rahku Do you have any update on this? This bug is being tracked for Escrow of RTM.
There is an interface dispatch happening when the target object is NULL. This results in AV in VSD_ResolveStub. Control then transfers to CLRVectoredExceptionHandlerShim as part of exception processing. In the function IsGCMarker at https://github.com/dotnet/coreclr/blob/master/src/vm/excep.cpp#L8480 returns true indicating that we need to perform GCStress at this instruction. This is not correct. This is a real AV and not a GCStress marker. Due to this CLRVectoredExceptionHandlerShim returns EXCEPTION_CONTINUE_EXECUTION https://github.com/dotnet/coreclr/blob/master/src/vm/excep.cpp#L8517. So we hit an AV again at the same instruction. From this point on execution is incorrect and we fail the test. Without GCStress enabled IsGCMarker() does not return true for this AV and so clr processes the AV correctly and the test passes.
So therefore it is a GcStress infra issue. IsGCMarker should not be returning true for a real AV in VSD_ResolveStub. This does not repro on desktop clr through as interface dispatch for that particular instance does not happen via VSD_ResolveStub but some other stub. I am investigating why this difference between desktop & coreclr.
Which check in https://github.com/dotnet/coreclr/blob/master/src/vm/excep.cpp#L7155 is returning true?
If I remember correctly from yesterday's debugging session, I think it is the third condition that is true in failure case:
pThread->GetLastAVAddress() != (LPVOID)GetIP(pContext)
https://github.com/dotnet/coreclr/blob/master/src/vm/excep.cpp#L7159 check should have failed, resulting in false being returned since VSD_ResolveStub is in EE. Is that not the case?
VSD_ResolveStub is dynamically allocated so it is not within in the address space where coreclr is loaded and so IsIPinEE() returns false
I can also repro the issue on desktop clr. Earlier I was trying to repro on ret build where gcstress is ifdef'd out so the issues did not repro for me.
This is gcstress infra issue. IsGCMarker() fails to correctly detect if the current IP is a real AV or fake AV ( See comments here for fake AV https://github.com/dotnet/coreclr/blob/master/src/vm/excep.cpp#L7145)
The issue is not specifically related to the VSD_ResolveStub. This also repros with R2R helper stubs, and with GCStress enabled, a real AV is treated as a GC-marker, and no NullReferenceException is generated. (See dotnet/coreclr#5572 for repro).
Environment Windows x86 (JIT32) set COMPlus_GCStress=C
Output