BruceForstall closed this issue 4 years ago
FYI. @fiigii, @CarolEidt for the HWIntrinsic failures...
The HardwareIntrinsics failures should be resolved by https://github.com/dotnet/coreclr/pull/19141
I was trying to debug this locally and I get: Consistency check failed: Crst Level violation: Can't take level 9 lock CrstReJITSharedDomainTable because you already holding level 3 lock CrstGCCover
I would guess that either my machine is configured differently or I am doing something fundamentally wrong, given that this doesn't happen on the CI machines -- Is there, potentially, any difference that results from building the tests on the Linux box itself (rather than restoring from a zip)?
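To make the Crst message above easier to read, here is a minimal sketch of a lock-level check. It is a simplified model, not the runtime's implementation: it assumes a thread may only take a lock whose level is strictly lower than every lock it already holds, which is what the message implies. The names LevelChecker and LevelViolation are hypothetical.

```python
class LevelViolation(Exception):
    """Raised when a lock is requested out of level order."""


class LevelChecker:
    """Tracks the locks held by one thread (simplified, hypothetical model)."""

    def __init__(self):
        self.held = []  # list of (level, name) pairs currently held

    def acquire(self, level, name):
        # Assumption: a new lock must have a strictly lower level than
        # every lock that is already held.
        for held_level, held_name in self.held:
            if level >= held_level:
                raise LevelViolation(
                    f"can't take level {level} lock {name} while holding "
                    f"level {held_level} lock {held_name}"
                )
        self.held.append((level, name))

    def release(self, name):
        self.held = [(lvl, n) for lvl, n in self.held if n != name]


# Reproduce the ordering reported in the consistency check.
checker = LevelChecker()
checker.acquire(3, "CrstGCCover")
try:
    checker.acquire(9, "CrstReJITSharedDomainTable")
    violation = None
except LevelViolation as exc:
    violation = str(exc)
```

Under this model, taking the level 9 lock first and the level 3 lock second is accepted, while the order seen in the failure is rejected.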
However, it looks like the Ubuntu failures are actually caused by Error: Handle is not initialized.
This seems to be a fairly sporadic failure. For example:
JIT/HardwareIntrinsics/X86/Fma_Vector256/Fma_r/Fma_r.sh
BEGIN EXECUTION
/mnt/j/workspace/dotnet_coreclr/master/jitstress/x64_checked_ubuntu_gcstress0xc_tst/bin/tests/Linux.x64.Checked/Tests/Core_Root/corerun Fma_r.exe
Running MultiplyAdd.Double test...
Running MultiplyAdd.Single test...
Error: Handle is not initialized.
Running MultiplyAddNegated.Double test...
Running MultiplyAddNegated.Single test...
Running MultiplyAddSubtract.Double test...
Running MultiplyAddSubtract.Single test...
Running MultiplySubtract.Double test...
Error: Handle is not initialized.
Running MultiplySubtract.Single test...
Running MultiplySubtractAdd.Double test...
Running MultiplySubtractAdd.Single test...
Running MultiplySubtractNegated.Double test...
Running MultiplySubtractNegated.Single test...
Expected: 100
Actual: 0
END EXECUTION - FAILED
The following tests all use the same template, SimpleTernOpTest.template:
MultiplyAdd.Double
MultiplyAdd.Single
MultiplyAddNegated.Double
MultiplyAddNegated.Single
MultiplySubtract.Double
MultiplySubtract.Single
MultiplySubtractNegated.Double
MultiplySubtractNegated.Single
However, only two (MultiplyAdd.Single and MultiplySubtract.Double) actually fail with the assert.
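The mapping from error lines to test names in the log can be sketched as follows. This is an illustrative helper, not part of the test harness: it assumes each "Error:" line belongs to the most recent "Running ... test..." line, and the function name failing_tests is hypothetical.

```python
import re

# Test output copied from the Fma_r run quoted in this thread.
LOG = """\
Running MultiplyAdd.Double test...
Running MultiplyAdd.Single test...
Error: Handle is not initialized.
Running MultiplyAddNegated.Double test...
Running MultiplyAddNegated.Single test...
Running MultiplyAddSubtract.Double test...
Running MultiplyAddSubtract.Single test...
Running MultiplySubtract.Double test...
Error: Handle is not initialized.
Running MultiplySubtract.Single test...
Running MultiplySubtractAdd.Double test...
Running MultiplySubtractAdd.Single test...
Running MultiplySubtractNegated.Double test...
Running MultiplySubtractNegated.Single test...
"""

RUNNING = re.compile(r"Running (\S+) test\.\.\.")


def failing_tests(log_text):
    """Return (test name, error message) pairs found in the log."""
    failures = []
    current = None  # name of the test most recently announced
    for raw in log_text.splitlines():
        line = raw.strip()
        match = RUNNING.match(line)
        if match:
            current = match.group(1)
        elif line.startswith("Error:") and current is not None:
            # Assumption: the error was printed by the test announced last.
            failures.append((current, line[len("Error:"):].strip()))
    return failures


FAILED = failing_tests(LOG)
for name, message in FAILED:
    print(f"{name}: {message}")
```

On the log above this reports MultiplyAdd.Single and MultiplySubtract.Double, matching the two tests named here.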
That particular error message is only thrown by GCHandle if the handle was never initialized or if it was freed.
However, the GCHandles are created/pinned once in the constructor and are only freed on Dispose.
Given that these aren't also failing on Windows, however, I wouldn't think there is a problem in the managed code.
FYI, I see that Crst failure in one lab run: https://ci.dot.net/job/dotnet_coreclr/job/master/view/x64/job/jitstress/job/x64_checked_ubuntu_gcstress0xc_flow/95/
@jkotas @kouvel Who from the VM side should be responsible for making GCStress "clean"?
I do not think we have any catch-all person for GCStress. Since the crash is about CrstReJITSharedDomainTable, it should go to the folks who are hacking on ReJIT/tiered JIT.
Seems like pretty much all GC stress is broken now... I'll dig in, but @noahfalk @kouvel can you help?
On basic x86 GCStress=0xc: https://ci.dot.net/job/dotnet_coreclr/job/master/job/jitstress/job/x86_checked_windows_nt_gcstress0xc/
From the list of commits that contributed to that run, this looks suspiciously related: https://github.com/dotnet/coreclr/pull/19054
Wonder if we should have a gc stress smoketest in the innerloop CI? Given that it requires a special package restore and also exercises otherwise uncovered paths in the runtime...
Seems like a great idea. (Care to open an issue?)
dotnet/coreclr#19411
None of these tests are failing in the latest run: https://ci.dot.net/job/dotnet_coreclr/job/master/job/jitstress/job/x64_checked_ubuntu_gcstress0xc_tst/102/
I assume existing failures are (or will be) tracked by other issues.
https://ci.dot.net/job/dotnet_coreclr/job/master/view/x64/job/jitstress/job/x64_checked_ubuntu_gcstress0xc_flow/91/