Closed carlossanlop closed 2 days ago
Tagging subscribers to this area: @dotnet/interop-contrib
See info in area-owners.md if you want to be subscribed.
Author: | carlossanlop |
---|---|
Assignees: | - |
Labels: | `area-System.Runtime.InteropServices`, `blocking-clean-ci`, `runtime-coreclr`, `source-generator`, `test-failure`, `Known Build Error` |
Milestone: | - |
@jkoritzinsky and @jtschuster This seems a bit odd. The generator is deterministic, so why aren't we seeing this more regularly or in the recent CI? Perhaps this is the non-determinism for A/V, but I'm surprised we haven't seen this before.
The unit tests are all managed code I think. Was this something with us, or something with the runtime?
> Was this something with us, or something with the runtime?

That is the question. My initial guess here would be we are generating something bad.
Generating something bad shouldn't cause a segfault unless we run the code, right? And I don't think we run generated code in the unit tests.
> And I don't think we run generated code in the unit tests.

Ah. I thought we run some of that code. Okay.
Yeah, we don't run any generated code in the unit tests. We only generate the code and then use the Roslyn APIs to inspect it. We only run the code in the "integration" tests (i.e., LibraryImportGenerator.Tests).
I am seeing this failure also affecting ComInterfaceGenerator.Unit.Tests, and in release/8.0. Could it be the same root cause?
Libraries Test Run release coreclr osx x64 Debug
Console log: 'ComInterfaceGenerator.Unit.Tests' from job 66299251-1f7b-4287-a196-733ff30cd7c3 workitem 6d62ae3e-1313-4d21-958a-1f7596b99a7d (osx.1200.amd64.open) executed on machine dci-mac-build-317.local running macOS-12.4

    /private/tmp/helix/working/988A08C3/w/A73B0921/e /private/tmp/helix/working/988A08C3/w/A73B0921/e
    Discovering: ComInterfaceGenerator.Unit.Tests (method display = ClassAndMethod, method display options = None)
    Discovered: ComInterfaceGenerator.Unit.Tests (found 106 test cases)
    Starting: ComInterfaceGenerator.Unit.Tests (parallel test collections = on, max threads = 6)
    ./RunTests.sh: line 168: 74562 Segmentation fault: 11 "$RUNTIME_PATH/dotnet" exec --runtimeconfig ComInterfaceGenerator.Unit.Tests.runtimeconfig.json --depsfile ComInterfaceGenerator.Unit.Tests.deps.json xunit.console.dll ComInterfaceGenerator.Unit.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing $RSP_FILE
    /private/tmp/helix/working/988A08C3/w/A73B0921/e
    ----- end Tue Aug 22 15:14:25 PDT 2023 ----- exit code 139 ----------------------------------------------------------
    exit code 139 means SIGSEGV Illegal memory access. Deref invalid pointer, overrunning buffer, stack overflow etc. Core dumped.
    ulimit -c value: 0

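
The log's note that exit code 139 means SIGSEGV follows from the shell convention of reporting a signal death as 128 plus the signal number (139 - 128 = 11). A minimal sketch of that decoding is below; `describe_exit` is a hypothetical helper for illustration, not part of `RunTests.sh`.

```shell
#!/bin/sh
# Hypothetical helper (not part of RunTests.sh): explain a shell exit status.
describe_exit() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # Shells report "killed by signal N" as exit status 128 + N.
        signum=$((status - 128))
        # kill -l <number> prints the signal name without the SIG prefix.
        echo "killed by signal $signum (SIG$(kill -l "$signum"))"
    else
        echo "exited normally with status $status"
    fi
}

describe_exit 139
```

For 139 this reports signal 11, which matches the `Segmentation fault: 11` line in the log.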
Yes, it is possible that this is the same failure.
I've successfully got a Native AOT crash dump, with working symbols, on Linux for this. Download https://microsoft-my.sharepoint.com/:u:/p/angocke/Eaj2iJxJzItEgs8mngNfxZ0B081M0ipe2Q9ucGKiK80SFQ?e=hfCpNT for a zip with all the necessary bits
@AaronRobinsonMSFT @jtschuster Could you take a look while Jeremy's out?
@agocke The above link is for a failure in System.Numerics.Vectors.Tests. Is this really related to the LibraryImport source generator?
Oh, do those tests not use the generator? OK, this must just be catching extra stuff.
Adjusted the error message, hopefully this will catch only LibraryImportGenerator segfaults now
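
The idea of narrowing the match can be illustrated with a plain `grep` filter. This is only a sketch and not how the Known Build Error matching is implemented: the function name is hypothetical, the first sample line is shortened from the log in this issue, and the second sample line (including its process ID) is made up for contrast.

```shell
#!/bin/sh
# Hypothetical filter: succeeds only for a segfault line that also names
# the LibraryImportGenerator unit test assembly.
is_libraryimport_segfault() {
    printf '%s\n' "$1" | grep -q 'Segmentation fault.*LibraryImportGenerator\.Unit\.Tests'
}

# Shortened from the console log in this issue.
lig='./RunTests.sh: line 168: 19058 Segmentation fault: 11 "$RUNTIME_PATH/dotnet" exec --runtimeconfig LibraryImportGenerator.Unit.Tests.runtimeconfig.json'
# Made-up line for an unrelated test assembly (process ID is invented).
vec='./RunTests.sh: line 168: 12345 Segmentation fault: 11 "$RUNTIME_PATH/dotnet" exec --runtimeconfig System.Numerics.Vectors.Tests.runtimeconfig.json'

for line in "$lig" "$vec"; do
    if is_libraryimport_segfault "$line"; then
        echo "match"
    else
        echo "no match"
    fi
done
```

A pattern that only looks for `Segmentation fault` would match both lines, which is the over-matching described above.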
Removing `blocking-clean-ci` as it has not failed in 30 days.
24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
---|---|---|
0 | 0 | 0 |
I think it's worth closing this. Whatever was causing the LibraryImportGenerator failures looks more likely to be a runtime instability problem that has been resolved.
Error Blob
Reproduction Steps
main
PR: https://github.com/dotnet/runtime/pull/88280
Libraries Test Run release coreclr osx x64 Debug

    ./RunTests.sh --runtime-path /tmp/helix/working/AB2F0948/p
    ----- start Mon Jul 17 13:54:31 PDT 2023 =============== To repro directly: =====================================================
    pushd .
    /tmp/helix/working/AB2F0948/p/dotnet exec --runtimeconfig LibraryImportGenerator.Unit.Tests.runtimeconfig.json --depsfile LibraryImportGenerator.Unit.Tests.deps.json xunit.console.dll LibraryImportGenerator.Unit.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing
    popd

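
Because the crash is intermittent, a single run of the repro command may well pass. One way to chase it locally is to rerun the command until it exits with 139. The sketch below assumes a hypothetical helper, `repeat_until_segv`, and uses a stand-in command so it runs anywhere; neither is part of the Helix scripts.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command until it exits with 139 (SIGSEGV)
# or the attempt budget is used up.
repeat_until_segv() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        status=0
        "$@" || status=$?
        if [ "$status" -eq 139 ]; then
            echo "attempt $i: exit code 139 (SIGSEGV)"
            return 0
        fi
        i=$((i + 1))
    done
    echo "no segfault in $max attempts"
    return 1
}

# Stand-in for the real test command: always reports exit status 139.
fake_crash() { return 139; }

repeat_until_segv 5 fake_crash
```

To use it against the real tests, replace `fake_crash` with the `dotnet exec ...` command line from the log and raise the attempt count as needed.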

    /private/tmp/helix/working/AB2F0948/w/B0D909D1/e /private/tmp/helix/working/AB2F0948/w/B0D909D1/e
    Discovering: LibraryImportGenerator.Unit.Tests (method display = ClassAndMethod, method display options = None)
    Discovered: LibraryImportGenerator.Unit.Tests (found 183 of 188 test cases)
    Starting: LibraryImportGenerator.Unit.Tests (parallel test collections = on, max threads = 6)
    LibraryImportGenerator.UnitTests.Compiles.ValidateSnippetsWithMarshalType [SKIP]
    No current scenarios to test.
    ./RunTests.sh: line 168: 19058 Segmentation fault: 11 "$RUNTIME_PATH/dotnet" exec --runtimeconfig LibraryImportGenerator.Unit.Tests.runtimeconfig.json --depsfile LibraryImportGenerator.Unit.Tests.deps.json xunit.console.dll LibraryImportGenerator.Unit.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing $RSP_FILE
    /private/tmp/helix/working/AB2F0948/w/B0D909D1/e
    ----- end Mon Jul 17 13:55:17 PDT 2023 ----- exit code 139 ----------------------------------------------------------
    exit code 139 means SIGSEGV Illegal memory access. Deref invalid pointer, overrunning buffer, stack overflow etc. Core dumped.
    ulimit -c value: 0

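
The log ends with `ulimit -c value: 0`. A soft core-size limit of 0 normally means no core file is written, even though the generic message says "Core dumped". A small sketch for checking this before a local repro follows; `core_limit_status` is a hypothetical helper name, and it assumes a shell whose `ulimit` builtin supports `-c` (bash, dash and zsh do).

```shell
#!/bin/sh
# Hypothetical helper: report whether the current soft limit allows core files.
core_limit_status() {
    limit=$(ulimit -c)
    if [ "$limit" = "0" ]; then
        echo "core files disabled (ulimit -c is 0)"
    else
        echo "core files enabled (ulimit -c is $limit)"
    fi
}

core_limit_status

# To try raising the soft limit for this shell and its children before a
# repro run (this can be refused if the hard limit is lower):
#   ulimit -c unlimited
```

Where core files actually land, and whether a crash handler intercepts them, depends on the operating system's configuration, so treat the output as a first check only.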
Report
Summary