dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

System.GC.Collect() cannot be called in a Lua/LuaJIT callstack #109475

Open JCash opened 3 hours ago

JCash commented 3 hours ago

Description

When System.GC.Collect() is called (manually, or automatically by the runtime), it crashes in StackFrameIterator::CalculateCurrentMethodState() at StackFrameIterator.cpp:1933:9.

Reproduction Steps

The repro case is a C/C++ app in which a Lua library is implemented in C#. The app calls Lua, which in turn calls the C# implementation. This works fine. The problem occurs when System.GC.Collect() is called during one of those calls.

The C# code is compiled with NativeAOT into a static library that is linked into the executable.
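
For readers without the attachment, the call chain described above can be modelled roughly as follows. This is only a toy sketch, not the code in gc-repro.zip: `ToyState`, `toy_pcall`, `GCCollect_stub` and `g_collect_calls` are hypothetical stand-ins for the Lua state, `lua_pcall`, and the NativeAOT-compiled C# function, and no real Lua or .NET API is used. Only `RunString` and the script string `cstest.gc_collect()` are taken from the callstack.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for lua_State: a table of registered native callbacks.
struct ToyState {
    std::map<std::string, int (*)(ToyState*)> functions;
};

// Counts how often the callback ran, so the call chain can be checked.
static int g_collect_calls = 0;

// Stand-in for the C# function (GCCollect in Library.cs, frame #16).
// In the real repro this frame is NativeAOT code that calls
// System.GC.Collect(); here it only records that it was reached.
static int GCCollect_stub(ToyState*) {
    ++g_collect_calls;
    return 0;  // number of results, following the lua_CFunction convention
}

// Stand-in for lua_pcall/luaV_execute: accepts a script of the form
// "name()" and calls the registered function. Returns 0 on success,
// 1 if the script is malformed or the function is unknown.
static int toy_pcall(ToyState* L, const std::string& script) {
    const std::string suffix = "()";
    if (script.size() <= suffix.size() ||
        script.compare(script.size() - suffix.size(), suffix.size(), suffix) != 0) {
        return 1;
    }
    const std::string name = script.substr(0, script.size() - suffix.size());
    auto it = L->functions.find(name);
    if (it == L->functions.end()) {
        return 1;
    }
    it->second(L);  // interpreter frame calling into the "managed" callback
    return 0;
}

// Mirrors the role of RunString in main.cpp (frame #23): host -> interpreter.
static int RunString(ToyState* L, const char* script) {
    return toy_pcall(L, script);
}
```

Registering `GCCollect_stub` under `cstest.gc_collect` and calling `RunString(&L, "cstest.gc_collect()")` reaches the callback through the interpreter stand-in. In the real app, this is the point where foreign Lua frames sit between the managed frame and `main` when the GC walks the stack.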

Steps to reproduce (macOS):

$ ./scripts/compile_external.sh macos
$ ./scripts/compile.sh macos
$ ./build/macos/test

gc-repro.zip

(The build scripts should be easy enough to modify for Linux as well.)

Expected behavior

I expect no error to occur.

Actual behavior

The app aborts with SIGABRT.

Regression?

No response

Known Workarounds

No response

Configuration

.NET 9 RC2
Running on arm64 macOS
Lua 5.1 (also tested with LuaJIT 2.1)

Other information

Callstack:

* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x0000000185f3a0dc libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000185f71cc0 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x0000000185e7da40 libsystem_c.dylib`abort + 180
    frame #3: 0x0000000100066bd0 test`RaiseFailFastException(arg1=<unavailable>, arg2=<unavailable>, arg3=<unavailable>) at PalRedhawkUnix.cpp:90:5 [opt]
    frame #4: 0x000000010002a248 test`StackFrameIterator::CalculateCurrentMethodState(this=0x000000016fdfe4b0) at StackFrameIterator.cpp:1933:9 [opt]
    frame #5: 0x000000010002b7e0 test`Thread::GcScanRootsWorker(this=0x0000000103a03ac8, pfnEnumCallback=(test`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) at gc.cpp:49468), pvCallbackData=0x000000016fdfe7c0, frameIterator=0x000000016fdfe4b0) at thread.cpp:505:27 [opt]
    frame #6: 0x000000010002b648 test`Thread::GcScanRoots(this=0x0000000103a03ac8, pfnEnumCallback=(test`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) at gc.cpp:49468), pvCallbackData=0x000000016fdfe7c0) at thread.cpp:401:5 [opt]
    frame #7: 0x00000001000274e4 test`GCToEEInterface::GcScanRoots(fn=(test`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) at gc.cpp:49468), condemned=<unavailable>, max_gen=<unavailable>, sc=0x000000016fdfe7c0) at gcenv.ee.cpp:122:22 [opt]
    frame #8: 0x00000001000409cc test`WKS::gc_heap::mark_phase(condemned_gen_number=2) at gc.cpp:29899:9 [opt]
    frame #9: 0x000000010003dd34 test`WKS::gc_heap::gc1() at gc.cpp:22400:13 [opt]
    frame #10: 0x00000001000474ac test`WKS::gc_heap::garbage_collect(n=<unavailable>) at gc.cpp:0:21 [opt]
    frame #11: 0x0000000100039b68 test`WKS::GCHeap::GarbageCollectGeneration(this=<unavailable>, gen=2, reason=reason_induced) at gc.cpp:51057:9 [opt]
    frame #12: 0x000000010005b524 test`WKS::GCHeap::GarbageCollect(int, bool, int) [inlined] WKS::GCHeap::GarbageCollectTry(this=<unavailable>, generation=<unavailable>, low_memory_p=<unavailable>, mode=<unavailable>) at gc.cpp:50251:12 [opt]
    frame #13: 0x000000010005b518 test`WKS::GCHeap::GarbageCollect(this=<unavailable>, generation=<unavailable>, low_memory_p=false, mode=<unavailable>) at gc.cpp:50181:30 [opt]
    frame #14: 0x0000000100025b04 test`RhpCollect(uGeneration=<unavailable>, uMode=<unavailable>, lowMemoryP=0) at GCHelpers.cpp:108:35 [opt]
    frame #15: 0x00000001000dcd90 test`_S_P_CoreLib_System_Runtime_InternalCalls__RhCollect(generation=<unavailable>, mode=<unavailable>, lowMemoryP=<unavailable>) at InternalCalls.cs:65
    frame #16: 0x0000000100137694 test`_libNativeLibrary_CS__GCCollect(L=<unavailable>) at Library.cs:51
    frame #17: 0x000000010000c7f4 test`luaD_precall(L=0x0000000103303380, func=<unavailable>, nresults=<unavailable>) at ldo.c:320:9 [opt]
    frame #18: 0x000000010001fd34 test`luaV_execute(L=0x0000000103303380, nexeccalls=1) at lvm.c:591:17 [opt]
    frame #19: 0x000000010000cfcc test`luaD_call(L=0x0000000103303380, func=0x0000000127303c10, nResults=-1) at ldo.c:378:5 [opt]
    frame #20: 0x000000010000c1d0 test`luaD_rawrunprotected(L=0x0000000103303380, f=(test`f_call at lapi.c:800:19), ud=0x000000016fdfeec0) at ldo.c:116:3 [opt]
    frame #21: 0x000000010000d2e8 test`luaD_pcall(L=0x0000000103303380, func=<unavailable>, u=<unavailable>, old_top=16, ef=<unavailable>) at ldo.c:464:12 [opt]
    frame #22: 0x0000000100003da4 test`lua_pcall(L=0x0000000103303380, nargs=<unavailable>, nresults=-1, errfunc=<unavailable>) at lapi.c:821:12 [opt]
    frame #23: 0x000000010009988c test`RunString(L=0x0000000103303380, script="cstest.gc_collect()") at main.cpp:14:9 [opt]
    frame #24: 0x0000000100099ac4 test`main(argc=1, argv=0x000000016fdff220) at main.cpp:40:5 [opt]
    frame #25: 0x0000000185bf10e0 dyld`start + 2360

dotnet-policy-service[bot] commented 3 hours ago

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.