dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[NativeAOT] linux-arm bring up #97729

Open filipnavara opened 5 months ago

filipnavara commented 5 months ago

This is a tracking issue for the known problems that need to be resolved to get working NativeAOT support on the linux-arm platform.

Known issues:

Failing runtime tests:

Other things requiring clean up:

ghost commented 5 months ago

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas. See info in area-owners.md if you want to be subscribed.

Issue Details
This is a tracking issue for the known problems that need to be resolved to get working NativeAOT support on the linux-arm platform.

Known issues:
- [ ] Inline TLS access emits incorrect code for Optimized=true, which results in a stack overflow
- [ ] Some code paths around Align8 / `FEATURE_64BIT_ALIGNMENT` may be unhandled (ref: https://github.com/dotnet/runtime/pull/97269#issuecomment-1909219996)
- [ ] Interlocked JIT tests fail

Other things requiring clean up:
- [ ] NativeAOT build integration support for linux-musl and linux-bionic on arm32
- [ ] Run the smoke tests to prevent regressions

Author: filipnavara
Assignees: -
Labels: `arch-arm32`, `area-NativeAOT-coreclr`
Milestone: -

filipnavara commented 5 months ago

State as of ba8993fa8c80a663dadd495db17ac9593bdf703b + PRs #97746, #97756 and #97757:

filipnavara commented 5 months ago

I ran the smoke tests in Release configuration. Some of them fail reliably, which makes debugging easier. Apparently we now get an incorrect answer from InWriteBarrierHelper in the SIGSEGV exception handler. I'll debug it later this week.

--

printf("%x %x %x\r\n", (uintptr_t)&RhpAssignRefAVLocation, (uintptr_t)&RhpAssignRefAVLocation & ~1, faultingIP);
// prints "4b9539 4b9539 4b9538"

Don't you just love compilers? (Technically, clang is not wrong here since RhpAssignRefAVLocation is defined as external variable, not a function)

Workaround: https://github.com/filipnavara/runtime/commit/39ae75f0166542d822d9e68236276bea1f88fcd3 Fix: https://github.com/dotnet/runtime/commit/7d25e4cc800934d891acadfdaa23ae5fb83e165e
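
The mismatch printed above comes down to the Thumb bit: on arm32 the address of a Thumb code label has its low bit set, while the faulting IP does not, and because RhpAssignRefAVLocation is declared as a variable the compiler apparently treats the `& ~1` on its address as a no-op. A minimal sketch of the intended comparison on plain integers follows. The helper names (`strip_thumb_bit`, `is_fault_at_label`) are hypothetical and not taken from the runtime sources, and the sketch only illustrates the arithmetic; it does not reproduce the clang behavior or mirror the linked fix.

```c
#include <stdint.h>

/* Hypothetical helper (not from the runtime sources): clear bit 0, which
   marks a Thumb-mode code address on arm32. */
static uintptr_t strip_thumb_bit(uintptr_t addr)
{
    return addr & ~(uintptr_t)1;
}

/* Returns 1 when the faulting instruction pointer refers to the same
   instruction as the label address, ignoring the Thumb bit on both sides. */
static int is_fault_at_label(uintptr_t label_addr, uintptr_t faulting_ip)
{
    return strip_thumb_bit(label_addr) == strip_thumb_bit(faulting_ip);
}
```

With the values printed above, `is_fault_at_label(0x4b9539, 0x4b9538)` yields 1, whereas a direct comparison of the two raw values does not match.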

NCLnclNCL commented 5 months ago

When will Windows x86 (32-bit) be supported?

michaldobrodenka commented 5 months ago

When will Windows x86 (32-bit) be supported?

Why? win-x86 is not supported on Windows 11, only on Windows 10, which will be supported for only about another year. Maybe some Windows IoT?

filipnavara commented 5 months ago

When will Windows x86 (32-bit) be supported?

Please keep this issue on topic. I am doing this in my free time and do not plan to work on a win-x86 port. There's already an open issue for that.

filipnavara commented 5 months ago

With the in-flight PRs I can get most of the smoke tests running in Release mode. There's one remaining issue with unwinding during GC in the DynamicGenerics test:

--------------------------------------------------
Debug Assertion Violation

Expression: 'm_pInstance->IsManaged(m_ControlPC) || (m_pPreviousTransitionFrame != NULL && (m_dwFlags & SkipNativeFrames) == 0)'

File: /home/navara/runtime/src/coreclr/nativeaot/Runtime/StackFrameIterator.cpp, Line: 1500
--------------------------------------------------
Process 7527 stopped
* thread #4, name = 'DynamicGenerics', stop reason = signal SIGABRT
    frame #0: 0xf7e499f4 libc.so.6`__pthread_kill_implementation(threadid=4045403008, signo=6, no_tid=<unavailable>) at pthread_kill.c:44:76
(lldb) bt
* thread #4, name = 'DynamicGenerics', stop reason = signal SIGABRT
  * frame #0: 0xf7e499f4 libc.so.6`__pthread_kill_implementation(threadid=4045403008, signo=6, no_tid=<unavailable>) at pthread_kill.c:44:76
    frame #1: 0xf7e01cfc libc.so.6`__GI_raise(sig=6) at raise.c:26:13
    frame #2: 0xf7deb0a0 libc.so.6`__GI_abort at abort.c:79:7
    frame #3: 0x0063436a DynamicGenerics`::RaiseFailFastException(arg1=0x00000000, arg2=0x00000000, arg3=1) at PalRedhawkUnix.cpp:90:5
    frame #4: 0x005d53a8 DynamicGenerics`PalRaiseFailFastException(arg1=0x00000000, arg2=0x00000000, arg3=1) at PalRedhawkFunctions.h:120:5
    frame #5: 0x005d5366 DynamicGenerics`Assert(expr=0x00560948, file=0x00563f83, line_num=1500, message=0x00000000) at rhassert.cpp:32:9
    frame #6: 0x005de012 DynamicGenerics`StackFrameIterator::NextInternal(this=0xf11fe490) at StackFrameIterator.cpp:1500:9
    frame #7: 0x005ddb18 DynamicGenerics`StackFrameIterator::Next(this=0xf11fe490) at StackFrameIterator.cpp:1300:5
    frame #8: 0x005e2da2 DynamicGenerics`Thread::GcScanRootsWorker(this=0xef3fe890, pfnEnumCallback=(DynamicGenerics`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) + 1 at gc.cpp:48747), pvCallbackData=0xf11fe7ac, frameIterator=0xf11fe490)(Object**, ScanContext*, unsigned int), ScanContext*, StackFrameIterator&) at thread.cpp:554:27
    frame #9: 0x005e2b28 DynamicGenerics`Thread::GcScanRoots(this=0xef3fe890, pfnEnumCallback=(DynamicGenerics`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) + 1 at gc.cpp:48747), pvCallbackData=0xf11fe7ac)(Object**, ScanContext*, unsigned int), ScanContext*) at thread.cpp:413:5
    frame #10: 0x005d7ee6 DynamicGenerics`GCToEEInterface::GcScanRoots(fn=(DynamicGenerics`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) + 1 at gc.cpp:48747), condemned=2, max_gen=2, sc=0xf11fe7ac)(Object**, ScanContext*, unsigned int), int, int, ScanContext*) at gcenv.ee.cpp:122:22
    frame #11: 0x0062d326 DynamicGenerics`GCScan::GcScanRoots(fn=(DynamicGenerics`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) + 1 at gc.cpp:48747), condemned=2, max_gen=2, sc=0xf11fe7ac)(Object**, ScanContext*, unsigned int), int, int, ScanContext*) at gcscan.cpp:152:5
    frame #12: 0x00602218 DynamicGenerics`WKS::gc_heap::mark_phase(condemned_gen_number=2) at gc.cpp:29214:9
    frame #13: 0x005ff22a DynamicGenerics`WKS::gc_heap::gc1() at gc.cpp:22180:13
    frame #14: 0x0060a398 DynamicGenerics`WKS::gc_heap::garbage_collect(n=2) at gc.cpp:24187:9
    frame #15: 0x005f8cd2 DynamicGenerics`WKS::GCHeap::GarbageCollectGeneration(this=0x008dd5d8, gen=2, reason=reason_induced) at gc.cpp:50291:9
    frame #16: 0x006269a6 DynamicGenerics`WKS::GCHeap::GarbageCollectTry(this=0x008dd5d8, generation=2, low_memory_p=NO, mode=2) at gc.cpp:49514:12
    frame #17: 0x0062688a DynamicGenerics`WKS::GCHeap::GarbageCollect(this=0x008dd5d8, generation=2, low_memory_p=false, mode=2) at gc.cpp:49444:30
    frame #18: 0x005d6c44 DynamicGenerics`::RhpCollect(uGeneration=4294967295, uMode=2, lowMemoryP=0) at GCHelpers.cpp:108:35
    frame #19: 0x006fbafc DynamicGenerics`System.Runtime.InternalCalls__RhCollect(generation=<unavailable>, mode=<unavailable>, lowMemoryP=<unavailable>) at InternalCalls.cs:65
    frame #20: 0x0066ac7a DynamicGenerics`DynamicGenerics_ThreadLocalStatics_TLSTesting___c__DisplayClass3_0___MultiThreaded_Test_b__0(this=0xf4e325bc) at threadstatics.cs:464
    frame #21: 0x006e21fc DynamicGenerics`System.Threading.ExecutionContext__RunFromThreadPoolDispatchLoop(threadPoolThread=0xf4e331a4, executionContext=<unavailable>, callback=<unavailable>, state=<unavailable>) at ExecutionContext.cs:264
    frame #22: 0x006e782e DynamicGenerics`System.Threading.Tasks.Task__ExecuteWithThreadLocal(this=0xf4e32660, currentTaskSlot=0xf4e336c4, threadPoolThread=<unavailable>) at Task.cs:2345
    frame #23: 0x006e4a30 DynamicGenerics`System.Threading.ThreadPoolWorkQueue__Dispatch at ThreadPoolWorkQueue.cs:913
    frame #24: 0x007299cc DynamicGenerics`System.Threading.PortableThreadPool_WorkerThread__WorkerThreadStart at PortableThreadPool.WorkerThread.NonBrowser.cs:102
    frame #25: 0x006e0aaa DynamicGenerics`System.Threading.Thread__StartThread(parameter=<unavailable>) at Thread.NativeAot.cs:448
    frame #26: 0x006e0e90 DynamicGenerics`System.Threading.Thread__ThreadEntryPoint(parameter=<unavailable>) at Thread.NativeAot.Unix.cs:114
    frame #27: 0xf7e478e0 libc.so.6`start_thread(arg=0xf11ff380) at pthread_create.c:442:8
    frame #28: 0xf7ec6a1c libc.so.6 at clone.S:74

The tests pass with DOTNET_gcConservative=1.

NCLnclNCL commented 5 months ago

When will Windows x86 (32-bit) be supported?

Why? win-x86 is not supported on Windows 11, only on Windows 10, which will be supported for only about another year. Maybe some Windows IoT?

64-bit Windows can run 32-bit applications, and I need it to run on old devices or to hook into 32-bit applications.

filipnavara commented 5 months ago

#97863 fixes the unwinding issue in Release builds above. The test still crashes in a pure Release configuration though. It passes when the Release DynamicGenerics.o is linked against the Debug libRuntime.WorkstationGC.a. I suspect there's still some lurking bug with clearing the Thumb bit in optimized clang code.

Stack trace:

* thread #1, name = 'DynamicGenerics', stop reason = signal SIGSEGV
    frame #0: 0x0086d6fa DynamicGenerics`WKS::GCHeap::Promote(ppObject=0x00000000, sc=<unavailable>, flags=0) at gc.cpp:48753:28 [opt]
  * frame #1: 0x008887de DynamicGenerics`GcInfoDecoder::EnumerateLiveSlots(REGDISPLAY*, bool, unsigned int, void (*)(void*, void**, unsigned int), void*) [inlined] GcInfoDecoder::ReportSlotToGC(this=0xef1fd770, slotDecoder=0xef1fd418, slotIndex=10, pRD=0xef1fd878, reportScratchSlots=true, inputFlags=1, pCallBack=<unavailable>, hCallBack=<unavailable>)(void*, void**, unsigned int), void*) at gcinfodecoder.cpp:0 [opt]
    frame #2: 0x008887be DynamicGenerics`GcInfoDecoder::EnumerateLiveSlots(this=0xef1fd770, pRD=0xef1fd878, reportScratchSlots=true, inputFlags=1, pCallBack=(DynamicGenerics`EnumGcRefsCallback(void*, void**, unsigned int) + 1 at GcEnum.cpp:119), hCallBack=0xef1fd7f0)(void*, void**, unsigned int), void*) at gcinfodecoder.cpp:1020:21 [opt]
    frame #3: 0x0088a2be DynamicGenerics`UnixNativeCodeManager::EnumGcRefs(this=<unavailable>, pMethodInfo=<unavailable>, safePointAddress=<unavailable>, pRegisterSet=<unavailable>, hCallback=0xef1fd7f0, isActiveStackFrame=true) at UnixNativeCodeManager.cpp:242:18 [opt]
    frame #4: 0x00853890 DynamicGenerics`EnumGcRefs(pCodeManager=<unavailable>, pMethodInfo=<unavailable>, safePointAddress=<unavailable>, pRegisterSet=<unavailable>, pfnEnumCallback=(DynamicGenerics`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) + 1 at gc.cpp:48747), pvCallbackData=0xef1fda30, isActiveStackFrame=<unavailable>)(Object**, ScanContext*, unsigned int), ScanContext*, bool) at GcEnum.cpp:139:19 [opt]
    frame #5: 0x00857280 DynamicGenerics`Thread::GcScanRootsWorker(this=0xf05ff890, pfnEnumCallback=(DynamicGenerics`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) + 1 at gc.cpp:48747), pvCallbackData=0xef1fda30, frameIterator=0xef1fd868)(Object**, ScanContext*, unsigned int), ScanContext*, StackFrameIterator&) at thread.cpp:523:17 [opt]
    frame #6: 0x0085706e DynamicGenerics`Thread::GcScanRoots(this=0xf05ff890, pfnEnumCallback=(DynamicGenerics`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) + 1 at gc.cpp:48747), pvCallbackData=0xef1fda30)(Object**, ScanContext*, unsigned int), ScanContext*) at thread.cpp:413:5 [opt]
    frame #7: 0x008530ca DynamicGenerics`GCToEEInterface::GcScanRoots(fn=(DynamicGenerics`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) + 1 at gc.cpp:48747), condemned=<unavailable>, max_gen=<unavailable>, sc=0xef1fda30)(Object**, ScanContext*, unsigned int), int, int, ScanContext*) at gcenv.ee.cpp:122:22 [opt]
    frame #8: 0x008660cc DynamicGenerics`WKS::gc_heap::mark_phase(condemned_gen_number=2) at gc.cpp:29214:9 [opt]
    frame #9: 0x00863d96 DynamicGenerics`WKS::gc_heap::gc1() at gc.cpp:22180:13 [opt]
    frame #10: 0x0086b18c DynamicGenerics`WKS::gc_heap::garbage_collect(n=<unavailable>) at gc.cpp:0 [opt]
    frame #11: 0x0086088c DynamicGenerics`WKS::GCHeap::GarbageCollectGeneration(this=<unavailable>, gen=2, reason=reason_induced) at gc.cpp:50291:9 [opt]
    frame #12: 0x0087a754 DynamicGenerics`WKS::GCHeap::GarbageCollect(int, bool, int) [inlined] WKS::GCHeap::GarbageCollectTry(this=<unavailable>, generation=<unavailable>, low_memory_p=<unavailable>, mode=<unavailable>) at gc.cpp:49514:12 [opt]
    frame #13: 0x0087a74c DynamicGenerics`WKS::GCHeap::GarbageCollect(this=<unavailable>, generation=2, low_memory_p=<unavailable>, mode=<unavailable>) at gc.cpp:49444:30 [opt]
    frame #14: 0x00852780 DynamicGenerics`::RhpCollect(uGeneration=<unavailable>, uMode=<unavailable>, lowMemoryP=<unavailable>) at GCHelpers.cpp:108:35 [opt]
    frame #15: 0x00942a3c DynamicGenerics`System.Runtime.InternalCalls__RhCollect(generation=<unavailable>, mode=<unavailable>, lowMemoryP=<unavailable>) at InternalCalls.cs:65
    frame #16: 0x008b1b80 DynamicGenerics`DynamicGenerics_ThreadLocalStatics_TLSTesting___c__DisplayClass3_0___MultiThreaded_Test_b__0(this=0xf4c41abc) at threadstatics.cs:466
    frame #17: 0x0092916c DynamicGenerics`System.Threading.ExecutionContext__RunFromThreadPoolDispatchLoop(threadPoolThread=0xf4c43830, executionContext=<unavailable>, callback=<unavailable>, state=<unavailable>) at ExecutionContext.cs:264
    frame #18: 0x0092e79e DynamicGenerics`System.Threading.Tasks.Task__ExecuteWithThreadLocal(this=0xf4c42934, currentTaskSlot=0xf4c43b20, threadPoolThread=<unavailable>) at Task.cs:2345
    frame #19: 0x0092b9a0 DynamicGenerics`System.Threading.ThreadPoolWorkQueue__Dispatch at ThreadPoolWorkQueue.cs:913
    frame #20: 0x0097075c DynamicGenerics`System.Threading.PortableThreadPool_WorkerThread__WorkerThreadStart at PortableThreadPool.WorkerThread.NonBrowser.cs:102
    frame #21: 0x00927a1a DynamicGenerics`System.Threading.Thread__StartThread(parameter=<unavailable>) at Thread.NativeAot.cs:448
    frame #22: 0x00927e00 DynamicGenerics`System.Threading.Thread__ThreadEntryPoint(parameter=<unavailable>) at Thread.NativeAot.Unix.cs:114
    frame #23: 0xf7c578e0 libc.so.6`start_thread(arg=0xef1fe380) at pthread_create.c:442:8
    frame #24: 0xf7cd6a1c libc.so.6 at clone.S:74

  thread #7, stop reason = signal 0
    frame #0: 0xf7c53cc8 libc.so.6`__futex_abstimed_wait_common at futex-internal.c:40:12
    frame #1: 0xf7c53cac libc.so.6`__futex_abstimed_wait_common(futex_word=0x00d68bb0, expected=0, clockid=<unavailable>, abstime=<unavailable>, private=0, cancel=true) at futex-internal.c:99:11
    frame #2: 0xf7c53e20 libc.so.6`__GI___futex_abstimed_wait_cancelable64(futex_word=<unavailable>, expected=<unavailable>, clockid=<unavailable>, abstime=<unavailable>, private=0) at futex-internal.c:139:10
    frame #3: 0xf7c56eb8 libc.so.6`___pthread_cond_wait at pthread_cond_wait.c:503:10
    frame #4: 0xf7c56d84 libc.so.6`___pthread_cond_wait(cond=0x00d68b88, mutex=0x00d68bb0) at pthread_cond_wait.c:618:10
    frame #5: 0x008855ba DynamicGenerics`GCEvent::Impl::Wait(this=0x00d68b88, milliseconds=<unavailable>, alertable=<unavailable>) at events.cpp:149:22 [opt]
    frame #6: 0x00857498 DynamicGenerics`Thread::InlineSuspend(UNIX_CONTEXT*) [inlined] Thread::WaitForGC(this=0xf05ff890, pTransitionFrame=<unavailable>) at thread.cpp:84:39 [opt]
    frame #7: 0x0085746a DynamicGenerics`Thread::InlineSuspend(this=0xf05ff890, interruptedContext=<unavailable>) at thread.cpp:878:5 [opt]
    frame #8: 0x00883c5a DynamicGenerics`ActivationHandler(code=34, siginfo=0xf05fe690, context=0xf05fe710) at PalRedhawkUnix.cpp:1008:9 [opt]
    frame #9: 0xf7c13280 libc.so.6 at sigrestorer.S:77
    frame #10: 0x009b2c56 DynamicGenerics`System.Collections.Concurrent.ConcurrentUnifierW`2_Container<System.Reflection.Runtime.TypeInfos.NativeFormat.NativeFormatRuntimeNamedTypeInfo_UnificationKey__System___Canon>__TryGetValue(this=<unavailable>, key=<unavailable>, hashCode=<unavailable>, value=0xf05fea34) at ConcurrentUnifierW.cs:185
    frame #11: 0x009b2aa4 DynamicGenerics`System.Collections.Concurrent.ConcurrentUnifierW`2<System.Reflection.Runtime.TypeInfos.NativeFormat.NativeFormatRuntimeNamedTypeInfo_UnificationKey__System___Canon>__GetOrAdd(this=0xf4be7090, key=System.Reflection.Runtime.TypeInfos.NativeFormat.NativeFormatRuntimeNamedTypeInfo_UnificationKey @ 0xf05fea5c) at ConcurrentUnifierW.cs:119
    frame #12: 0x0095c5d2 DynamicGenerics`System.Reflection.Runtime.TypeInfos.NativeFormat.NativeFormatRuntimeNamedTypeInfo__GetRuntimeNamedTypeInfo(metadataReader=<unavailable>, typeDefHandle=<unavailable>, precomputedTypeHandle=<unavailable>) at TypeUnifier.NativeFormat.cs:79
    frame #13: 0x00954a9c DynamicGenerics`System.Reflection.Runtime.General.TypeResolver__TryResolve_0(typeDefRefOrSpec=<unavailable>, reader=<unavailable>, typeContext=<unavailable>, exception=<unavailable>) at TypeResolver.NativeFormat.cs:34
    frame #14: 0x00954a48 DynamicGenerics`System.Reflection.Runtime.General.TypeResolver__Resolve_1(typeDefRefOrSpec=<unavailable>, reader=<unavailable>, typeContext=<unavailable>) at TypeResolver.NativeFormat.cs:24
    frame #15: 0x0095592e DynamicGenerics`System.Reflection.Runtime.FieldInfos.NativeFormat.NativeFormatRuntimeFieldInfo__get_FieldRuntimeType(this=<unavailable>) at NativeFormatRuntimeFieldInfo.cs:152
    frame #16: 0x0095540a DynamicGenerics`System.Reflection.Runtime.FieldInfos.RuntimeFieldInfo__get_FieldType(this=0xf4c4c70c) at RuntimeFieldInfo.cs:88
    frame #17: 0x009558d8 DynamicGenerics`System.Reflection.Runtime.FieldInfos.NativeFormat.NativeFormatRuntimeFieldInfo__TryGetFieldAccessor(this=0xf4c4c70c) at NativeFormatRuntimeFieldInfo.cs:144
    frame #18: 0x009555ae DynamicGenerics`System.Reflection.Runtime.FieldInfos.RuntimeFieldInfo__get_FieldAccessor(this=0xf4c4c70c) at RuntimeFieldInfo.cs:214
    frame #19: 0x0095543e DynamicGenerics`System.Reflection.Runtime.FieldInfos.RuntimeFieldInfo__GetValue(this=<unavailable>, obj=0x00000000) at RuntimeFieldInfo.cs:102
    frame #20: 0x008a3b10 DynamicGenerics`DynamicGenerics_ThreadLocalStatics_TLSTesting__MakeType1(typeArg=<unavailable>, checkInitialization=<unavailable>) at threadstatics.cs:310
    frame #21: 0x008b1b5e DynamicGenerics`DynamicGenerics_ThreadLocalStatics_TLSTesting___c__DisplayClass3_0___MultiThreaded_Test_b__0(this=0xf4c41abc) at threadstatics.cs:463
    frame #22: 0x0092916c DynamicGenerics`System.Threading.ExecutionContext__RunFromThreadPoolDispatchLoop(threadPoolThread=0xf4c42f90, executionContext=<unavailable>, callback=<unavailable>, state=<unavailable>) at ExecutionContext.cs:264
    frame #23: 0x0092e79e DynamicGenerics`System.Threading.Tasks.Task__ExecuteWithThreadLocal(this=0xf4c428ec, currentTaskSlot=0xf4c43660, threadPoolThread=<unavailable>) at Task.cs:2345
    frame #24: 0x0092b9a0 DynamicGenerics`System.Threading.ThreadPoolWorkQueue__Dispatch at ThreadPoolWorkQueue.cs:913
    frame #25: 0x0097075c DynamicGenerics`System.Threading.PortableThreadPool_WorkerThread__WorkerThreadStart at PortableThreadPool.WorkerThread.NonBrowser.cs:102
    frame #26: 0x00927a1a DynamicGenerics`System.Threading.Thread__StartThread(parameter=<unavailable>) at Thread.NativeAot.cs:448
    frame #27: 0x00927e00 DynamicGenerics`System.Threading.Thread__ThreadEntryPoint(parameter=<unavailable>) at Thread.NativeAot.Unix.cs:114
    frame #28: 0xf7c578e0 libc.so.6`start_thread(arg=0xf05ff380) at pthread_create.c:442:8
    frame #29: 0xf7cd6a1c libc.so.6 at clone.S:74

filipnavara commented 5 months ago

So, for the last crash in GC suspension I may need some help with verifying some assumptions. I can easily reproduce it and it's happening at the same point in the same function:

Decoding the GCInfo on the GC thread indeed shows that there's a live variable in register R12, and since it's a scratch register, regDisplay->pR12 == NULL, which in turn causes the crash.

(cc @VSadov)

jkotas commented 5 months ago

Safe points should only be created for call returns. Scratch registers cannot be live at call returns. That is why they are not handled for safe points. (@VSadov is changing some of these invariants in #95565.)

What does the code around the safe point look like? It may be useful to generate a JIT dump for the method in question to see why the JIT decided to emit the safe point at this spot.

filipnavara commented 5 months ago

What does the code around the safe point look like? It may be useful to generate a JIT dump for the method in question to see why the JIT decided to emit the safe point at this spot.

It's this code (the offsets are off by one, i.e. +85 is really +86):

    0xd92730 <+63>: bhs    0x3227d2                  ; <+225> at ConcurrentUnifierW.cs:199
    0xd92732 <+65>: mov.w  lr, #0x18
    0xd92736 <+69>: mul    lr, r3, lr
    0xd9273a <+73>: add.w  lr, lr, #0x8
    0xd9273e <+77>: add    r1, lr
    0xd92740 <+79>: mov    lr, r1
    0xd92742 <+81>: ldr.w  r12, [lr, #0xc]
-- SAFEPOINT HERE? --
    0xd92746 <+85>: ldr.w  lr, [lr, #0x10]
    0xd9274a <+89>: ldr    r4, [r0, #0x4]
    0xd9274c <+91>: cmp    r4, lr
    0xd9274e <+93>: bne    0x32275a                  ; <+105> at ConcurrentUnifierW.cs:195
    0xd92750 <+95>:  ldr    r0, [r0]
    0xd92752 <+97>:  ldrsb.w lr, [r0]
    0xd92756 <+101>: cmp    r0, r12
    0xd92758 <+103>: beq    0x322770                  ; <+127> at ConcurrentUnifierW.cs:199
    0xd9275a <+105>: ldr    r3, [r1, #0x8]
    0xd9275c <+107>: cmp.w  r3, #0xffffffff
    0xd92760 <+111>: bne    0x322726                  ; <+53> at ConcurrentUnifierW.cs:185
    0xd92762 <+113>: movs   r0, #0x0
    0xd92764 <+115>: ldr    r4, [sp, #0x24]
    0xd92766 <+117>: str    r0, [r4]
    0xd92768 <+119>: pop.w  {r4, r5, r6, r11, lr}
    0xd9276c <+123>: add    sp, #0xc

My suspicion is that the IsSafepoint answer is already wrong. I can dump the GCInfo to verify what exactly is in there but I wanted to be sure that's the right direction to look into.

VSadov commented 5 months ago

One thread gets interrupted at

Right now threads can only be interrupted in interruptible code. Volatile registers can contain GC refs there.

Threads can also self-interrupt when hitting a hijacked return. Volatile registers are dead there, but return registers may contain live GC refs.

After that, the stack walk unwinds through return sites whose returns have not happened yet. Volatile registers are dead there.

Since there are no calls around the interruption location in your sample, it must be in a fully interruptible method.

filipnavara commented 5 months ago

The crash happens only when the C runtime part is built with optimizations, so most likely there's an issue with decoding the GC info (the compiler is very eager to optimize out alignments when the wrong pointer type is used). I'll dump the GC info.

Since there are no calls around the interruption location in your sample, it must be in a fully interruptible method.

I really hope it's just misdecoded GC info... because a fully interruptible method should not use the R12 register (or we would need to save it from the frame which is trivial).

VSadov commented 5 months ago

The fact that C optimizations matter is indeed suspicious.

As for the r12 register, I do not recall whether its use as a scratch register is forbidden (I can't easily check that right now).
Typically the register set for a leaf frame includes scratch registers, since leaf frames report them to the GC. If we are not on a call return, we are probably in a leaf.

jkotas commented 5 months ago

a fully interruptible method should not use the R12 register (or we would need to save it from the frame which is trivial).

It is fine for fully interruptible methods to use the R12 register to store a GC reference. I think it is a bug that it is not initialized in the REGDISPLAY in StackFrameIterator::InternalInit(Thread * pThreadToWalk, NATIVE_CONTEXT* pCtx, uint32_t dwFlags).
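
To illustrate the idea, here is a minimal sketch of a register display that records the saved location of a scratch register when it is initialized from an interrupted thread's context. The struct and function names (`native_context_sketch`, `regdisplay_sketch`, `init_from_context`, `count_reportable`) are hypothetical, heavily simplified stand-ins, not the runtime's actual NATIVE_CONTEXT or REGDISPLAY definitions, and the sketch is not the eventual fix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the runtime's NATIVE_CONTEXT and
   REGDISPLAY; the real structures have more fields and different layouts. */
typedef struct {
    uintptr_t r4;
    uintptr_t r12;    /* scratch (volatile) register */
} native_context_sketch;

typedef struct {
    uintptr_t *pR4;
    uintptr_t *pR12;  /* NULL means "saved location of r12 is unknown" */
} regdisplay_sketch;

/* Initialize from an interrupted thread's context: every register,
   including the scratch register, has a known saved location. */
static void init_from_context(regdisplay_sketch *rd, native_context_sketch *ctx)
{
    rd->pR4 = &ctx->r4;
    rd->pR12 = &ctx->r12;
}

/* Count the register slots a GC reporter could safely dereference. */
static int count_reportable(const regdisplay_sketch *rd)
{
    return (rd->pR4 != NULL) + (rd->pR12 != NULL);
}
```

If `pR12` were left as NULL, a GC info entry marking R12 as live would make the reporter dereference a null pointer, which matches the crash described earlier in the thread.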

filipnavara commented 5 months ago

Thanks. I came to the same conclusion. The GCInfo shows that it's a fully interruptible method.

I'll send a PR.

filipnavara commented 5 months ago

State as of df0778dc9eb9f15c9270ba1a09d475253018e824:

filipnavara commented 5 months ago

With #97917 and #97919 the System.Runtime.Tests get quite far:

  [FAIL] System.Tests.TimeSpanTests.Division(timeSpan: 366.00:00:00, factor: -2.7182818284590451, expected: -994.21:23:15.2922633)
  Assert.Equal() Failure: Values are not within 14 decimal places
  Expected: -2.71828182845905 (rounded from -2.7182818284590451)
  Actual:   -2.7182818284590402 (rounded from -2.7182818284590446)
     at System.Tests.TimeSpanTests.Division(TimeSpan timeSpan, Double factor, TimeSpan expected) + 0x39
     at System.Runtime!<BaseAddress>+0x23bcff0
     at System.Reflection.DynamicInvokeInfo.InvokeWithFewArguments(IntPtr, Byte&, Byte&, Object[], BinderBundle, Boolean) + 0x55
  [FAIL] System.Tests.TimeSpanTests.NamedDivision(timeSpan: 366.00:00:00, factor: -2.7182818284590451, expected: -994.21:23:15.2922633)
  Assert.Equal() Failure: Values are not within 14 decimal places
  Expected: -2.71828182845905 (rounded from -2.7182818284590451)
  Actual:   -2.7182818284590402 (rounded from -2.7182818284590446)
     at System.Tests.TimeSpanTests.NamedDivision(TimeSpan timeSpan, Double factor, TimeSpan expected) + 0x39
     at System.Runtime!<BaseAddress>+0x23bcff0
     at System.Reflection.DynamicInvokeInfo.InvokeWithFewArguments(IntPtr, Byte&, Byte&, Object[], BinderBundle, Boolean) + 0x55
  Finished System.Runtime.Tests, Version=9.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51

  Tests run: 63505, Errors: 0, Failures: 2, Skipped: 127. Time: 30.8337846s

The failures are caused by RhpFltRound/RhpDblRound using round/roundf when .NET needs roundeven/roundevenf. Unfortunately, these are new in C23, so we probably need to use nearbyint/nearbyintf with the correct rounding mode. (UPD: Filed as #97922)

There may still be some flaky errors. (UPD: Resolved by fc05277d263b7c86a30390b72eec00c0efe6986a in #97919)

filipnavara commented 5 months ago

With #97964 we pass the NativeAOT smoke tests and System.Runtime.Tests.

filipnavara commented 5 months ago

I am slowly running out of things to fix. Would it make sense to add a CI pipeline to prevent build breaks? If so, which one would be the right one? There are some existing pipelines for smoke tests, runtime-extra-platforms, nativeaot-outerloop, and runtime-community. Build systems are not really my strong area but it would be nice to have this. There's at least one open PR that would break the build due to a missing platform check, and if we can avoid these breaks (semi-)automatically it would be welcome.

MichalStrehovsky commented 5 months ago

nativeaot-outerloop would work. I can have a look at that later today if you'd rather avoid the yaml.

I think we can set up official packaging as well. If there are no crazy problems we cannot solve, we can then bump linux-arm and linux-musl-arm into the officially supported category in .NET 9.

filipnavara commented 5 months ago

I can have a look at that later today if you'd rather avoid the yaml.

I would definitely appreciate help on that.

filipnavara commented 4 months ago

Remaining runtime tests that need fixing:

Notes:

filipnavara commented 4 months ago

The failure in JIT/Directed/Directed_1 is an instance of issue #95517. The JIT generates the following code:

boxunboxvaluetype_ro`boxunboxvaluetype_ro_NullableTest43__BoxUnboxToNQGen<System.Nullable`1<boxunboxvaluetype_ro_WithMultipleGCHandleStruct>>:
    0x73bb60 <+0>:   push   {r0, r1, r2, r3}
    0x73bb62 <+2>:   push.w {r11, lr}
    0x73bb66 <+6>:   sub    sp, #0x58
    0x73bb68 <+8>:   add.w  r11, sp, #0x58
    0x73bb6c <+12>:  add    r1, sp, #0x60
    0x73bb6e <+14>:  movw   r0, #0xd5ee
    0x73bb72 <+18>:  movt   r0, #0xa
    0x73bb76 <+22>:  add    r0, pc
    0x73bb78 <+24>:  bl     0x64dfe0                  ; RhBox
    0x73bb7c <+28>:  adds   r1, r0, #0x4
    0x73bb7e <+30>:  add    r0, sp, #0x2c
    0x73bb80 <+32>:  movs   r2, #0x14
->  0x73bb82 <+34>:  blx    0x750d20                  ; symbol stub for: memcpy

The unboxing code calls memcpy with a source pointer derived from a null box (r0 + 4 after RhBox returns), which causes a segfault instead of the expected null reference exception.
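
A minimal sketch of the missing check, written as plain C rather than generated code: the box pointer is tested before the payload address is computed and handed to memcpy. The function name `unbox_copy` and the constant `SKETCH_HEADER_SIZE` are hypothetical; the 4-byte header only mirrors the `adds r1, r0, #0x4` in the listing above, and this is not how the JIT fix itself is implemented.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout constant for illustration: a 4-byte object header
   followed by the value payload. */
enum { SKETCH_HEADER_SIZE = 4 };

/* Copies the payload of a boxed value into dest. Returns 0 on success and
   -1 when box is NULL, the case where a managed runtime would raise a
   null reference exception instead of calling memcpy. */
static int unbox_copy(void *dest, const void *box, size_t payload_size)
{
    if (box == NULL) {
        return -1;  /* explicit null check before computing box + header */
    }
    memcpy(dest, (const unsigned char *)box + SKETCH_HEADER_SIZE, payload_size);
    return 0;
}
```

Without the check, a null box turns into the small non-null address 4, so memcpy faults on a pointer that no longer looks like a plain null dereference.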

emmauss commented 3 months ago

Any plans on linux-bionic-arm builds being enabled for nuget?

filipnavara commented 3 months ago

Any plans on linux-bionic-arm builds being enabled for nuget?

Yes. The runtime bits already work on Bionic. There is, however, some packaging issue but I would need @MichalStrehovsky to chime in on that.

MichalStrehovsky commented 3 months ago

Any plans on linux-bionic-arm builds being enabled for nuget?

Yes. The runtime bits already work on Bionic. There is, however, some packaging issue but I would need @MichalStrehovsky to chime in on that.

#99667 should add it.

sonatique commented 2 months ago

Hello all, thanks a lot for all these wonderful efforts. I really look forward to being able to use Arm32 AOT. I am quite new here (I mean to the cutting-edge .NET development process). Given that all the PRs I can see above have been merged into dotnet:main for a while now, I naively tried to use one of the latest daily builds from here: https://github.com/dotnet/installer#installers-and-binaries. I unzipped dotnet-sdk-8.0.300-preview.24209.26-win-x64.zip and started using it. I was able to publish a Hello World as Win-x64 AOT, but trying -c linux-arm always tells me: "c:\dotnet\sdk\8.0.300-preview.24209.26\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.FrameworkReferenceResolution.targets(90,5): error NETSDK1203: Ahead-of-time compilation is not supported for the target runtime identifier 'linux-arm'", as if I were on plain .NET 8.

I think I tried to follow all the instructions found at https://github.com/dotnet/runtime/blob/main/docs/project/dogfooding.md and https://github.com/dotnet/installer?tab=readme-ov-file#installers-and-binaries but to no avail.

Could you tell me what I am doing wrong? Is there a resource to help newcomers build AOT for Linux ARM32? Sorry if it is obvious where to look and I missed it.

Thanks in advance!

Suchiman commented 2 months ago

I unzipped dotnet-sdk-8.0.300-preview.24209.26-win-x64.zip and started using it

That is a .NET 8 build; you need a .NET 9 build, so this column: [image]

Also cross compilation is not supported out of the box yet, so you need to be on a linux machine to publish linux-arm.

sonatique commented 2 months ago

@Suchiman: thanks a lot for your crazy fast reply! OK, I see, I completely overlooked the fact that the feature is targeted for .NET 9 (and, obviously, that cross-compilation is not yet supported).

OK cool, thanks again I'll try based on this!

sonatique commented 2 months ago

Also cross compilation is not supported out of the box yet, so you need to be on a linux machine to publish linux-arm.

Sorry, just to be sure: do I need to be on a Linux ARM32 machine, or will any Linux do? What about being on Linux x64, for instance? Does the processor-architecture part of the cross-compilation work (unlike the OS part), or is there no cross-compilation at all?

Thanks!

filipnavara commented 2 months ago

do I need to be on a Linux ARM32 machine, or will any Linux do? What about being on Linux x64, for instance? Does the processor-architecture part of the cross-compilation work (unlike the OS part), or is there no cross-compilation at all?

Any Linux will do. Compiling on linux-x64 (cross) and linux-arm (native) are the two tested scenarios.

(You may need appropriate native cross-compilation tools. I think debootstrap can be used to download the necessary libraries; clang and lld can cross-compile natively.)

sonatique commented 2 months ago

@filipnavara thanks for your equally prompt and informative reply!

sonatique commented 2 months ago

Does anyone have (good) experience using WSL 2 as the linux-x64 machine when doing AOT for linux-arm?

filipnavara commented 2 months ago

Does anyone have (good) experience using WSL 2 as the linux-x64 machine when doing AOT for linux-arm?

Yes, I used WSL 2 for the initial bring-up and even used QEMU user emulation to run some of the tests.

sonatique commented 2 months ago

@filipnavara , cool, I'll start that way then.

Something else I am thinking now is that it would be super cool if someone with experience published a Docker container with everything required to instantly have AOT for linux-arm up and running, so that one could focus on testing / contributing. Any chance that this exists somewhere already?

filipnavara commented 2 months ago

I am not a Docker fan but it may already exist at https://github.com/dotnet/dotnet-buildtools-prereqs-docker (and the prebuilt images from that repository are hosted by MS). (/cc @am11 if you have more insight on that)

sonatique commented 2 months ago

Thanks a lot @filipnavara. I am not really into Docker myself either, but it came to my mind that in this case it could be a good way to quickly get my first AOT-built binary to play with before getting serious about it. (I am especially interested in comparing startup time and other performance-related things between AOT and non-AOT, in order to validate whether AOT is the way to go for my project.)

Thanks again: I will try these images and then WSL 2 (or the other way round, whichever is simpler for me ;-) )

am11 commented 2 months ago

On Apple Silicon, Docker depends on QEMU and provides Rosetta 2 emulation as an alternative. Unfortunately, Rosetta 2 only supports x64 and arm64, so we have to use the QEMU that comes bundled with Docker Desktop. I was running into https://github.com/docker/for-mac/issues/7172, which turned out to be an mmap issue. So DOTNET_EnableWriteXorExecute=0 to the rescue!

@filipnavara, while preparing a solution for @sonatique, I ran into an assertion. 🥲

Repro:

FROM --platform=linux/arm/v7 ubuntu:latest AS builder

RUN apt update && apt install -y clang zlib1g-dev curl

RUN mkdir -p "$HOME/.dotnet9" "$HOME/.nuget/NuGet"; \
  curl -sSL https://dot.net/v1/dotnet-install.sh | bash /dev/stdin --quality daily --channel 9.0 --install-dir "$HOME/.dotnet9"; \
  cat > "$HOME/.nuget/NuGet/NuGet.Config" <<EOF
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" />
    <add key="dotnet9" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet9/nuget/v3/index.json" />
</packageSources>
</configuration>
EOF

ENV DOTNET_NOLOGO=1
ENV DOTNET_EnableWriteXorExecute=0

RUN export dotnet9="$HOME/.dotnet9/dotnet"; \
  "$dotnet9" new webapiaot -n webapi1 && \
  "$dotnet9" publish webapi1 -o dist -c Release

FROM --platform=linux/arm/v7 ubuntu:latest
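# publish runs from / in the builder stage (see /webapi1/webapi1.csproj in the log), so the output lands in /dist
COPY --from=builder /dist /app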

CMD ["/app/webapi1"]

How to run it:

$ mkdir /tmp/arm-builder
$ cd /tmp/arm-builder
$ nano Dockerfile
# paste the contents posted above, save and close the editor.
$ docker build . -t armv7-nativeaot-webapi

[builder 4/4] RUN export dotnet9="$HOME/.dotnet9/dotnet"; "$dotnet9" new webapiaot -n webapi1 && "$dotnet9" publish webapi1 -o dist -c Release:
10.40 The template "ASP.NET Core Web API (native AOT)" was created successfully.
10.41
10.41 Processing post-creation actions...
10.41 Restoring /webapi1/webapi1.csproj:
15.71 Determining projects to restore...
31.18 Assertion failed: (dc->base.pc_next & 1) == 0 (../target/arm/tcg/translate.c: thumb_tr_translate_insn: 9428)
31.20 Aborted

filipnavara commented 2 months ago

I ran into an assertion.

That’s an assertion in QEMU. While I enjoy debugging things like these, it’s not likely to get fixed any time soon :)

am11 commented 2 months ago

That’s an assertion in QEMU.

Yup, from https://gitlab.com/qemu-project/qemu/-/blob/02e16ab9f4f19c4bdd17c51952d70e2ded74c6bf/target/arm/tcg/translate.c#L9429.
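For context, the asserted condition `(dc->base.pc_next & 1) == 0` is a plain bit test: the address of the next Thumb instruction to translate must have bit 0 clear. The sketch below reproduces that test in shell arithmetic, using the two addresses from the `printf` output earlier in this thread (`4b9539` and `4b9538`). It only illustrates the check; it does not explain why QEMU ended up with an odd address. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch of the bit test behind "(pc & 1) == 0", using addresses from the thread.

is_even_pc() {
  # Succeeds when bit 0 of the address in $1 is clear.
  [ $(( $1 & 1 )) -eq 0 ]
}

clear_low_bit() {
  # Prints the address in $1 with bit 0 masked off, in hex.
  printf '%x\n' $(( $1 & ~1 ))
}

clear_low_bit 0x4b9539   # prints 4b9538
is_even_pc 0x4b9538 && echo "0x4b9538: bit 0 clear"
is_even_pc 0x4b9539 || echo "0x4b9539: bit 0 set"
```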

am11 commented 2 months ago

@sonatique, meanwhile, we can cross-build using a builder for the host architecture and then test the binary under emulation (NativeAOT binaries are not crashing under QEMU).

Dockerfile:

FROM --platform=$BUILDPLATFORM ubuntu:latest AS builder

RUN apt update && apt install -y clang debootstrap curl lld llvm

RUN mkdir -p "$HOME/.dotnet9" "$HOME/.nuget/NuGet"; \
  curl -sSL https://dot.net/v1/dotnet-install.sh | bash /dev/stdin --quality daily --channel 9.0 --install-dir "$HOME/.dotnet9"; \
  cat > "$HOME/.nuget/NuGet/NuGet.Config" <<EOF
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" />
    <add key="dotnet9" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet9/nuget/v3/index.json" />
</packageSources>
</configuration>
EOF

ENV DOTNET_NOLOGO=1
ENV ROOTFS_DIR=/crossrootfs/arm

RUN mkdir /dev/arm; \
  curl -sSL https://raw.githubusercontent.com/dotnet/arcade/main/eng/common/cross/arm/sources.list.jammy -o /dev/arm/sources.list.jammy; \
  curl -sSL https://raw.githubusercontent.com/dotnet/arcade/main/eng/common/cross/build-rootfs.sh |\
    bash /dev/stdin arm jammy llvm15 lldb15

RUN export dotnet9="$HOME/.dotnet9/dotnet"; \
  "$dotnet9" new webapiaot -n webapi1 && \
  "$dotnet9" publish webapi1 -o dist -c Release -r linux-arm -p:LinkerFlavor=lld \
    -p:ObjCopy=llvm-objcopy -p:SysRoot="$ROOTFS_DIR"

Usage:

$ mkdir /tmp/arm-builder
$ cd /tmp/arm-builder
$ nano Dockerfile
# paste the contents posted above, save and close the editor.

# build the container
$ docker build . -t armv7-nativeaot-webapi
# create a temporary instance and copy the artifacts locally
$ docker cp $(docker create --name webapi1 armv7-nativeaot-webapi):/dist/webapi1 .
$ docker rm webapi1

# test the binary by running it with arm container
$ docker run -e 'ASPNETCORE_URLS=http://*:8080' --rm -v$(pwd):/app --platform linux/arm/v7 \
    -p 8080:8080 ubuntu /app/webapi1

# in a separate terminal
$ curl -s http://localhost:8080/todos | jq
[
  {
    "id": 1,
    "title": "Walk the dog",
    "dueBy": null,
    "isComplete": false
  },
  {
    "id": 2,
    "title": "Do the dishes",
    "dueBy": "2024-04-11",
    "isComplete": false
  },
  {
    "id": 3,
    "title": "Do the laundry",
    "dueBy": "2024-04-12",
    "isComplete": false
  },
  {
    "id": 4,
    "title": "Clean the bathroom",
    "dueBy": null,
    "isComplete": false
  },
  {
    "id": 5,
    "title": "Clean the car",
    "dueBy": "2024-04-13",
    "isComplete": false
  }
]

sonatique commented 2 months ago

@am11: thank you very much for what you provided. I think this will be quite helpful. I am now trying the images provided by https://mcr.microsoft.com/en-us/product/dotnet/nightly/sdk/tags with WSL and Docker.

I am currently testing by building the default console program natively (no AOT yet) for linux-x64 using mcr.microsoft.com/dotnet/nightly/sdk:9.0-preview-jammy-aot, but I keep getting "Unable to find package Microsoft.NET.ILLink.Tasks with version (>= 9.0.0-preview.4.24211.4)".

It seems there is something wrong with the NuGet configuration in the Docker image that MS provides, or (more probably) I am missing something.

I will try a little bit more and then I think I'll turn to your solution, thanks a lot for providing it!

EDIT: creating the .nuget folders and config file, filled with the content I can see in your Dockerfile and as indicated here: https://github.com/dotnet/runtime/blob/main/docs/project/dogfooding.md#install-prerequisites, makes everything work.

Why there is not even a .nuget folder in the MS image is beyond me.
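The fix described in the EDIT above can be sketched as a small shell function. The XML is copied from am11's Dockerfile in this thread; the function name `write_nuget_config` and its optional directory argument are hypothetical conveniences, not part of any SDK tooling.

```shell
#!/bin/sh
# Sketch: create the NuGet.Config that adds the dotnet9 daily feed.
# The feed URLs are the ones used in am11's Dockerfile.

write_nuget_config() {
  # $1 = home directory to write under (defaults to $HOME).
  dir="${1:-$HOME}/.nuget/NuGet"
  mkdir -p "$dir"
  cat > "$dir/NuGet.Config" <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" />
    <add key="dotnet9" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet9/nuget/v3/index.json" />
</packageSources>
</configuration>
EOF
}
```

Calling `write_nuget_config` with no argument inside the container, before `dotnet restore` or `dotnet publish`, would write the file to the user-level location under `$HOME`.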

sonatique commented 2 months ago

Hello @am11, I tried my best with your Dockerfile, but it always fails during docker build when trying to execute build-rootfs, with

409.4 Building dependency tree...
410.0 E: Unable to locate package liblldb-3.9-dev
410.0 E: Couldn't find any package by glob 'liblldb-3.9-dev'
410.0 E: Couldn't find any package by regex 'liblldb-3.9-dev'

Using "llvm" instead of "llvm14" at bash /dev/stdin arm bionic llvm14 lets the docker build finish.

However, when running dotnet9 publish -r linux-arm -v diag on a previously created default console project in which I only set the PublishAot property to true, I get /usr/bin/ld.bfd: unrecognised emulation mode: armelf_linux_eabi

In fact, this is exactly the same error that I got when trying with mcr.microsoft.com/dotnet/nightly/sdk:9.0-preview-jammy-aot + adding the NuGet config file.

So llvm14 is probably required, but I cannot get past this failure with liblldb-3.9-dev...

am11 commented 2 months ago

Hey @sonatique, sorry for the confusion. I was testing these steps in a container and stitched them together as a Dockerfile; I should have tested the final version. 😅

I've now updated the Dockerfile and executed all steps in WSL, let's give it another try!

Changes:

sonatique commented 2 months ago

@am11 : trying now. Thanks a lot!

sonatique commented 2 months ago

@am11, I am still getting the same issue. I must be doing something wrong. Here is what I get after running docker build with your newer Dockerfile (I just commented out the last RUN line). Everything went smoothly up to the dotnet publish command:

/root/.dotnet9/dotnet publish -r linux-arm -v diag
/root/.dotnet9/sdk/9.0.100-preview.4.24215.10/MSBuild.dll -nologo --property:_IsPublishing=true -property:RuntimeIdentifier=linux-arm -property:_CommandLineDefinedRuntimeIdentifier=true -property:Configuration=Release -distributedlogger:Microsoft.DotNet.Tools.MSBuild.MSBuildLogger,/root/.dotnet9/sdk/9.0.100-preview.4.24215.10/dotnet.dll*Microsoft.DotNet.Tools.MSBuild.MSBuildForwardingLogger,/root/.dotnet9/sdk/9.0.100-preview.4.24215.10/dotnet.dll -maxcpucount -restore -target:Publish -tlp:default=auto -verbosity:m -verbosity:diag ./test-console-1.csproj
Restore complete (0.9s)
    Determining projects to restore...
    All projects are up-to-date for restore.
You are using a preview version of .NET. See: https://aka.ms/dotnet-support-policy
  test-console-1 failed with 2 error(s) (1.5s) → bin/Release/net9.0/linux-arm/test-console-1.dll
    /usr/bin/ld.bfd: unrecognised emulation mode: armelf_linux_eabi
    Supported emulations: elf_x86_64 elf32_x86_64 elf_i386 elf_iamcu elf_l1om elf_k1om i386pep i386pe
    clang : error : linker command failed with exit code 1 (use -v to see invocation)

Any idea, by chance? Note that I am on an x64 machine. Since Windows ARM64 is not very common I didn't mention it earlier, but I see no arm* "supported emulations" in the list; I wonder why.

In case you wonder, here is my csproj file:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net9.0</TargetFramework>
    <RootNamespace>test_console_1</RootNamespace>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
    <PublishTrimmed>true</PublishTrimmed>
    <PublishSingleFile>true</PublishSingleFile>
    <InvariantGlobalization>true</InvariantGlobalization>
    <PublishAot>true</PublishAot>
  </PropertyGroup>

</Project>

and I just have the default Program.cs generated by dotnet new console

Thanks in advance!

am11 commented 2 months ago

Here is what I get after I docker built with your newer dockerfile

First try the exact Dockerfile; if that works (which it does on the two machines I've tested on), then customize it to your project.
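One visible difference between the two runs: the Dockerfile's publish line passes `-p:LinkerFlavor=lld`, `-p:ObjCopy=llvm-objcopy` and `-p:SysRoot=...`, while the failing command was just `publish -r linux-arm -v diag`, which ended up invoking the host's `ld.bfd`. That this is the whole cause is an assumption, but the "Supported emulations" line in the error shows that this linker was built for x86 targets only. A small sketch of the same check, with a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: check whether a linker's "Supported emulations" list covers a target.

has_emulation() {
  # $1 = emulation name, $2 = space-separated list as printed by ld.
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# List copied verbatim from the error output posted above.
host_ld="elf_x86_64 elf32_x86_64 elf_i386 elf_iamcu elf_l1om elf_k1om i386pep i386pe"

if has_emulation armelf_linux_eabi "$host_ld"; then
  echo "host ld can link linux-arm"
else
  echo "host ld cannot link linux-arm; a cross-capable linker such as lld is needed"
fi
```

On a real machine, `ld -V` prints the emulations the installed GNU linker supports, which can be fed to the same helper.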

sonatique commented 2 months ago

@am11: well, obviously I should have started there because, as you expected, everything completes without error with your full Dockerfile. I was a bit bold thinking I could directly do what I wanted. Thanks a lot, your help has been invaluable. I will now try to customize it to suit my needs.

EDIT: I have been able to achieve what I wanted, super great, thanks!