dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[NativeAOT][osx-arm64] Crash when loading shared library as an LLDB plugin #99172

Closed: lambdageek closed this issue 9 months ago

lambdageek commented 9 months ago

I am writing an LLDB plugin using NativeAOT. When I load my plugin into Apple's LLDB, it crashes when the runtime tries to change the memory protection on __security_cookie.

Repro

dotnet new classlib

using System.Runtime.InteropServices;
namespace l2;

public class Class1
{
    // lldb::PluginInitialize(SBDebugger)
    [UnmanagedCallersOnly(EntryPoint = "_ZN4lldb16PluginInitializeENS_10SBDebuggerE")]
    public static int LLDBPluginInitialize()
    {
        Console.WriteLine("hello");
        return 1;
    }
}

$ dotnet publish -r osx-arm64 -p:PublishAot=true

Expected output

$ lldb -b -o "plugin load ./bin/Release/net8.0/osx-arm64/publish/l2.dylib"
(lldb) plugin load ./bin/Release/net8.0/osx-arm64/publish/l2.dylib
hello

Actual output

$ lldb -b -o "plugin load ./bin/Release/net8.0/osx-arm64/publish/l2.dylib"
(lldb) plugin load ./bin/Release/net8.0/osx-arm64/publish/l2.dylib
PLEASE submit a bug report to https://developer.apple.com/bug-reporting/ and include the crash backtrace.
Stack dump:
0.  Program arguments: /Applications/Xcode-15.2.0.app/Contents/Developer/usr/bin/lldb -b -o "plugin load ./bin/Release/net8.0/osx-arm64/publish/l2.dylib"
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  lldb                     0x000000010099b7dc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 56
1  lldb                     0x000000010099ad38 llvm::sys::RunSignalHandlers() + 112
2  lldb                     0x000000010099be14 SignalHandler(int) + 304
3  libsystem_platform.dylib 0x0000000182515a24 _sigtramp + 56
4  libsystem_pthread.dylib  0x00000001824e5cc0 pthread_kill + 288
5  libsystem_c.dylib        0x00000001823f1a40 abort + 180
6  l2.dylib                 0x0000000101e2bdac
7  l2.dylib                 0x0000000101de2db8
8  l2.dylib                 0x0000000101eafd84 lldb::PluginInitialize(lldb::SBDebugger) + 20
9  LLDB                     0x0000000112d5d88c LoadPlugin(std::__1::shared_ptr<lldb_private::Debugger> const&, lldb_private::FileSpec const&, lldb_private::Status&) + 244
10 LLDB                     0x0000000112f01824 lldb_private::Debugger::LoadPlugin(lldb_private::FileSpec const&, lldb_private::Status&) + 92
11 LLDB                     0x00000001136074a4 CommandObjectPluginLoad::DoExecute(lldb_private::Args&, lldb_private::CommandReturnObject&) + 164
12 LLDB                     0x0000000112fedad0 lldb_private::CommandObjectParsed::Execute(char const*, lldb_private::CommandReturnObject&) + 656
13 LLDB                     0x0000000112fe475c lldb_private::CommandInterpreter::HandleCommand(char const*, lldb_private::LazyBool, lldb_private::CommandReturnObject&, bool) + 2024
14 LLDB                     0x0000000112fe7f9c lldb_private::CommandInterpreter::IOHandlerInputComplete(lldb_private::IOHandler&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&) + 828
15 LLDB                     0x0000000112f1f7c8 lldb_private::IOHandlerEditline::Run() + 304
16 LLDB                     0x0000000112f040c4 lldb_private::Debugger::RunIOHandlers() + 140
17 LLDB                     0x0000000112fe9178 lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) + 196
18 LLDB                     0x0000000112d65afc lldb::SBDebugger::RunCommandInterpreter(lldb::SBCommandInterpreterRunOptions const&) + 112
19 lldb                     0x000000010098c034 Driver::MainLoop() + 2068
20 lldb                     0x000000010098cd14 main + 2036
21 dyld                     0x00000001821650e0 start + 2360

The Console.app shows a better crash dump with symbols

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib                 0x1824ae0dc __pthread_kill + 8
1   libsystem_pthread.dylib                0x1824e5cc0 pthread_kill + 288
2   libsystem_c.dylib                      0x1823f1ad4 __abort + 136
3   libsystem_c.dylib                      0x1823f1a4c abort + 192
4   l2.dylib                               0x101e2bdac RaiseFailFastException + 16 (PalRedhawkUnix.cpp:94)
5   l2.dylib                               0x101de2db8 PalRaiseFailFastException(_EXCEPTION_RECORD32*, _CONTEXT*, unsigned int) + 16 (PalRedhawkFunctions.h:144) [inlined]
6   l2.dylib                               0x101de2db8 Thread::EnsureRuntimeInitialized() + 84 (thread.cpp:1215) [inlined]
7   l2.dylib                               0x101de2db8 Thread::ReversePInvokeAttachOrTrapThread(ReversePInvokeFrame*) + 188 (thread.cpp:1176)
8   l2.dylib                               0x101eafd84 lldb::PluginInitialize(lldb::SBDebugger) + 20 (Class1.cs:10)
9   LLDB                                   0x112d5d88c LoadPlugin(std::__1::shared_ptr<lldb_private::Debugger> const&, lldb_private::FileSpec const&, lldb_private::Status&) + 244

Repros with .NET 8.0.2 and .NET 9 main.

With some printf debugging I found that the crash happens because this call fails:

https://github.com/dotnet/runtime/blob/91008723d1f2c2daa29d69e1bd641ef651e39dc8/src/coreclr/nativeaot/Runtime/startup.cpp#L173-L176

Which fails because this call to mprotect fails:

https://github.com/dotnet/runtime/blob/91008723d1f2c2daa29d69e1bd641ef651e39dc8/src/coreclr/nativeaot/Runtime/unix/PalRedhawkUnix.cpp#L831

lambdageek commented 9 months ago

It only reproduces when the shared library is loaded as an LLDB plugin.

Loading it into a user-built host app that is codesigned to use the Hardened Runtime doesn't cause the same crash; mprotect works fine there.

I read somewhere that vm_protect(..., VM_PROT_WRITE | VM_PROT_COPY) might succeed in more cases than mprotect, or than vm_protect without VM_PROT_COPY, but the following doesn't fix it:

    vm_prot_t machProtect = UnixProtectToMachProtect(unixProtect);
    mach_port_t selfTask = mach_task_self();
    printf ("trying first mach_vm_protect (selfTask, %p, %zu, 0, 0x%08x)\n", pPageStart, memSize, (int)machProtect);
    kern_return_t ret = mach_vm_protect (selfTask, (mach_vm_address_t)pPageStart, (mach_vm_size_t)memSize, /*setMaximum*/0, machProtect);
    printf("first vm_protect returned %d (eq KERN_SUCCESS? %s)\n", (int)ret, (ret == KERN_SUCCESS) ? "yes" : "no");
    /* see mach/vm_prot.h VM_PROT_COPY note:
     *      When a caller finds that he cannot obtain write permission on a
     *      mapped entry, the following flag can be used.  The entry will
     *      be made "needs copy" effectively copying the object (using COW),
     *      and write permission will be added to the maximum protections
     *      for the associated entry.
     */
    if (ret != KERN_SUCCESS && (machProtect & VM_PROT_WRITE) != 0) {
        printf ("trying second mach_vm_protect (selfTask, %p, %zu, 0, 0x%08x)\n", pPageStart, memSize, (int)(machProtect | VM_PROT_COPY));
        ret = mach_vm_protect(selfTask, (mach_vm_address_t)pPageStart, (mach_vm_size_t)memSize, 0, machProtect | VM_PROT_COPY);
        printf("second mach_vm_protect returned %d (eq KERN_SUCCESS? %s)\n", (int)ret, (ret == KERN_SUCCESS) ? "yes" : "no");
    }
    return ret != KERN_SUCCESS; /* PalVirtualProtect returns 0 on success */
ghost commented 9 months ago

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas See info in area-owners.md if you want to be subscribed.

Author: lambdageek
Assignees: -
Labels: `untriaged`, `area-NativeAOT-coreclr`, `needs-area-label`
Milestone: -
lambdageek commented 9 months ago

I'm going to open a PR to disable FEATURE_READONLY_GS_COOKIE on macOS. It's already disabled for the iOS-like platforms:

https://github.com/dotnet/runtime/blob/91008723d1f2c2daa29d69e1bd641ef651e39dc8/src/coreclr/nativeaot/Runtime/CMakeLists.txt#L274-L276