Open lambdageek opened 8 months ago
Tagging subscribers to this area: @vitek-karas, @agocke, @vsadov See info in area-owners.md if you want to be subscribed.
This is likely not a hosting issue. See also https://github.com/dotnet/runtime/issues/99172 - the LLDB environment is sufficiently different that some runtime functionality (for example messing with VM page protections in order to initialize the stack guard cookie) just doesn't succeed.
If you run without the PAL_MachExceptionMode workaround, Console.app shows a bit of what went wrong:
Process: lldb [16099]
Path: /Applications/Xcode-15.3.0.app/Contents/Developer/usr/bin/lldb
Identifier: lldb
Version: ???
Code Type: ARM-64 (Native)
Parent Process: zsh [68922]
Responsible: iTerm2 [731]
User ID: 501
Date/Time: 2024-03-19 15:14:23.4631 -0400
OS Version: macOS 14.4 (23E214)
Report Version: 12
Anonymous UUID: 1143D3D0-7711-BC35-8E10-8642D5EAA935
Sleep/Wake UUID: E0C87088-29D9-4EEE-A407-0796A7947768
Time Awake Since Boot: 22000 seconds
Time Since Wake: 9821 seconds
System Integrity Protection: enabled
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_GUARD (SIGKILL)
Exception Codes: GUARD_TYPE_MACH_PORT
Exception Codes: 0x00000000000227d0, 0x0000000000000000
Termination Reason: Namespace GUARD, Code 2305843036766218192
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x196cfa1f4 mach_msg2_trap + 8
1 libsystem_kernel.dylib 0x196d0cb24 mach_msg2_internal + 80
2 libsystem_kernel.dylib 0x196d29db0 thread_swap_exception_ports + 368
3 libcoreclr.dylib 0x10450faa4 CorUnix::CPalThread::EnableMachExceptions() + 108
4 libcoreclr.dylib 0x10450e75c CorUnix::CreateThreadData(CorUnix::CPalThread**) + 280
5 libcoreclr.dylib 0x1044e8508 Initialize(int, char const* const*, unsigned int) + 1244
6 libcoreclr.dylib 0x1044e8964 PAL_InitializeCoreCLR + 60
7 libcoreclr.dylib 0x104512758 coreclr_initialize + 500
8 libhostpolicy.dylib 0x10350acd8 coreclr_t::create(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, char const*, char const*, coreclr_property_bag_t const&, std::__1::unique_ptr<coreclr_t, std::__1::default_delete<coreclr_t>>&) + 392
9 libhostpolicy.dylib 0x103522e88 (anonymous namespace)::create_coreclr() + 424
10 libhostfxr.dylib 0x103454bf8 fx_muxer_t::run_app(host_context_t*) + 448
11 libhihost.dylib 0x1027a3c60 start_runtime + 596
12 libhihost.dylib 0x1027a37f4 lldb::PluginInitialize(lldb::SBDebugger) + 144
13 LLDB 0x115678c68 lldb::SBDebugger::InitializeWithErrorHandling()::$_0::__invoke(std::__1::shared_ptr<lldb_private::Debugger> const&, lldb_private::FileSpec const&, lldb_private::Status&) + 244
14 LLDB 0x115821850 lldb_private::Debugger::LoadPlugin(lldb_private::FileSpec const&, lldb_private::Status&) + 92
15 LLDB 0x115f3ebf8 CommandObjectPluginLoad::DoExecute(lldb_private::Args&, lldb_private::CommandReturnObject&) + 164
16 LLDB 0x115910de4 lldb_private::CommandObjectParsed::Execute(char const*, lldb_private::CommandReturnObject&) + 660
17 LLDB 0x115907734 lldb_private::CommandInterpreter::HandleCommand(char const*, lldb_private::LazyBool, lldb_private::CommandReturnObject&, bool) + 2172
18 LLDB 0x11590b0d4 lldb_private::CommandInterpreter::IOHandlerInputComplete(lldb_private::IOHandler&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&) + 828
19 LLDB 0x1158407a4 lldb_private::IOHandlerEditline::Run() + 304
20 LLDB 0x115823fe0 lldb_private::Debugger::RunIOHandlers() + 140
21 LLDB 0x11590c320 lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) + 196
22 LLDB 0x115675510 lldb::SBDebugger::RunCommandInterpreter(lldb::SBCommandInterpreterRunOptions const&) + 112
23 lldb 0x10236b96c Driver::MainLoop() + 2700
24 lldb 0x10236c634 main + 2040
25 dyld 0x1969b20e0 start + 2360
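As a cross-check on the report above, the "Termination Reason" code can be split into bit fields. This is a minimal sketch, assuming the layout used by the EXC_GUARD_DECODE_* macros in XNU's kern/exc_guard.h (guard type in bits 63:61, flavor in bits 60:32, target in bits 31:0). The function name decode_guard_code is hypothetical and not part of any Apple or .NET tooling.

```python
# Sketch only: the field layout is an assumption based on the
# EXC_GUARD_DECODE_* macros in XNU's kern/exc_guard.h.
def decode_guard_code(code: int) -> dict:
    """Split a 64-bit EXC_GUARD code into its (assumed) bit fields."""
    return {
        "guard_type": (code >> 61) & 0x7,      # bits 63:61
        "flavor": (code >> 32) & 0x1FFFFFFF,   # bits 60:32
        "target": code & 0xFFFFFFFF,           # bits 31:0
    }


# Value copied from the "Termination Reason" line of the crash report.
fields = decode_guard_code(2305843036766218192)
print({name: hex(value) for name, value in fields.items()})
```

Under that assumed layout the guard type field is 1, which would be consistent with the GUARD_TYPE_MACH_PORT line in the report if that constant is 0x1, as I believe it is in the XNU headers. I have not tried to interpret the flavor or target fields.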
I can actually see another mode of this issue on my 14.4 (SIP disabled). The same failure GUARD_TYPE_MACH_PORT occurs when we call thread_set_state during hardware exception handling in the plugin in the SEHExceptionThread function.
So it seems that with SIP disabled, we can get past the PAL initialization, but we crash there instead. In this case, we are injecting exception handler code into the target thread, and since the thread belongs to lldb, it kind of makes sense that it may be guarded.
This is related to https://github.com/dotnet/diagnostics/issues/4259 and https://github.com/dotnet/diagnostics/issues/4551
SOS is an LLDB plugin that hosts a CoreCLR runtime. It has been failing to work on recent versions of macOS / Xcode, and in macOS Sonoma 14.4 loading the plugin actually kills the LLDB process entirely (see https://github.com/dotnet/diagnostics/issues/4551).
I have only tried on osx-arm64. SIP is not disabled.
I have created a standalone repro https://github.com/lambdageek/repro-coreclr-lldb
Build:
Run:
The above happens with macOS Sonoma 14.4. With 14.3, you get a bit further, but the runtime will still fail to initialize.
In https://github.com/dotnet/diagnostics/issues/4551 we found a workaround to at least get past the whole LLDB process aborting, by passing PAL_MachExceptionMode=7
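For anyone scripting the workaround, here is a minimal sketch of launching lldb with the setting applied. It assumes PAL_MachExceptionMode is read by the runtime as an environment variable and that an lldb executable is on PATH. The helper names env_with_workaround and launch_lldb are hypothetical.

```python
import os
import subprocess


def env_with_workaround(base=None):
    """Return a copy of the environment with PAL_MachExceptionMode=7 set.

    Assumption: the CoreCLR PAL picks this value up from the environment.
    """
    env = dict(os.environ if base is None else base)
    env["PAL_MachExceptionMode"] = "7"
    return env


def launch_lldb(extra_args=()):
    """Start lldb with the workaround applied (assumes lldb is on PATH)."""
    return subprocess.run(["lldb", *extra_args], env=env_with_workaround(), check=False)


# Usage (not executed here): launch_lldb()
```

Note that this only gets past the LLDB process being killed. Per the discussion above, the runtime may still fail to initialize.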
Expected output (compare with a "normal" hardened runtime macOS app):