Quuxplusone / LLVMBugzillaTest

Crash when kernel debugging OS X after hitting breakpoint several times #49927

Open Quuxplusone opened 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link PR50958
Status NEW
Importance P normal
Reported by Dennis Christopher James (tobaljackson@gmail.com)
Reported on 2021-07-01 13:38:20 -0700
Last modified on 2021-07-09 12:51:30 -0700
Version 12.0
Hardware PC MacOS X
CC clayborg@gmail.com, jdevlieghere@apple.com, jmolenda@apple.com, llvm-bugs@lists.llvm.org
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
Hello,

I'm currently using lldb-1205.0.27.3 on host OS X 11.3.1 to kernel-debug an OS X
guest (version 11.4) running under VMware Fusion 12.1.2, and lldb reliably
crashes any time I hit a breakpoint more than ~15 times. The issue was equally
reproducible when the guest ran the same version (11.3.1) as the host; I
upgraded the guest to see whether that had any effect on the crash (it didn't).

I've reproduced the crash using both the gdb-stub facility provided by VMware
(gdb-remote 8864) and regular network-based debugging (lldb
-o "kdp-remote <ip address>").

Each time the crash occurs (after hitting a breakpoint more than ~15 times), the
backtrace looks similar to the one reproduced here:

----------------------------------------
<truncated>
(lldb) c
Process 1 resuming
Process 1 stopped
* thread #22, name = '0xffffff86986ec640', queue = 'cpu-1', stop reason = breakpoint 1.1
    frame #0: 0xffffff8020c814f4 kernel`mach_msg_trap(args=0xffffffa06e3fbf00) at mach_msg.c:725:16 [opt]
Target 0: (kernel) stopped.
(lldb) c
Process 1 resuming
(lldb) PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
0  lldb                     0x000000010a227de5 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 37
1  lldb                     0x000000010a2274e5 llvm::sys::RunSignalHandlers() + 85
2  lldb                     0x000000010a228646 SignalHandler(int) + 262
3  libsystem_platform.dylib 0x00007fff20451d7d _sigtramp + 29
4  libc++.1.dylib           0x00007fff203a3535 std::__1::recursive_mutex::unlock() + 9
5  LLDB                     0x000000010a718745 lldb_private::ThreadPlan::PlanExplainsStop(lldb_private::Event*) + 37
6  LLDB                     0x000000010a70e6bf lldb_private::Thread::ShouldStop(lldb_private::Event*) + 1151
7  LLDB                     0x000000010a716786 lldb_private::ThreadList::ShouldStop(lldb_private::Event*) + 822
8  LLDB                     0x000000010a6c36d4 lldb_private::Process::ShouldBroadcastEvent(lldb_private::Event*) + 436
9  LLDB                     0x000000010a6bfd49 lldb_private::Process::HandlePrivateEvent(std::__1::shared_ptr<lldb_private::Event>&) + 265
10 LLDB                     0x000000010a6c4518 lldb_private::Process::RunPrivateStateThread(bool) + 1496
11 LLDB                     0x000000010a6c3b05 lldb_private::Process::PrivateStateThread(void*) + 21
12 LLDB                     0x000000010a6048a7 lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*) + 103
13 libsystem_pthread.dylib  0x00007fff2040c954 _pthread_start + 224
14 libsystem_pthread.dylib  0x00007fff204084a7 thread_start + 15
[1]    84306 segmentation fault  lldb
----------------------------------------

Here I set a breakpoint on mach_msg_trap and hit 'c' (continue) about 15 times
until lldb crashed.
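
In case it helps anyone reproduce this without typing 'c' by hand, here is a rough sketch of a script that writes out an lldb command file for the steps above. The function name `make_repro_script` and the default of 20 continues are my own placeholders, not anything lldb provides; the connect command and the breakpoint symbol are the ones from this report. It assumes each `continue` in a sourced command file blocks until the next stop.

```python
import sys


def make_repro_script(connect_cmd, symbol="mach_msg_trap", continues=20):
    """Return the text of an lldb command file that connects to the target,
    sets one breakpoint by name, and then continues a fixed number of times.

    connect_cmd is passed through untouched, e.g. "gdb-remote 8864" or
    "kdp-remote <ip>" (hypothetical helper, for illustration only).
    """
    lines = [connect_cmd, "breakpoint set --name " + symbol]
    # One "continue" per expected breakpoint hit.
    lines += ["continue"] * continues
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write the command file to stdout so it can be redirected to a file.
    sys.stdout.write(make_repro_script("gdb-remote 8864"))
```

Something like `python3 make_repro.py > repro.lldb` followed by `lldb -s repro.lldb` should then replay the sequence, though I have only described the manual steps above, not tested this exact flow.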

Some additional information from connecting to the guest (after gdb-remote or
lldb -o "kdp-remote <ip>"):

================================================================================

WARNING: Python 2.7 is not recommended. Future versions of lldb will not
support Python 2.7.
(lldb) gdb-remote 8864
Kernel UUID: 52A1E876-863E-38E3-AC80-09BBAB13B752
Load Address: 0xffffff8020c10000
Loading kernel debugging from
/Library/Developer/KDKs/KDK_11.4_20F71.kdk/System/Library/Kernels/kernel.dSYM/Contents/Resources/Python/kernel.py
LLDB version lldb-1205.0.27.3
Apple Swift version 5.4 (swiftlang-1205.0.26.9 clang-1205.0.19.55)
settings set target.process.python-os-plugin-path
"/Library/Developer/KDKs/KDK_11.4_20F71.kdk/System/Library/Kernels/kernel.dSYM/Contents/Resources/Python/lldbmacros/core/operating_system.py"
Target arch: x86_64
Connected to live debugserver or arm core. Will associate on-core threads to
registers reported by server.
settings set target.trap-handler-names hndl_allintrs hndl_alltraps
trap_from_kernel hndl_double_fault hndl_machine_check _fleh_prefabt
_ExceptionVectorsBase _ExceptionVectorsTable _fleh_undef _fleh_dataabt
_fleh_irq _fleh_decirq _fleh_fiq_generic _fleh_dec
command script import
"/Library/Developer/KDKs/KDK_11.4_20F71.kdk/System/Library/Kernels/kernel.dSYM/Contents/Resources/Python/lldbmacros/xnu.py"
xnu debug macros loaded successfully. Run showlldbtypesummaries to enable type
summaries.
settings set target.process.optimization-warnings false

Kernel slid 0x20a10000 in memory.
Loaded kernel file
/Library/Developer/KDKs/KDK_11.4_20F71.kdk/System/Library/Kernels/kernel
Loading kernel debugging from
/Library/Developer/KDKs/KDK_11.4_20F71.kdk/System/Library/Kernels/kernel.dSYM/Contents/Resources/Python/kernel.py
LLDB version lldb-1205.0.27.3
Apple Swift version 5.4 (swiftlang-1205.0.26.9 clang-1205.0.19.55)
settings set target.process.python-os-plugin-path
"/Library/Developer/KDKs/KDK_11.4_20F71.kdk/System/Library/Kernels/kernel.dSYM/Contents/Resources/Python/lldbmacros/core/operating_system.py"
Target arch: x86_64
Connected to live debugserver or arm core. Will associate on-core threads to
registers reported by server.
settings set target.trap-handler-names hndl_allintrs hndl_alltraps
trap_from_kernel hndl_double_fault hndl_machine_check _fleh_prefabt
_ExceptionVectorsBase _ExceptionVectorsTable _fleh_undef _fleh_dataabt
_fleh_irq _fleh_decirq _fleh_fiq_generic _fleh_dec
command script import
"/Library/Developer/KDKs/KDK_11.4_20F71.kdk/System/Library/Kernels/kernel.dSYM/Contents/Resources/Python/lldbmacros/xnu.py"
xnu debug macros loaded successfully. Run showlldbtypesummaries to enable type
summaries.
settings set target.process.optimization-warnings false

Target arch: x86_64
Connected to live debugserver or arm core. Will associate on-core threads to
registers reported by server.
Loading 132 kext modules -----.-------.------....-------------.-------..----.---
----------------------.....--------------.---.-----.----.---.--.-------------
done.
Failed to load 111 of 132 kexts:
<truncated>

================================================================================

Please let me know if you'd like any additional information.

Thank you
Quuxplusone commented 3 years ago

I would guess there is memory corruption going on inside LLDB. Can you try running lldb with libgmalloc and see whether it crashes in a different location?

DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib /path/to/lldb

Quuxplusone commented 3 years ago
Hello Greg,
Thank you for the suggestion. I ran lldb as you indicated, connected to my
guest kernel, set the breakpoint on mach_msg_trap again, and continued until
the crash. Startup of lldb was much slower and included output from
"GuardMalloc", which is how I know the dylib got loaded.

The crash output looks very similar to what I posted previously, but I've
included it here in case there is some more information to be gleaned from this
new crash running with libgmalloc:

----------------------------------------
(lldb) PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
0  lldb                     0x0000000108e1cde5 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 37
1  lldb                     0x0000000108e1c4e5 llvm::sys::RunSignalHandlers() + 85
2  lldb                     0x0000000108e1d646 SignalHandler(int) + 262
3  libsystem_platform.dylib 0x00007fff2041dd7d _sigtramp + 29
4  libsystem_platform.dylib 0x000000056a0bfde0 _sigtramp + 18446603363228983424
5  LLDB                     0x000000010931c8d5 lldb_private::Thread::ShouldStop(lldb_private::Event*) + 1685
6  LLDB                     0x0000000109324786 lldb_private::ThreadList::ShouldStop(lldb_private::Event*) + 822
7  LLDB                     0x00000001092d190c lldb_private::Process::ShouldBroadcastEvent(lldb_private::Event*) + 1004
8  LLDB                     0x00000001092cdd49 lldb_private::Process::HandlePrivateEvent(std::__1::shared_ptr<lldb_private::Event>&) + 265
9  LLDB                     0x00000001092d2518 lldb_private::Process::RunPrivateStateThread(bool) + 1496
10 LLDB                     0x00000001092d1b05 lldb_private::Process::PrivateStateThread(void*) + 21
11 LLDB                     0x00000001092128a7 lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*) + 103
12 libsystem_pthread.dylib  0x00007fff203d8954 _pthread_start + 224
13 libsystem_pthread.dylib  0x00007fff203d44a7 thread_start + 15
[1]    11276 segmentation fault  DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib /usr/bin/lldb
----------------------------------------

Please let me know if there's anything more you think I should try or
information you'd like that could help resolve this bug.

Thank you
Quuxplusone commented 3 years ago

If this crashed in the same spot, it is likely not heap corruption! Thanks for helping narrow this down. My next suggestion would be to build LLDB with ASAN enabled and try running that build against your target.
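
To check how closely two crash backtraces match, a small script along these lines may be handy. It is only a sketch: `parse_frames` and `split_at_divergence` are names made up for illustration, the parsing assumes the "index, image, address, symbol + offset" layout printed by the LLVM signal handler, and it tolerates frames that were wrapped across two lines. The sample frames are copied from the two backtraces in this report.

```python
import re

# A frame starts with "<index> <image> 0x<address>"; the symbol and offset may
# follow on the same line or on a wrapped continuation line.
FRAME_START = re.compile(r"^\d+\s+\S+\s+0x")
FRAME_RE = re.compile(r"^(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(.+?)\s*\+\s*(\d+)")


def parse_frames(text):
    """Return [(symbol, offset), ...] from innermost to outermost frame."""
    joined = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if FRAME_START.match(line):
            joined.append(line)
        elif joined:
            # Continuation of a wrapped frame line.
            joined[-1] += " " + line
    frames = []
    for line in joined:
        m = FRAME_RE.match(line)
        if m:
            frames.append((m.group(4), int(m.group(5))))
    return frames


def split_at_divergence(a, b):
    """Walk both traces from the outermost frame inward and return
    (shared_symbols, remaining_a, remaining_b)."""
    ra = [sym for sym, _ in reversed(a)]
    rb = [sym for sym, _ in reversed(b)]
    n = 0
    while n < min(len(ra), len(rb)) and ra[n] == rb[n]:
        n += 1
    return ra[:n], ra[n:], rb[n:]


FIRST = """\
3  libsystem_platform.dylib 0x00007fff20451d7d _sigtramp + 29
4  libc++.1.dylib           0x00007fff203a3535
std::__1::recursive_mutex::unlock() + 9
5  LLDB                     0x000000010a718745
lldb_private::ThreadPlan::PlanExplainsStop(lldb_private::Event*) + 37
6  LLDB                     0x000000010a70e6bf
lldb_private::Thread::ShouldStop(lldb_private::Event*) + 1151
7  LLDB                     0x000000010a716786
lldb_private::ThreadList::ShouldStop(lldb_private::Event*) + 822
"""

SECOND = """\
3  libsystem_platform.dylib 0x00007fff2041dd7d _sigtramp + 29
4  libsystem_platform.dylib 0x000000056a0bfde0 _sigtramp + 18446603363228983424
5  LLDB                     0x000000010931c8d5
lldb_private::Thread::ShouldStop(lldb_private::Event*) + 1685
6  LLDB                     0x0000000109324786
lldb_private::ThreadList::ShouldStop(lldb_private::Event*) + 822
"""

if __name__ == "__main__":
    shared, rest_a, rest_b = split_at_divergence(parse_frames(FIRST), parse_frames(SECOND))
    print("shared outer frames:", shared)
    print("only in first trace:", rest_a)
    print("only in second trace:", rest_b)
```

Comparing symbol names only (and ignoring offsets) is a deliberate simplification; the offsets inside a shared frame can still differ between runs, so they are kept in the parsed tuples for anyone who wants a stricter comparison.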