unfriendly error message when debuggee does not start correctly

Quuxplusone commented 3 years ago


Bugzilla Link	PR48723
Status	NEW
Importance	P normal
Reported by	emaste@freebsd.org
Reported on	2021-01-11 12:45:36 -0800
Last modified on	2021-06-07 12:43:36 -0700
Version	unspecified
Hardware	PC All
CC	clayborg@gmail.com, jdevlieghere@apple.com, llvm-bugs@lists.llvm.org, sylvestre@debian.org
Fixed by commit(s)
Attachments	`packets.txt` (2595 bytes, text/plain)
Blocks
Blocked by
See also

On FreeBSD, and I assume other operating systems, there are cases where the
debugged application does not start correctly.

One recent case occurred on FreeBSD with W^X enabled (sysctl
kern.elf64.allow_wx=0) when attempting to execute a binary that requested an
executable stack. Because W^X disallows mappings with simultaneous W & X
protections, the initial stack cannot be created and the kernel image activator
reports SIGABRT from the target.

Similar cases occurred when attempting to execute a binary with a .text segment
larger than a kernel limit.

The failure reported by LLDB in this case is not particularly clear:

$ lldb ./exec-stack
(lldb) target create "./exec-stack"
Current executable set to
'/home/emaste/src/freebsd-git/main/exec-stack' (x86_64).
(lldb) run
error: 'A' packet returned an error: 8

The old in-process target plugin handled this even more poorly:

(lldb) run
Assertion failed: (WIFSTOPPED(status) && wpid == (::pid_t)pid &&
"Could not sync with inferior process."), function Launch, file
/usr/home/emaste/src/freebsd-git/head/contrib/llvm-project/lldb/source/Plugins/Process/FreeBSD/ProcessMonitor.cpp,
line 930.
PLEASE submit a bug report to https://bugs.freebsd.org/submit/ and
include the crash backtrace.
#0 0x0000000004444ade PrintStackTrace
/usr/home/emaste/src/freebsd-git/head/contrib/llvm-project/llvm/lib/Support/Unix/Signals.inc:564:13
#1 0x0000000004442e11 RunSignalHandlers
/usr/home/emaste/src/freebsd-git/head/contrib/llvm-project/llvm/lib/Support/Signals.cpp:69:18
#2 0x0000000004445335 SignalHandler
/usr/home/emaste/src/freebsd-git/head/contrib/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
#3 0x0000000805299b20 handle_signal
/usr/home/emaste/src/freebsd-git/head/lib/libthr/thread/thr_sig.c:0:3
Abort trap (core dumped)

Although this is a relatively minor issue, it would be nice to have an error
message along the lines of "Unable to start inferior process". Identifying the
root cause would likely still require finding a message in the kernel log.

(For reference, the FreeBSD code review that prompted this investigation:
https://reviews.freebsd.org/D28050)

Quuxplusone commented 3 years ago

So lldb-server has a way to return better error messages. If lldb-server is
being used in this case, LLDB will send a "QEnableErrorStrings" packet. Once
that is enabled, an error that used to be returned as "E08" ("E" = error, with
code = "08") can also carry an error message as hex-encoded ASCII, like
"E08;313233", where "313233" is the string "123".

The "A" packet seems to be returning only an error code with no error message.
If you can get the FreeBSD native code (NativeProcessFreeBSD.cpp or
NativeThreadFreeBSD.cpp) to detect this failure, you can set the error message
that is returned to LLDB, and it should be propagated.
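
To make the reply format concrete, here is a minimal Python sketch that splits an error reply payload into its code and optional message. The function name `parse_error_reply` is made up for illustration, and the two digits after "E" are treated as hexadecimal here, which is an assumption (it makes no difference for "08").

```python
def parse_error_reply(payload: str):
    """Split an 'Exx' or 'Exx;<hex message>' payload into (code, message)."""
    if not payload.startswith("E"):
        raise ValueError(f"not an error reply: {payload!r}")
    code_hex, _, message_hex = payload[1:].partition(";")
    code = int(code_hex, 16)  # assumption: the error code digits are hex
    # The optional part after ';' is the message as hex-encoded ASCII.
    message = bytes.fromhex(message_hex).decode("ascii") if message_hex else None
    return code, message


print(parse_error_reply("E08;313233"))  # (8, '123')
print(parse_error_reply("E08"))         # (8, None)
```

A bare "E08" decodes to a code with no message, which is the case LLDB has to report as "'A' packet returned an error: 8".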

The first step is to see what is currently happening. Can you enable GDB remote
packet logging with:

(lldb) log enable -f /tmp/packets.txt gdb-remote packets
(lldb) target create "./exec-stack"
(lldb) run

Then attach the resulting packet log so we can see what is currently happening?
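
Once the log exists, a small script can pick out the requests that were answered with an error. The sketch below is only illustrative: it assumes log lines shaped like the excerpts quoted in this thread (`lldb < N> send packet: $...#xx`), and the names `PACKET_LINE`, `ERROR_REPLY` and `failed_requests` are made up here. Real logs may carry extra fields such as timestamps.

```python
import re

# Assumed line shape, based on the excerpts quoted in this thread:
#   lldb             <   7> read packet: $E08#ad
PACKET_LINE = re.compile(
    r"<\s*\d+>\s+(send|read) packet:\s*\$(.*)#[0-9a-fA-F]{2}\s*$"
)
ERROR_REPLY = re.compile(r"E[0-9a-fA-F]{2}(;.*)?")


def failed_requests(lines):
    """Yield (request payload, error reply payload) pairs from a packet log."""
    last_sent = None
    for line in lines:
        match = PACKET_LINE.search(line)
        if match is None:
            continue  # not a packet line
        direction, payload = match.groups()
        if direction == "send":
            last_sent = payload
        elif ERROR_REPLY.fullmatch(payload):
            yield last_sent, payload


sample = [
    "lldb             <  23> send packet: $QEnableErrorStrings#8c",
    "lldb             <   6> read packet: $OK#9a",
    "lldb             < 106> send packet: $A96,0,2f7573722f686f6d652f656d617374652f7372632f667265656273642d6769742f686561642f657865632d737461636b#85",
    "lldb             <   7> read packet: $E08#ad",
]

for request, reply in failed_requests(sample):
    print(request[:6], "->", reply)  # A96,0, -> E08
```

To run it on the real log, pass the open file instead of `sample`, for example `failed_requests(open("/tmp/packets.txt"))`.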

Quuxplusone commented 3 years ago

Does FreeBSD debugging use lldb-server now? If so, can we remove the old native ProcessFreeBSD process plug-in?

Quuxplusone commented 3 years ago

Attached packets.txt (2595 bytes, text/plain): packet log

Quuxplusone commented 3 years ago

(In reply to Greg Clayton from comment #2)
> Does FreeBSD debugging use lldb-server now? If so, can we remove the old
> native ProcessFreeBSD process plug-in?

We now use lldb-server on FreeBSD/x86, but not yet on other CPU architectures.
Over the next few months the remaining architectures should be switched over,
and then we can remove the native process plug-in.

As an aside, looking at the packet log I see another TODO for FreeBSD -
disabling ASLR:

lldb             <  21> send packet: $QSetDisableASLR:1#ce
lldb             <   6> read packet: $OK#9a

I guess we're ignoring this at the moment.

Another TODO from looking at `ktrace -i lldb ...`:

 12735 lldb-server CALL  getrlimit(RLIMIT_NOFILE,0x7fffffe74d20)
 12735 lldb-server RET   getrlimit 0
 12735 lldb-server CALL  close(0x3)
 12735 lldb-server RET   close -1 errno 9 Bad file descriptor
 12735 lldb-server CALL  getrlimit(RLIMIT_NOFILE,0x7fffffe74d20)
 12735 lldb-server RET   getrlimit 0
 12735 lldb-server CALL  close(0x4)
 12735 lldb-server RET   close 0
 12735 lldb-server CALL  getrlimit(RLIMIT_NOFILE,0x7fffffe74d20)
 12735 lldb-server RET   getrlimit 0
 12735 lldb-server CALL  close(0x5)
 12735 lldb-server RET   close 0
 12735 lldb-server CALL  getrlimit(RLIMIT_NOFILE,0x7fffffe74d20)
 12735 lldb-server RET   getrlimit 0
 12735 lldb-server CALL  close(0x6)
 12735 lldb-server RET   close -1 errno 9 Bad file descriptor
...
 12735 lldb-server CALL  getrlimit(RLIMIT_NOFILE,0x7fffffe74d20)
 12735 lldb-server RET   getrlimit 0
 12735 lldb-server CALL  close(0x718f0)
 12735 lldb-server RET   close -1 errno 9 Bad file descriptor
 12735 lldb-server CALL  getrlimit(RLIMIT_NOFILE,0x7fffffe74d20)
 12735 lldb-server RET   getrlimit 0
 12735 lldb-server CALL  close(0x718f1)
 12735 lldb-server RET   close -1 errno 9 Bad file descriptor
 12735 lldb-server CALL  getrlimit(RLIMIT_NOFILE,0x7fffffe74d20)
 12735 lldb-server RET   getrlimit 0

We need to use closefrom or close_range instead of calling close() in a loop,
which takes about 1M syscalls to close fds
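
For scale: the last descriptor shown in the trace is 0x718f1 (465137), and each iteration issues a getrlimit plus a close, so the loop costs at least roughly 930,000 syscalls. The Python sketch below illustrates the idea of touching only descriptors that are actually open. It is not closefrom or close_range, but a substitute technique that lists `/dev/fd`, and it assumes `/dev/fd` reflects all of the process's descriptors, which may not hold on every system. The function names are made up for illustration.

```python
import os


def open_fds_from(lowfd, fd_dir="/dev/fd"):
    """Return the open descriptors >= lowfd, as listed in fd_dir."""
    return sorted(
        int(name)
        for name in os.listdir(fd_dir)
        if name.isdigit() and int(name) >= lowfd
    )


def close_from(lowfd, fd_dir="/dev/fd"):
    """Close every open descriptor >= lowfd, one close() per open fd."""
    for fd in open_fds_from(lowfd, fd_dir):
        try:
            os.close(fd)
        except OSError:
            # The descriptor os.listdir() used for the scan may appear in
            # the listing but is already closed by the time we get here.
            pass
```

In a launcher this would be called in the child between fork and exec, for example `close_from(3)`, so the cost scales with the number of open descriptors rather than with RLIMIT_NOFILE.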

Quuxplusone commented 3 years ago

... to close fds before execve

The failed execve looks like:

 12735 lldb-server CALL  ptrace(PT_TRACE_ME,0,0,0)
 12735 lldb-server RET   ptrace 0
 12735 lldb-server CALL  execve(0x80e1739c0,0x80e1c4060,0x80e21d000)
 12735 lldb-server NAMI  "/usr/home/emaste/src/freebsd-git/head/exec-stack"
 12734 lldb-server GIO   fd 6 read 0 bytes
       ""
 12734 lldb-server RET   read 0
 12734 lldb-server CALL  close(0x6)
 12734 lldb-server RET   close 0
 12734 lldb-server CALL  wait4(0x31bf,0x7fffffe75804,0,0)
 12734 lldb-server RET   wait4 12735/0x31bf
 12734 lldb-server CALL  write(0x7,0x7fffffe75861,0x7)
 12734 lldb-server GIO   fd 7 wrote 7 bytes
       "$E08#ad"
 12734 lldb-server RET   write 7

Quuxplusone commented 3 years ago

Looks like error strings are successfully enabled:

lldb             <  23> send packet: $QEnableErrorStrings#8c
lldb             <   6> read packet: $OK#9a

But there is no extra error info from the "A" packet which launches the
executable:

lldb             < 106> send packet: $A96,0,2f7573722f686f6d652f656d617374652f7372632f667265656273642d6769742f686561642f657865632d737461636b#85
lldb             <   7> read packet: $E08#ad
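
For anyone reading the log by hand, the following Python sketch unpacks the two packets above. It assumes the usual `$payload#checksum` framing, where the checksum is the modulo-256 sum of the payload bytes, and it treats the first "A" field as a decimal count of hex characters, which matches the 96 hex characters in this packet. The helper names are made up for illustration.

```python
def checksum(payload: str) -> str:
    """Modulo-256 sum of the payload bytes, as two lowercase hex digits."""
    return f"{sum(payload.encode('ascii')) % 256:02x}"


def unframe(packet: str) -> str:
    """Strip the '$...#xx' framing and verify the checksum."""
    if len(packet) < 4 or packet[0] != "$" or packet[-3] != "#":
        raise ValueError("malformed packet")
    payload, received = packet[1:-3], packet[-2:].lower()
    if checksum(payload) != received:
        raise ValueError("checksum mismatch")
    return payload


def decode_a_packet(payload: str):
    """Decode 'A<arglen>,<argnum>,<hex arg>[,...]' into a list of arguments."""
    fields = payload[1:].split(",")
    args = []
    for i in range(0, len(fields), 3):
        arglen, hexarg = int(fields[i]), fields[i + 2]
        if arglen != len(hexarg):  # assumption: arglen counts hex characters
            raise ValueError("argument length mismatch")
        args.append(bytes.fromhex(hexarg).decode("ascii"))
    return args


launch = ("$A96,0,2f7573722f686f6d652f656d617374652f7372632f"
          "667265656273642d6769742f686561642f657865632d737461636b#85")
print(decode_a_packet(unframe(launch)))
# ['/usr/home/emaste/src/freebsd-git/head/exec-stack']
print(unframe("$E08#ad"))  # E08
```

So the request carries a single argument, the executable path, and the reply is the bare error code with no message attached.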

Quuxplusone commented 3 years ago

So it should be an easy fix if you can make lldb-server figure out what went wrong and add the error message to the reply.

Quuxplusone commented 3 years ago

I am experiencing this with lldb 13 on Linux (Debian unstable) in a chroot
(/proc & /dev/shm are mounted):

$ /usr/bin/clang++-13 -g -o foo a.cpp
$ /usr/bin/lldb-13 -s cmd.in foo
(lldb) target create "/check/build/tests/Output/basic_lldb2.cpp.tmp"
Current executable set to '/check/build/tests/Output/basic_lldb2.cpp.tmp'
(x86_64).
(lldb) command source -s 0 '/check/tests/basic_lldb2.in'
Executing commands in '/check/tests/basic_lldb2.in'.
(lldb) b main
Breakpoint 1: where = basic_lldb2.cpp.tmp`main + 16 at basic_lldb2.cpp:7:21,
address = 0x00000000004011c0
(lldb) r
error: 'A' packet returned an error: 8

Mounting /dev/pts fixed the issue (mount --bind /dev/pts
/srv/chroot/stretch/dev/pts).

Quuxplusone / LLVMBugzillaTest

unfriendly error message when debuggee does not start correctly #47692