llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.84k stars 11.47k forks source link

sanitizer stack trace pc off by 1 #23974

Open llvmbot opened 9 years ago

llvmbot commented 9 years ago
Bugzilla Link 23600
Version 3.6
OS Linux
Attachments Proposed patch.
Reporter LLVM Bugzilla Contributor
CC @kcc,@vitalybuka

Extended Description

(This was originally reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65749)

The PC printed in sanitizer stack traces is consistently off by 1. Either off by 1 byte, or by 1 instruction, depending on the target. Below is a test case and the ASan backtrace for it on x86_64, followed by a backtrace printed by GDB along with the disassembly for reference. As the disassembly shows, the ASan PCs point at the last byte of the previous instruction.

On RISC targets such as powerpc64 the ASan backtrace is off by 1 instruction. Rather than pointing at either the faulting instruction in the active frame in case of a trap or at the next instruction to be executed as GDB does, ASan points at the the instruction just before it.

$ cat -n asan.c && /build/llvm-3.6.0-install/bin/clang -O2 -fasynchronous-unwind-tables -fno-omit-frame-pointer -fsanitize=address -g asan.c && ASAN_SYMBOLIZER_PATH=/build/llvm-3.6.0-install/bin/llvm-symbolizer ./a.out || gdb -batch -q -ex r -ex bt a.out -ex "disas foo" -ex "disas bar" -ex "disas main" 1 void attribute ((weak)) foo (int p) { p = 0; } 2 void attribute__ ((weak)) bar (int *p) { foo (p); foo (p); } 3 int main (void) { bar (0); } ASAN:SIGSEGV

==20556==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x0000004cdae5 bp 0x7fffe0befe80 sp 0x7fffe0befe80 T0)

​0 0x4cdae4 in foo /build/tmp/asan.c:1:42

#​1 0x4cdb1d in bar /build/tmp/asan.c:2:44
#​2 0x4cdaba in main /build/tmp/asan.c:3:19
#​3 0x7f91b65dffdf in __libc_start_main (/lib64/libc.so.6+0x1ffdf)
#​4 0x416cd6 in _start (/home/msebor/build/tmp/a.out+0x416cd6)

AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV /build/tmp/asan.c:1 foo ==20556==ABORTING

Program received signal SIGSEGV, Segmentation fault. 0x00000000004cdae5 in foo (p=0x0) at asan.c:1 1 void __attribute ((weak)) foo (int p) { p = 0; }

​0 0x00000000004cdae5 in foo (p=0x0) at asan.c:1

​1 0x00000000004cdb1e in bar (p=0x0) at asan.c:2

​2 0x00000000004cdabb in main () at asan.c:3

Dump of assembler code for function foo: 0x00000000004cdad0 <+0>: push %rbp 0x00000000004cdad1 <+1>: mov %rsp,%rbp 0x00000000004cdad4 <+4>: mov %rdi,%rax 0x00000000004cdad7 <+7>: shr $0x3,%rax 0x00000000004cdadb <+11>: mov 0x7fff8000(%rax),%al 0x00000000004cdae1 <+17>: test %al,%al 0x00000000004cdae3 <+19>: jne 0x4cdaed <foo+29> => 0x00000000004cdae5 <+21>: movl $0x0,(%rdi) 0x00000000004cdaeb <+27>: pop %rbp 0x00000000004cdaec <+28>: retq
0x00000000004cdaed <+29>: mov %edi,%ecx 0x00000000004cdaef <+31>: and $0x7,%ecx 0x00000000004cdaf2 <+34>: add $0x3,%ecx 0x00000000004cdaf5 <+37>: movsbl %al,%eax 0x00000000004cdaf8 <+40>: cmp %eax,%ecx 0x00000000004cdafa <+42>: jl 0x4cdae5 <foo+21> 0x00000000004cdafc <+44>: callq 0x4b1bc0 <asan::asan_report_store4(__sanitizer::uptr)> End of assembler dump. Dump of assembler code for function bar: 0x00000000004cdb10 <+0>: push %rbp 0x00000000004cdb11 <+1>: mov %rsp,%rbp 0x00000000004cdb14 <+4>: push %rbx 0x00000000004cdb15 <+5>: push %rax 0x00000000004cdb16 <+6>: mov %rdi,%rbx 0x00000000004cdb19 <+9>: callq 0x4cdad0 0x00000000004cdb1e <+14>: mov %rbx,%rdi 0x00000000004cdb21 <+17>: add $0x8,%rsp 0x00000000004cdb25 <+21>: pop %rbx 0x00000000004cdb26 <+22>: pop %rbp 0x00000000004cdb27 <+23>: jmpq 0x4cdad0 End of assembler dump. Dump of assembler code for function main: 0x00000000004cdab0 <+0>: push %rbp 0x00000000004cdab1 <+1>: mov %rsp,%rbp 0x00000000004cdab4 <+4>: xor %edi,%edi 0x00000000004cdab6 <+6>: callq 0x4cdb10 0x00000000004cdabb <+11>: xor %eax,%eax 0x00000000004cdabd <+13>: pop %rbp 0x00000000004cdabe <+14>: retq
End of assembler dump.

vitalybuka commented 6 years ago

Do we still want this?

kcc commented 9 years ago

I created http://reviews.llvm.org/D10065 and added llvm-commits to Subscribers.

I've commented on the patch... By I wonder if we could do simpler, by applying GetNextInstructionPc to the pc of the signal frame? (I have not tried it, just guessing)

llvmbot commented 9 years ago

I created http://reviews.llvm.org/D10065 and added llvm-commits to Subscribers.

kcc commented 9 years ago

Created attachment 14394 [details] Updated patch with a test.

The attached patch adds a test (accidentally left out of the previous patch).

Martin, we appreciate that you have prepared the patch! But please, send it to llvm-commits, preferably via http://llvm.org/docs/Phabricator.html

llvmbot commented 9 years ago

Updated patch with a test. The attached patch adds a test (accidentally left out of the previous patch).

llvmbot commented 9 years ago

Updated patch. Attached is an updated patch with the suggested changes. Tested on powerpc64, powerpc64le, and x86_64.

Using the unmodified PC value for the active stack frame is important when looking up the source line number that corresponds to the faulting instruction. When looking up line numbers for PCs in subsequent frames the PC value from the stack needs to be adjusted to point at the branch and link or call instruction as opposed to the instruction just after it so that the lookup finds the source line with the function call expression as users are expected to see when printing a stack trace in a debugger and not the next line as might otherwise be the case.

The above is only necessary for the line number lookup and not for the PC values printed by the sanitizer, which strictly speaking don't have to be the same as those printed by GDB. But the PC values should be valid and point at the first byte of either the branch or call instruction or the one just past it, and not into the middle of one. That said, printing PC values that are different from those printed by GDB would be inconsistent with the values obtained by the program by calling the __builtin_return_address intrinsic.

In addition to making sure the correct line number is looked up when the stack trace is printed in response to a signal, this patch also adjusts the code to print PC values that are consistent with GDB and the __builtin_return_address intrinsic.

llvmbot commented 9 years ago

Storing PC values that match those printed by GDB is necessary in order to find the correct line numbers in DWARF line programs. Without the change some of the line numbers are off. I don't know if this is an issue when using ASan with Clang but it is detected by some GCC tests for it. I mention this in comment 7 on http://gcc.gnu.org/bugzilla/show_bug.cgi?id=65479.

kcc commented 9 years ago

I agree that we probably should not call GetPreviousInstructionPc() on the top stack frame entry,

When the top frame comes from asanreport* we still have to apply GetPreviousInstructionPc because asanreport is a call.

But for signals we should not apply it.

though, and make sure we point to the exact problematic instruction.

llvmbot commented 9 years ago

That's correct, ASan stack traces deliberately print a PC corresponding to call instruction, not the return address. I don't see why the behavior should be different for signals, or why we should match GDB.

I agree that we probably should not call GetPreviousInstructionPc() on the top stack frame entry, though, and make sure we point to the exact problematic instruction.

kcc commented 9 years ago

Ah, I missed your comment with the patch.

kcc commented 9 years ago

Yea... I guess that's because we apply StackTrace::GetPreviousInstructionPc too aggressively.

While we'd like to fix this eventually, it's unlikely we'll get to it soon enough. Would you like to contribute a fix (with tests)?