CTSRD-CHERI / cheribsd

FreeBSD adapted for CHERI-RISC-V and Arm Morello.
http://cheribsd.org

backtrace CHERI tag violation #962

Open nwf opened 3 years ago

nwf commented 3 years ago

snmalloc now compiles on CHERI again, but I am doing something wrong in startup. On its way out, it tries to generate a stack trace by calling backtrace_symbols_fd, which in turn dies a CHERI death:

Trapframe Register Dump:
$0: 0                  at: 0x1                v0: 0x41c8a100         v1: 0xfffffffffffffffc
a0: 0x41c8a100         a1: 0x100              a2: 0x3                a3: 0
a4: 0                  a5: 0x2                a6: 0x390              a7: 0x390
t0: 0x28               t1: 0x7000000d         t2: 0x22               t3: 0x22
s0: 0x5                s1: 0x5                s2: 0x2                s3: 0x25
s4: 0                  s5: 0x1                s6: 0                  s7: 0
t8: 0                  t9: 0                  k0: 0                  k1: 0
gp: 0                  sp: 0                  s8: 0                  ra: 0
status: 0x408084b3 mullo: 0xf0; mulhi: 0; badvaddr: 0x40b7b560
cause: 0x48; pc: 0x40b93a74
BadInstr: 0xd8c10008 clc        c6,zero,128(c1)
CHERI cause: ExcCode: 0x02 RegNum: $c01 (tag violation)
$ddc: 0x0000000000000000
$c01: 0x0000000041c8a100 [rwRW,0x0000000041c8a200-0x0000000041c8a2f0] (invalid)
$c02: 0x0000000041c8a3a0 [rwRW,0x0000000041c8a300-0x0000000041c8a3f0]
$c03: 0x0000000041c82080 [rwRW,0x0000000041c82000-0x0000000041c820d0]
$c04: 0x0000000041c8a300 [rwRW,0x0000000041c8a300-0x0000000041c8a3f0]
$c05: 0x0000000041c8a101 [rwRW,0x0000000041c8a200-0x0000000041c8a2f0] (invalid)
$c06: 0x0000000000000001 [rwRW,0xffffffffffffe200-0xffffffffffffe2f0] (invalid)
$c07: 0x0000000041c8a100 [rwRW,0x0000000041c8a100-0x0000000041c8a1f0]
$c08: 0x0000007ffffd9750 [rwRW,0x0000007ffffd9750-0x0000007ffffd9760]
$c09: 0x0000007ffffd9740 [rwRW,0x0000007ffffd9740-0x0000007ffffd9750]
$c10: 0x0000007ffffcdb80 [rwRW,0x0000007ffffcdb80-0x0000007ffffcdb90]
$c11: 0x0000007ffffda440 [rwRW,0x0000007ffbfe0000-0x0000007ffffe0000]
$c12: 0x0000000040b93a30 [rxR,0x0000000040b78000-0x0000000040bc9a00] (sentry)
$c13: 0x0000000000000000
$c14: 0x0000007ffffcdb50 [rwRW,0x0000007ffffcdb50-0x0000007ffffcdb60]
$c15: 0x0000007ffffcdb40 [rwRW,0x0000007ffffcdb40-0x0000007ffffcdb50]
$c16: 0x0000007ffffcdb30 [rwRW,0x0000007ffffcdb30-0x0000007ffffcdb40]
$c17: 0x0000000040b94470 [rxR,0x0000000040b78000-0x0000000040bc9a00] (sentry)
$c18: 0x0000000041c8a500 [rwRW,0x0000000041c8a500-0x0000000041c8a5f0]
$c19: 0x0000000041c82000 [rwRW,0x0000000041c82000-0x0000000041c820d0]
$c20: 0x0000000040bc8700 [rxR,0x0000000040b78000-0x0000000040bc9a00]
$c21: 0x0000000041c8a400 [rwRW,0x0000000041c8a400-0x0000000041c8a4f0]
$c22: 0x0000000040bc8700 [rxR,0x0000000040b78000-0x0000000040bc9a00]
$c23: 0x0000007ffffdfca0 [rwRW,0x0000007ffffdfca0-0x0000007ffffdfcb0]
$c24: 0x0000007ffffda440 [rwRW,0x0000007ffbfe0000-0x0000007ffffe0000]
$c25: 0x0000000000000000
$c26: 0x0000000040726910 [rR,0x0000000040726900-0x000000004073bd80]
$c27: 0x0000000040ba4638 [rxR,0x0000000040b78000-0x0000000040bc9a00]
$c28: 0x0000000000000000
$c29: 0x0000000000000000
$c30: 0x0000000000000000
$c31: 0x0000000000000000
$pcc: 0x0000000040b93a74 [rxR,0x0000000040b78000-0x0000000040bc9a00]
Feb 22 14:24:18 cheribsd-mips64-hybrid kernel: USER_CHERI_EXCEPTION: pid 723 tid 100053 (func-pagemap-1), uid 0: CP2 fault (type 0x32)
Feb 22 14:24:18 cheribsd-mips64-hybrid kernel: Process arguments: /mnt/snmalloc-mips64-purecap-build/func-pagemap-1

Program received signal SIGPROT, CHERI protection violation
Capability tag faultwarning: GDB can't find the start of the function at 0x40b93a74.

    GDB is unable to find the start of the function at 0x40b93a74
and thus can't determine the size of that function's stack frame.
This means that GDB may be unable to access that stack frame, or
the frames below it.
    This problem is most likely caused by an invalid program counter or
stack pointer.
    However, if you think GDB should simply search farther back
from 0x40b93a74 for code which looks like the beginning of a
function, you can increase the range of the search using the `set
heuristic-fence-post' command.
 caused by register c1: 0x0000000041c8a100 [rwRW,0x41c8a200-0x41c8a2f0].
0x0000000040b93a74 in ?? ()
Reading symbols from /usr/libcheri/libexecinfo.so.1...
Reading symbols from /usr/libcheri/libc++.so.1...
Reading symbols from /usr/libcheri/libcxxrt.so.1...
Reading symbols from /usr/libcheri/libm.so.5...
Reading symbols from /usr/libcheri/libthr.so.3...
Reading symbols from /usr/libcheri/libc.so.7...
Reading symbols from /usr/libcheri/libelf.so.2...
Reading symbols from /usr/libcheri/libgcc_s.so.1...
Reading symbols from /libexec/ld-cheri-elf.so.1...

Thread 1 (LWP 100053 of process 723):
#0  0x0000000040b93a74 in scntree_RB_INSERT_COLOR (head=0x41c82080 [rwRW,0x41c82000-0x41c820d0], elm=0x41c8a300 [rwRW,0x41c8a300-0x41c8a3f0]) at /cheri/source/mainline/cheribsd/contrib/elftoolchain/libelf/elf_scn.c:52
#1  0x0000000040b94470 in scntree_RB_INSERT (head=0x41c82080 [rwRW,0x41c82000-0x41c820d0], elm=0x41c8a300 [rwRW,0x41c8a300-0x41c8a3f0]) at /cheri/source/mainline/cheribsd/contrib/elftoolchain/libelf/elf_scn.c:52
#2  0x0000000040b9abb8 in _libelf_allocate_scn (e=0x41c82000 [rwRW,0x41c82000-0x41c820d0], ndx=5) at /cheri/source/mainline/cheribsd/contrib/elftoolchain/libelf/libelf_allocate.c:162
#3  0x0000000040b94d44 in _libelf_load_section_headers (e=0x41c82000 [rwRW,0x41c82000-0x41c820d0], ehdr=<optimized out>) at /cheri/source/mainline/cheribsd/contrib/elftoolchain/libelf/elf_scn.c:121
#4  0x0000000040b95288 in elf_getscn (e=0x41c82000 [rwRW,0x41c82000-0x41c820d0], index=1) at /cheri/source/mainline/cheribsd/contrib/elftoolchain/libelf/elf_scn.c:162
#5  elf_nextscn (e=0x41c82000 [rwRW,0x41c82000-0x41c820d0], s=<optimized out>) at /cheri/source/mainline/cheribsd/contrib/elftoolchain/libelf/elf_scn.c:253
#6  0x00000000401e31dc in symtab_create (fd=3, bind=-1, type=2) at /cheri/source/mainline/cheribsd/contrib/libexecinfo/symtab.c:116
#7  0x00000000401e2860 in backtrace_symbols_fmt (trace=0x7ffffdade0 [rwRW,0x7ffffdade0-0x7ffffdede0], len=9, fmt=0x401d1f24 [rR,0x401d1f24-0x401d1f34] "%a <%n%D> at %f") at /cheri/source/mainline/cheribsd/contrib/libexecinfo/backtrace.c:197
#8  0x00000000401e2e44 in backtrace_symbols_fd_fmt (trace=0x41c82080 [rwRW,0x41c82000-0x41c820d0], len=9, fd=1, fmt=0x41c8a300 [rwRW,0x41c8a300-0x41c8a3f0] "") at /cheri/source/mainline/cheribsd/contrib/libexecinfo/backtrace.c:238
#9  backtrace_symbols_fd (trace=0x41c82080 [rwRW,0x41c82000-0x41c820d0], len=9, fd=1) at /cheri/source/mainline/cheribsd/contrib/libexecinfo/backtrace.c:259
[... snip snmalloc internals probably not germane...]

ETA: Forgot to say, this is on 68bb19024845

arichardson commented 3 years ago

I haven't run the libunwind tests on RISC-V for a while. It's possible that they are currently broken.

arichardson commented 3 years ago

Ah, never mind: I see the crash is inside libelf, which is trying to read the symbol table. I wouldn't be surprised if libelf has lots of memory safety issues.

nwf commented 3 years ago

Time has marched on, but sadly this bug is still here. In my latest effort to CHERIfy snmalloc on RISC-V, it now presents as:

Program received signal SIGPROT, CHERI protection violation
Capability sealed fault caused by register ca1.
0x0000000040174e38 in symtab_find (st=0x40a67030 [rwRW,0x40a67030-0x40a67060], p=0x11c0aa <snmalloc::PALPOSIX<snmalloc::PALFreeBSD>::print_stack_trace()+92> [rxR,0x100000-0x13dd00] (sentry), dli=0x3fffdfb1a0 [rwRW,0x3fffdfb1a0-0x3fffdfb1e0]) at /cheri/source/mainline/cheribsd/contrib/libexecinfo/symtab.c:188
188     /cheri/source/mainline/cheribsd/contrib/libexecinfo/symtab.c: No such file or directory.
(gdb) bt
#0  0x0000000040174e38 in symtab_find (st=0x40a67030 [rwRW,0x40a67030-0x40a67060], p=0x11c0aa <snmalloc::PALPOSIX<snmalloc::PALFreeBSD>::print_stack_trace()+92> [rxR,0x100000-0x13dd00] (sentry), dli=0x3fffdfb1a0 [rwRW,0x3fffdfb1a0-0x3fffdfb1e0]) at /cheri/source/mainline/cheribsd/contrib/libexecinfo/symtab.c:188
#1  0x00000000401745ba in format_address (st=0x40a67030 [rwRW,0x40a67030-0x40a67060], buf=0x3fffdfad80 [,0xfdec000000000000-0xffffffffffffffff], bufsiz=0x3fffdfad78 [,0xfdec000000000000-0xffffffffffffffff], offs=80, fmt=0x40172e20 [rR,0x40172e20-0x40172e30] "%a <%n%D> at %f", 
    addr=0x11c0aa <snmalloc::PALPOSIX<snmalloc::PALFreeBSD>::print_stack_trace()+92> [rxR,0x100000-0x13dd00] (sentry)) at /cheri/source/mainline/cheribsd/contrib/libexecinfo/backtrace.c:175
#2  backtrace_symbols_fmt (trace=<optimized out>, len=5, fmt=<optimized out>) at /cheri/source/mainline/cheribsd/contrib/libexecinfo/backtrace.c:211
#3  0x00000000401748c0 in backtrace_symbols_fd_fmt (len=<optimized out>, fd=<optimized out>, fmt=0x3fffdfb1a0 [rwRW,0x3fffdfb1a0-0x3fffdfb1e0] " \377\337\277?", trace=<optimized out>) at /cheri/source/mainline/cheribsd/contrib/libexecinfo/backtrace.c:238
#4  backtrace_symbols_fd (trace=<optimized out>, len=<optimized out>, fd=<optimized out>) at /cheri/source/mainline/cheribsd/contrib/libexecinfo/backtrace.c:259
#5  0x000000000011c0e0 in snmalloc::PALPOSIX<snmalloc::PALFreeBSD>::print_stack_trace () at /cheri/source/mainline/snmalloc2/src/ds/../pal/pal_posix.h:154
#6  0x000000000011c046 in snmalloc::PALPOSIX<snmalloc::PALFreeBSD>::error (str=0x3fffdff4e0 [rwRW,0x3fffdff4e0-0x3fffdff8e0] "memcpy with destination out of bounds of heap allocation: 0x40e04080 is in allocation 0x40e04080--0x40e040a0, offset 0x21 is past the end.\n")
    at /cheri/source/mainline/snmalloc2/src/ds/../pal/pal_posix.h:166
#7  0x000000000011bb28 in (anonymous namespace)::crashWithMessage (p=0x40e04080 [rwRW,0x40e04080-0x40e040a0], len=33, msg=0x10d1de [rR,0x10d1de-0x10d217] "memcpy with destination out of bounds of heap allocation", alloc=...) at /cheri/source/mainline/snmalloc2/src/override/memcpy.cc:120
#8  0x000000000011a938 in (anonymous namespace)::check_bounds<false> (ptr=0x40e04080 [rwRW,0x40e04080-0x40e040a0], len=33, msg=0x10d1de [rR,0x10d1de-0x10d217] "memcpy with destination out of bounds of heap allocation") at /cheri/source/mainline/snmalloc2/src/override/memcpy.cc:151
#9  my_memcpy (dst=0x40e04080 [rwRW,0x40e04080-0x40e040a0], src=0x40e04060 [rwRW,0x40e04060-0x40e04080], len=33) at /cheri/source/mainline/snmalloc2/src/override/memcpy.cc:218
#10 0x000000000011b1d0 in check_bounds (size=32, out_of_bounds=1) at /cheri/source/mainline/snmalloc2/src/test/func/memcpy/func-memcpy.cc:121
#11 0x000000000011b370 in main () at /cheri/source/mainline/snmalloc2/src/test/func/memcpy/func-memcpy.cc:150
(gdb) i r
ra             0x401745ba       1075267002
sp             0x3fffdfacf0     274875788528
gp             0x0      0
tp             0x40a10050       1084293200
t0             0xfff2   65522
t1             0x4017501c       1075269660
t2             0x40acc000       1085063168
fp             0x0      0
s1             0x3fffdfad78     274875788664
a0             0x40a67030       1084649520
a1             0x11c0aa 1163434
a2             0x3fffdfb1a0     274875789728
a3             0x184    388
a4             0x0      0
a5             0x11c04e 1163342
a6             0x100000 1048576
a7             0x1c0aa  114858
s2             0x63     99
s3             0xffffffffffffffff       -1
s4             0x40172e20       1075260960
s5             0x0      0
s6             0x50     80
s7             0x3fffdfb1a0     274875789728
s8             0x11c0aa 1163434
s9             0x3fffdfad80     274875788672
s10            0x40a67030       1084649520
s11            0x25     37
t3             0x40174e14       1075269140
t4             0xa      10
t5             0x1000   4096
t6             0x1      1
pc             0x40174e38       1075269176
cnull          0x0      0x0
cra            0xf11720000801880600000000401745ba       0x401745ba <backtrace_symbols_fmt+378> [rxR,0x40172000-0x40178000] (sentry)
csp            0xd17d000003ff2ffe0000003fffdfacf0       0x3fffdfacf0 [rwRW,0x3fbfe00000-0x3fffe00000]
cgp            0x0      0x0
ctp            0xd17d0000015d800e0000000040a10050       0x40a10050 [rwRW,0x40a10020-0x40a155c0]
ct0            0xfff2   0xfff2
ct1            0xf117200008018806000000004017501c       0x4017501c [rxR,0x40172000-0x40178000] (sentry)
ct2            0xd17d0000030599830000000040acc000       0x40acc000 [rwRW,0x40acc000-0x40b60800]
cfp            0x0      0x0
cs1            0xd17d00000761ad7c0000003fffdfad78       0x3fffdfad78 [rwRW,0x3fffdfad78-0x3fffdfad80]
ca0            0xd17d00000419b0340000000040a67030       0x40a67030 [rwRW,0x40a67030-0x40a67060]
ca1            0xd11720000bbb8001000000000011c0aa       0x11c0aa <snmalloc::PALPOSIX<snmalloc::PALFreeBSD>::print_stack_trace()+92> [rxR,0x100000-0x13dd00] (sentry)
ca2            0xd17d00000479b1a40000003fffdfb1a0       0x3fffdfb1a0 [rwRW,0x3fffdfb1a0-0x3fffdfb1e0]
ca3            0x184    0x184
ca4            0x0      0x0
ca5            0x3bb8001000000000011c04e        0x11c04e <snmalloc::PALPOSIX<snmalloc::PALFreeBSD>::print_stack_trace()> [,0x100000-0x13dd00]
ca6            0x3bb80010000000000100000        0x100000 [,0x100000-0x13dd00]
ca7            0x1c0aa  0x1c0aa
cs2            0x63     0x63
cs3            0xffffffffffffffff       0xffffffffffffffff
cs4            0xf1152000078dae240000000040172e20       0x40172e20 [rR,0x40172e20-0x40172e30]
cs5            0x0      0x0
cs6            0x50     0x50
cs7            0xd17d00000479b1a40000003fffdfb1a0       0x3fffdfb1a0 [rwRW,0x3fffdfb1a0-0x3fffdfb1e0]
cs8            0xd11720000bbb8001000000000011c0aa       0x11c0aa <snmalloc::PALPOSIX<snmalloc::PALFreeBSD>::print_stack_trace()+92> [rxR,0x100000-0x13dd00] (sentry)
cs9            0xd17d00000765ad840000003fffdfad80       0x3fffdfad80 [rwRW,0x3fffdfad80-0x3fffdfad90]
cs10           0xd17d00000419b0340000000040a67030       0x40a67030 [rwRW,0x40a67030-0x40a67060]
cs11           0x25     0x25
ct3            0xf1172000080188060000000040174e14       0x40174e14 <symtab_find> [rxR,0x40172000-0x40178000] (sentry)
ct4            0xa      0xa
ct5            0x1000   0x1000
ct6            0x1      0x1
pcc            0xf1172000000188060000000040174e38       0x40174e38 <symtab_find+36> [rxR,0x40172000-0x40178000]
ddc            0x0      0x0
cap_valid      0xb3bc90eb       3015479531
(gdb) x/i $pcc
=> 0x40174e38 <symtab_find+36>:     csetaddr        ct3,ca1,a7

That is, uintptr_t me = (uintptr_t)p - fbase does not go well when p is a sentry: the subtraction on a uintptr_t compiles to a csetaddr on the sealed capability in ca1, which traps. I suspect that all of fbase, dd, sd, me, and ad should be ptraddr_t rather than uintptr_t, since that routine appears to be doing only address arithmetic and never dereferences the results as pointers/capabilities.
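To illustrate the suggested change, here is a minimal sketch of address-only arithmetic. It is not the libexecinfo code: offset_from_base and addr_t are hypothetical names, and the sketch assumes the compiler defines __PTRADDR_TYPE__ on CHERI targets, falling back to uintptr_t elsewhere so that it also builds on non-CHERI systems.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * addr_t is a hypothetical alias used only in this sketch: a plain integer
 * address type (ptraddr_t) where the compiler provides one, and uintptr_t
 * otherwise. Assumption: CHERI compilers define __PTRADDR_TYPE__.
 */
#ifdef __PTRADDR_TYPE__
typedef __PTRADDR_TYPE__ addr_t;
#else
typedef uintptr_t addr_t;
#endif

/*
 * Return how far p lies past fbase, as a plain integer. The cast extracts
 * only the address of p, so the subtraction happens on integers and no new
 * capability is derived from p (which may be sealed, e.g. a sentry).
 */
static addr_t
offset_from_base(const void *p, addr_t fbase)
{
	assert(p != NULL);
	return ((addr_t)p - fbase);
}
```

With the values from the trace above, p = 0x11c0aa and fbase = 0x100000 give 0x1c0aa, which matches the a7 operand of the faulting csetaddr.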

jrtc27 commented 3 years ago

Ah yeah, this is a repeat of something similar I saw in LLVM's stack trace implementation.