Closed rwatson closed 8 years ago
(Identical crashes occur on both Qemu-CHERI and a recent 256-bit bitfile, so this is not obviously an emulation/ISA-level problem.)
Sadly, or perhaps happily, shifting to -O1:
diff --git a/share/mk/bsd.cheri.mk b/share/mk/bsd.cheri.mk
index 08589f47..ae2eef1 100644
--- a/share/mk/bsd.cheri.mk
+++ b/share/mk/bsd.cheri.mk
@@ -31,7 +31,7 @@ OBJCOPY:= elfcopy
_CHERI_CC+= -mabi=sandbox -mxgot
LIBDIR:= /usr/libcheri
ROOTOBJDIR= ${.OBJDIR:S,${.CURDIR},,}${SRCTOP}/worldcheri${SRCTOP}
-CFLAGS+= -O2 -ftls-model=local-exec
+CFLAGS+= -O1 -ftls-model=local-exec
.if ${MK_CHERI_LINKER} == "yes"
_CHERI_CC+= -cheri-linker
CFLAGS+= -Wno-error
keeps what appears to be the same crash:
root@:~ # file /bin/cat
CHERI cause: ExcCode: 0x01 RegNum: $c18 (length violation)
$c00: v:1 s:0 p:7fff807d b:0000000000000000 l:0000010000000000 o:0 t:0
$c01: v:0 s:0 p:00000000 b:0000000000000000 l:0000000000000000 o:0 t:0
$c02: v:1 s:0 p:7fff807d b:0000000000000000 l:0000010000000000 o:12015a580 t:0
$c03: v:1 s:0 p:7fff807d b:000000012015a580 l:0000000000000160 o:0 t:0
$c04: v:1 s:0 p:7fff807d b:0000000000c00000 l:0000000000a00000 o:0 t:0
$c05: v:1 s:0 p:7fff807d b:0000000000000000 l:0000010000000000 o:41e000 t:0
$c06: v:1 s:0 p:7fff807d b:0000007fffffcb7c l:0000000000000001 o:0 t:0
$c07: v:1 s:0 p:7fff807d b:0000007fffffd91c l:0000000000000004 o:0 t:0
$c08: v:1 s:0 p:7fff807d b:0000007fffffd918 l:0000000000000004 o:0 t:0
$c09: v:1 s:0 p:7fff807d b:0000007fffffd79c l:0000000000000004 o:0 t:0
$c10: v:0 s:0 p:00000000 b:0000000000000000 l:0000000000000000 o:0 t:0
$c11: v:1 s:0 p:7fff807d b:0000007fff7ff000 l:00000000007ff3a0 o:0 t:0
$c12: v:1 s:0 p:7fff8017 b:0000000000000000 l:0000010000000000 o:1200b6bc8 t:0
$c13: v:1 s:0 p:00008055 b:0000007fffffd220 l:0000000000000040 o:0 t:0
$c14: v:0 s:0 p:00000000 b:0000000000000000 l:0000000000000000 o:0 t:0
$c15: v:0 s:0 p:00000000 b:0000000000000000 l:0000000000000000 o:0 t:0
$c16: v:0 s:0 p:00000000 b:0000000000000000 l:0000000000000000 o:0 t:0
$c17: v:1 s:0 p:7fff8017 b:0000000000000000 l:0000010000000000 o:1200b7240 t:0
$c18: v:1 s:0 p:7fff807d b:000000012015a580 l:0000000000000160 o:0 t:0
$c19: v:1 s:0 p:7fff807d b:0000000000c00000 l:0000000000a00000 o:0 t:0
$c20: v:1 s:0 p:7fff807d b:0000000000000000 l:0000010000000000 o:41e000 t:0
$c21: v:1 s:0 p:7fff807d b:0000000000000000 l:0000010000000000 o:0 t:0
$c22: v:1 s:0 p:7fff807d b:0000000000a186c0 l:0000000000140000 o:0 t:0
$c23: v:1 s:0 p:7fff807d b:0000007fffffed05 l:0000000000000009 o:0 t:0
$c24: v:1 s:0 p:7fff807d b:000000000043f000 l:00000000000001c0 o:0 t:0
$c26: v:1 s:0 p:7fff807d b:0000000000000000 l:0000010000000000 o:0 t:0
$c31: v:1 s:0 p:7fff8017 b:0000000000000000 l:0000010000000000 o:1200b7254 t:0
Jul 18 12:26:43 kernel: USER_CHERI_EXCEPTION: pid 546 tid 100046 (file), uid 0: CP2 fault (type 0x32)
Jul 18 12:26:43 kernel: Trapframe Register Dump:
Jul 18 12:26:43 kernel: zero: 0 at: 0x3ffffffffffffffc v0: 0x3ffffffffffffffc v1: 0x2222222222222222
Jul 18 12:26:43 kernel: a0: 0 a1: 0xa a2: 0x64 a3: 0x1
Jul 18 12:26:43 kernel: a4: 0x101010101010101 a5: 0x8000000000000000 a6: 0x3f38302820181008 a6: 0x23
Jul 18 12:26:43 kernel: t0: 0x1b t1: 0 t2: 0 t3: 0
Jul 18 12:26:43 kernel: t8: 0xa t9: 0x1200b71f0 s0: 0x1 s1: 0x120149050
Jul 18 12:26:43 kernel: s2: 0x120149050 s3: 0x3 s4: 0 s5: 0
Jul 18 12:26:43 kernel: s6: 0xc8 s7: 0x1200e9078 k0: 0 k1: 0
Jul 18 12:26:43 kernel: gp: 0x120149050 sp: 0x7fe660 s8: 0x7fe660 ra: 0x8
Jul 18 12:26:43 kernel: sr: 0x408084b3 mullo: 0x4038302820181008 mulhi: 0x8101820283038 badvaddr: 0x1200b7254
Jul 18 12:26:43 kernel: cause: 0x48 pc: 0x1200b7254
Signal 34 (core dumped)
Although the code is compiled pretty differently -- perhaps more readably:
00000001200b71f0 <__je_rtree_start_level>:
#endif
#if (defined(JEMALLOC_ENABLE_INLINE) || defined(JEMALLOC_RTREE_C_))
JEMALLOC_INLINE unsigned
rtree_start_level(rtree_t *rtree, uintptr_t key)
{
1200b71f0: 67bdffa0 daddiu sp,sp,-96
1200b71f4: ebcbe85b csd s8,sp,88(c11)
1200b71f8: fa4be802 csc c18,sp,32(c11)
1200b71fc: fa2be800 csc c17,sp,0(c11)
1200b7200: 03a0f025 move s8,sp
1200b7204: 49b96002 cgetoffset t9,c12
1200b7208: 3c010009 lui at,0x9
1200b720c: 0039102d daddu v0,at,t9
1200b7210: 48810007 cfromptr c1,c0,zero
unsigned start_level;
if (unlikely(key == 0))
1200b7214: 49a10801 csetoffset c1,c1,zero
1200b7218: 49c10901 cne at,c1,c4
1200b721c: 10200014 beqz at,1200b7270 <__je_rtree_start_level+0x80>
1200b7220: 49b21800 cmove c18,c3
1200b7224: 64411e60 daddiu at,v0,7776
return (rtree->height - 1);
start_level = rtree->start_level[lg_floor(key) >>
1200b7228: 3c020000 lui v0,0x0
1200b722c: 0041082d daddu at,v0,at
1200b7230: dc21b7e8 ld at,-18456(at)
1200b7234: 480c09ff cgetpccsetoffset c12,at
1200b7238: 48f16000 cjalr c12,c17
1200b723c: 49a42002 cgetoffset a0,c4
1200b7240: 000208ba dsrl at,v0,0x2
1200b7244: 64020001 daddiu v0,zero,1
1200b7248: 000217bc dsll32 v0,v0,0x1e
1200b724c: 6442fffc daddiu v0,v0,-4
1200b7250: 00220824 and at,at,v0
1200b7254: c852088e clw v0,at,68(c18)
LG_RTREE_BITS_PER_LEVEL];
assert(start_level < rtree->height);
return (start_level);
}
1200b7258: 03c0e825 move sp,s8
1200b725c: da2be800 clc c17,sp,0(c11)
1200b7260: da4be802 clc c18,sp,32(c11)
1200b7264: cbcbe85b cld s8,sp,88(c11)
1200b7268: 49008800 cjr c17
1200b726c: 67bd0060 daddiu sp,sp,96
rtree_start_level(rtree_t *rtree, uintptr_t key)
{
unsigned start_level;
if (unlikely(key == 0))
return (rtree->height - 1);
1200b7270: c8320086 clw at,zero,64(c18)
1200b7274: 0802dc96 j 1200b7258 <__je_rtree_start_level+0x68>
1200b7278: 2422ffff addiu v0,at,-1
1200b727c: 00000000 nop
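The instructions between the return from lg_floor and the faulting clw in the listing above can be modelled in C to see how $at is derived. This is a reading of the disassembly rather than jemalloc source: the function name is made up for illustration, and the idea that the sequence implements "(ret >> 4) * 4" (a shift by 4 followed by scaling to a 4-byte element) is inferred from the instructions, not stated anywhere in the trace.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the -O1 sequence at 1200b7240..1200b7250.
 * 'ret' stands for the value left in $v0 by the call to lg_floor.
 */
static uint64_t
start_level_byte_offset(uint64_t ret)
{
	uint64_t at = ret >> 2;		/* dsrl   at,v0,0x2 */
	uint64_t mask = 1;		/* daddiu v0,zero,1 */

	mask <<= 32 + 30;		/* dsll32 v0,v0,0x1e (shift by 62) */
	mask -= 4;			/* daddiu v0,v0,-4 -> 0x3ffffffffffffffc */
	at &= mask;			/* and    at,at,v0 */

	/* Assumption: this is the index (ret >> 4) scaled by 4 bytes. */
	assert(at == ((ret >> 4) << 2));
	return (at);
}
```

Under this model a return value of 32 gives a byte offset of 8, whereas the 0x3ffffffffffffffc seen in $at requires bits 4 to 63 of $v0 to all be set on return, for example an all-ones value. The sketch does not say how such a value came about; a trace is still needed for that.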
That's good news. The optimisations that run at -O1 are not as complex (and, most happily, don't include GVN).
Do you have a trace of this that shows what lg_floor returns?
I believe that this might not actually be a compiler bug. It calls lg_floor(key), where key is a uintptr_t. This function attempts to count the leading zeros, but doesn't appear to have a case to deal with capabilities. If we ran the upstream autoconf stuff then I think that we'd see a compile failure, because the macros for defining the size of a pointer would be set to something different from the ones that define the size of an integer or a long.
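As a minimal sketch of the alternative, a floor-log2 can take an explicitly 64-bit integer instead of anything pointer-sized, so that its result does not depend on a pointer-size macro. The name lg_floor_u64 is hypothetical, and this is not jemalloc's implementation; it only illustrates the operation that lg_floor is expected to perform on an address.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not jemalloc's lg_floor()): floor(log2(x)) for an
 * explicitly 64-bit integer, independent of the width of a pointer or
 * uintptr_t on the target.
 */
static unsigned
lg_floor_u64(uint64_t x)
{
	unsigned r = 0;

	assert(x != 0);		/* log2(0) is undefined */
	while (x >>= 1)		/* count how many times x can be halved */
		r++;
	return (r);
}
```

With this sketch an address-sized value such as 0x120149050 yields 32. A caller would still have to decide which integer to extract from a capability before calling it, which the sketch deliberately leaves open.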
It looks like lg_floor() is being called with an implicit cast of key to size_t, which (assuming c4 is the pre-lg_floor() value) should trigger an assert(), so I'll need to look at a trace. I'll do that after I investigate a newly arisen crash in kyua.
LG_SIZEOF_PTR is a bit of an oddity in the jemalloc code. Except for the place where it is (rather pointlessly) used to define SIZEOF_PTR, it's used as something that would be better spelt LG_RANGEOF_PTR.
root@beri1:~ # ~ctsrd/file ~ctsrd/file
/home/ctsrd/file: ELF 64-bit MSB executable, MIPS with CHERI (unofficial), version 1 (FreeBSD), statically linked, for FreeBSD 11.0 (1100097), not stripped
In trying to run a statically linked CheriABI file(1):
I encounter the following crash in jemalloc:
A load is attempted relative to a capability with a length of 0x160, but the requested register offset from that is 0x3ffffffffffffffc.
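The length violation can be illustrated with a simplified model of the bounds check on a capability-relative load. This is a sketch under stated assumptions, not the architectural definition: access_in_bounds is a made-up name, the capability's own offset is taken to be 0 (as the register dump shows for $c18), and the access is assumed to be a 4-byte word load.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: is a 'size'-byte access at byte 'offset' from the
 * capability's base within a capability of 'length' bytes?
 * Written so that offset + size cannot wrap around.
 */
static bool
access_in_bounds(uint64_t length, uint64_t offset, uint64_t size)
{
	assert(size != 0);	/* a load always touches at least one byte */
	return (offset <= length && size <= length - offset);
}
```

With a length of 0x160, a word load at offset 64 passes this check, while one at 0x3ffffffffffffffc plus any small displacement (68 in the -O1 listing) fails it by a wide margin, which matches the reported exception.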
From Qemu-CHERI's tracing facility:
$a1, loaded via $gp, is 0x120164380, and points here (in BSS or similar):
The derivation of $at is a bit painful.
Some disassembled code around this trace:
Unfortunately I am unable to compile libc at -O0 due to a linker failure, so cannot easily compare results there:
(Also potentially of interest to @brooksdavis.)