dvyukov closed this 7 years ago
Hi -
On 2017-10-16 at 10:09 Dmitry Vyukov wrote:
I am getting the following crashes. Is it a known issue? If not and you don't see why it happens right away, I can try to create a reproducer.
I don't recognize this one. It looks like there are a couple issues.
I'd need to see a little of the ASM for the GP faults to know why userspace is faulting.
The kernel panic is also interesting. My guess is a refcounting problem. I might be able to figure it out from a backtrace ('bt' from the monitor) and the value of *page->pg_tree_slot. But a reproducer might be needed.
Here is the backtrace:
Stack Backtrace on Core 0:
#01 [<0xffffffffc2016024>] in mon_backtrace
#02 [<0xffffffffc2017127>] in monitor
#03 [<0xffffffffc200cbea>] in _panic
#04 [<0xffffffffc2044c9b>] in pm_load_page
#05 [<0xffffffffc205bba1>] in generic_file_read
#06 [<0xffffffffc2006d75>] in is_valid_elf
#07 [<0xffffffffc205602d>] in sys_exec
#08 [<0xffffffffc2056499>] in syscall
#09 [<0xffffffffc2056654>] in run_local_syscall
#10 [<0xffffffffc20a231a>] in sysenter_callwrapper
There is no source/line info in obj/kern/akaros-kernel, so I can't map this to lines.
I failed to create a C reproducer. If I am reading this correctly, sys_exec is the exec system call. The fuzzer itself does not call exec, so I wonder what calls it. This probably explains why I can't create a standalone repro. Can you see from the crash message which process caused the panic?
bash-4.3$
bash-4.3$ Unhandled user trap in vcore context from VC 5
HW TRAP frame (partial) at 0xffffffffc82cc720 on core 5
rax 0x0000100000011740
rbx 0x000030000005cec0
rcx 0x0000000000000001
rdx 0x0000100000011740
rbp 0x000030000005cea0
rsi 0x000010000000f720
rdi 0x000010000000f720
r8 0x0000000000000000
r9 0x0000000000000000
r10 0x000030000005cec0
r11 0x0000000000000200
r12 0x0000000000000001
r13 0x0000000000000005
r14 0x00000000004095d0
r15 0x0000000000000000
trap 0x0000000e Page Fault
gsbs 0x0000000000000000
fsbs 0x0000000000000000
err 0x--------00000006
rip 0x0000000000400fff
cs 0x------------0023
flag 0x0000000000010a86
rsp 0x000030000005ce88
ss 0x------------001b
err 0x6 (for PFs: User 4, Wr 2, Rd 1), aux 0x00002fffeb89ce88
Addr 0x0000000000400fff is in syz-executor at offset 0x0000000000000fff
VM Regions for proc 44
NR: Range: Prot, Flags, File, Off
00: (0x0000000000400000 - 0x00000000004b2000): 0x00000005, 0x00000001, 0xffff800100c86420, 0x0000000000000000
01: (0x00000000004b2000 - 0x00000000004b3000): 0x00000005, 0x00000002, 0xffff800100c86420, 0x00000000000b2000
02: (0x00000000006b3000 - 0x00000000006b6000): 0x00000003, 0x00000002, 0xffff800100c86420, 0x00000000000b3000
03: (0x00000000006b6000 - 0x0000000000925000): 0x00000003, 0x00000002, 0x0000000000000000, 0x0000000000000000
04: (0x0000100000000000 - 0x0000100000024000): 0x00000007, 0x00000022, 0x0000000000000000, 0x0000000000000000
05: (0x0000300000000000 - 0x0000300000001000): 0x00000003, 0x00000002, 0xffff800100c86420, 0x0000000000000000
06: (0x0000300000001000 - 0x0000300000005000): 0x00000003, 0x00000022, 0x0000000000000000, 0x0000000000000000
07: (0x0000300000005000 - 0x0000300000007000): 0x00000007, 0x00000022, 0x0000000000000000, 0x0000000000000000
08: (0x0000300000007000 - 0x0000300000031000): 0x00000003, 0x00000022, 0x0000000000000000, 0x0000000000000000
09: (0x0000300000031000 - 0x000030000005d000): 0x00000007, 0x00000022, 0x0000000000000000, 0x0000000000000000
10: (0x00007f7fff8ff000 - 0x00007f7fff9ff000): 0x00000003, 0x00000022, 0x0000000000000000, 0x0000000000000000
Backtrace of user context on Core 5:
Offsets only matter for shared libraries
#01 Addr 0x0000000000400fff is in syz-executor at offset 0x0000000000000fff
#02 Addr 0x0000000000410444 is in syz-executor at offset 0x0000000000010444
#03 Addr 0x000000000040900c is in syz-executor at offset 0x000000000000900c
#04 Addr 0x0000000000415709 is in syz-executor at offset 0x0000000000015709
#05 Addr 0x0000000000401756 is in syz-executor at offset 0x0000000000001756
#06 Addr 0x0000100000011740's VMR has no file
Unhandled user trap in vcore context from VC 3
HW TRAP frame (partial) at 0xffffffffc82cc4a0 on core 4
rax 0x0000100000005d00
rbx 0x00007f7fff9feb80
rcx 0x0000000000000002
rdx 0x0000100000005d00
rbp 0x00007f7fff9feb60
rsi 0x000010000000bfa0
rdi 0x000010000000bfa0
r8 0x0000000000000000
r9 0x0000000000000000
r10 0x00007f7fff9feb80
r11 0x0000000000000200
r12 0x0000000000000001
r13 0x0000000000000003
r14 0x00000000004097d0
r15 0x0000015f265bbbf3
trap 0x0000000e Page Fault
gsbs 0x0000000000000000
fsbs 0x0000000000000000
err 0x--------00000006
rip 0x0000000000400fff
cs 0x------------0023
flag 0x0000000000010202
rsp 0x00007f7fff9feb48
ss 0x------------001b
err 0x6 (for PFs: User 4, Wr 2, Rd 1), aux 0x00007f7feb23eb48
Addr 0x0000000000400fff is in syz-executor at offset 0x0000000000000fff
VM Regions for proc 44
NR: Range: Prot, Flags, File, Off
00: (0x0000000000400000 - 0x00000000004b2000): 0x00000005, 0x00000001, 0xffff800100c86420, 0x0000000000000000
01: (0x00000000004b2000 - 0x00000000004b3000): 0x00000005, 0x00000002, 0xffff800100c86420, 0x00000000000b2000
02: (0x00000000006b3000 - 0x00000000006b6000): 0x00000003, 0x00000002, 0xffff800100c86420, 0x00000000000b3000
03: (0x00000000006b6000 - 0x0000000000925000): 0x00000003, 0x00000002, 0x0000000000000000, 0x0000000000000000
04: (0x0000100000000000 - 0x0000100000024000): 0x00000007, 0x00000022, 0x0000000000000000, 0x0000000000000000
05: (0x0000300000000000 - 0x0000300000001000): 0x00000003, 0x00000002, 0xffff800100c86420, 0x0000000000000000
06: (0x0000300000001000 - 0x0000300000005000): 0x00000003, 0x00000022, 0x0000000000000000, 0x0000000000000000
07: (0x0000300000005000 - 0x0000300000007000): 0x00000007, 0x00000022, 0x0000000000000000, 0x0000000000000000
08: (0x0000300000007000 - 0x0000300000031000): 0x00000003, 0x00000022, 0x0000000000000000, 0x0000000000000000
09: (0x0000300000031000 - 0x000030000005d000): 0x00000007, 0x00000022, 0x0000000000000000, 0x0000000000000000
10: (0x00007f7fff8ff000 - 0x00007f7fff9ff000): 0x00000003, 0x00000022, 0x0000000000000000, 0x0000000000000000
Backtrace of user context on Core 4:
Offsets only matter for shared libraries
#01 Addr 0x0000000000400fff is in syz-executor at offset 0x0000000000000fff
#02 Addr 0x0000000000410444 is in syz-executor at offset 0x0000000000010444
#03 Addr 0x0000000000409182 is in syz-executor at offset 0x0000000000009182
#04 Addr 0x0000000000415a01 is in syz-executor at offset 0x0000000000015a01
#05 Addr 0x0000000000401846 is in syz-executor at offset 0x0000000000001846
Unhandled user trap in vcore context from VC 0
HW TRAP frame (partial) at 0xffffffffc82cbaa0 on core 0
rax 0x0000100000005d00
rbx 0x00007f7fff9feaf0
rcx 0x000000000043699e
rdx 0x0000100000005d00
rbp 0x00007f7fff9fead0
rsi 0x00001000000046c0
rdi 0x00001000000046c0
r8 0x0000000000000000
r9 0x0000000000000000
r10 0x00007f7fff9feaf0
r11 0x0000000000000200
r12 0x0000000000000001
r13 0x0000000000000000
r14 0x00000000004154b0
r15 0x0000000000000000
trap 0x0000000e Page Fault
gsbs 0x0000000000000000
fsbs 0x0000000000000000
err 0x--------00000006
rip 0x0000000000400fff
cs 0x------------0023
flag 0x0000000000010202
rsp 0x00007f7fff9feab8
ss 0x------------001b
err 0x6 (for PFs: User 4, Wr 2, Rd 1), aux 0x00007f7feb23eab8
Addr 0x0000000000400fff is in syz-executor at offset 0x0000000000000fff
VM Regions for proc 30
NR: Range: Prot, Flags, File, Off
00: (0x0000000000400000 - 0x00000000004b2000): 0x00000005, 0x00000001, 0xffff800100c86420, 0x0000000000000000
01: (0x00000000004b2000 - 0x00000000004b3000): 0x00000005, 0x00000002, 0xffff800100c86420, 0x00000000000b2000
02: (0x00000000006b3000 - 0x00000000006b6000): 0x00000003, 0x00000002, 0xffff800100c86420, 0x00000000000b3000
03: (0x00000000006b6000 - 0x0000000000925000): 0x00000003, 0x00000002, 0x0000000000000000, 0x0000000000000000
04: (0x0000100000000000 - 0x0000100000024000): 0x00000007, 0x00000022, 0x0000000000000000, 0x0000000000000000
05: (0x0000300000000000 - 0x0000300000001000): 0x00000003, 0x00000002, 0xffff800100c86420, 0x0000000000000000
06: (0x0000300000001000 - 0x0000300000005000): 0x00000003, 0x00000022, 0x0000000000000000, 0x0000000000000000
07: (0x0000300000005000 - 0x0000300000007000): 0x00000007, 0x00000022, 0x0000000000000000, 0x0000000000000000
08: (0x0000300000007000 - 0x0000300000019000): 0x00000003, 0x00000022, 0x0000000000000000, 0x0000000000000000
09: (0x00007f7fff8ff000 - 0x00007f7fff9ff000): 0x00000003, 0x00000022, 0x0000000000000000, 0x0000000000000000
Backtrace of user context on Core 0:
Offsets only matter for shared libraries
#01 Addr 0x0000000000400fff is in syz-executor at offset 0x0000000000000fff
#02 Addr 0x0000000000410444 is in syz-executor at offset 0x0000000000010444
#03 Addr 0x00000000004369b9 is in syz-executor at offset 0x00000000000369b9
#04 Addr 0x0000000000435ea6 is in syz-executor at offset 0x0000000000035ea6
#05 Addr 0x00000000004019c9 is in syz-executor at offset 0x00000000000019c9
#06 Addr 0x0000000000000000 has no VMR
kernel panic at kern/src/pagemap.c:222, from core 0: assertion failed: page && pm_slot_check_refcnt(*page->pg_tree_slot)
Entering Nanwan's Dungeon on Core 0 (Ints on):
Type 'help' for a list of commands.
ROS(Core 0)>
ROS(Core 0)>
ROS(Core 0)> bt
Stack Backtrace on Core 0:
#01 [<0xffffffffc2016024>] in mon_backtrace
#02 [<0xffffffffc2017127>] in monitor
#03 [<0xffffffffc200cbea>] in _panic
#04 [<0xffffffffc2044c9b>] in pm_load_page
#05 [<0xffffffffc205bba1>] in generic_file_read
#06 [<0xffffffffc2006d75>] in is_valid_elf
#07 [<0xffffffffc205602d>] in sys_exec
#08 [<0xffffffffc2056499>] in syscall
#09 [<0xffffffffc2056654>] in run_local_syscall
#10 [<0xffffffffc20a231a>] in sysenter_callwrapper
ROS(Core 0)> ps
PID Name State Parent
-------------------------------------------------
13 /bin/ipconfig WAITING 0
47 sh RUNNING_S 46
22 dropbear WAITING 1
45 dropbear WAITING 22
46 sh WAITING 45
23 bash WAITING 1
1 bash WAITING 0
19 /bin/cs WAITING 0
This can be reproduced by running the whole fuzzer, though. Repro steps, assuming a linux/amd64 host: download the Go 1.9.1 toolchain from https://golang.org/dl/, unpack it to some local dir, and set GOROOT to that dir.
$ go get github.com/google/syzkaller
$ cd ~/go/src/github.com/google/syzkaller
$ git checkout 8793f74c6cb46d87b53758c6d99705b8018ceeba # current HEAD
$ make stress
$ make executor TARGETOS=akaros SOURCEDIR=/bootstrapped/akaros/checkout
# assuming you have ssh with a key setup
$ scp -P 5555 -i akaros_id_rsa -o IdentitiesOnly=yes ./bin/akaros_amd64/syz-executor root@localhost:/
$ bin/linux_amd64/syz-stress -os=akaros -arch=amd64 -timeout=5s -executor "/usr/bin/ssh -p 5555 -i akaros_id_rsa -o IdentitiesOnly=yes root@localhost /syz-executor"
This crashes the kernel in under 10 seconds for me.
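For anyone reading the trap frames above, here is a minimal Python sketch of how the err and aux fields can be checked against the VM region list. The flag values come from the log's own legend (User 4, Wr 2, Rd 1) and the ranges are copied from "VM Regions for proc 44". Treating aux as the faulting address and the region ends as exclusive are assumptions, and decode_pf_err and find_vmr are hypothetical helper names, not Akaros functions.

```python
# Flag values copied from the log line "err 0x6 (for PFs: User 4, Wr 2, Rd 1)".
PF_FLAGS = {"User": 0x4, "Wr": 0x2, "Rd": 0x1}

# (start, end) pairs copied from "VM Regions for proc 44".
# Assumption: the end address is exclusive, since adjacent regions share it.
VMRS_PROC_44 = [
    (0x0000000000400000, 0x00000000004b2000),
    (0x00000000004b2000, 0x00000000004b3000),
    (0x00000000006b3000, 0x00000000006b6000),
    (0x00000000006b6000, 0x0000000000925000),
    (0x0000100000000000, 0x0000100000024000),
    (0x0000300000000000, 0x0000300000001000),
    (0x0000300000001000, 0x0000300000005000),
    (0x0000300000005000, 0x0000300000007000),
    (0x0000300000007000, 0x0000300000031000),
    (0x0000300000031000, 0x000030000005d000),
    (0x00007f7fff8ff000, 0x00007f7fff9ff000),
]


def decode_pf_err(err):
    """Return the names of the legend flags that are set in err."""
    return [name for name, bit in PF_FLAGS.items() if err & bit]


def find_vmr(addr, vmrs):
    """Return the index of the region containing addr, or None."""
    for idx, (start, end) in enumerate(vmrs):
        if start <= addr < end:
            return idx
    return None


if __name__ == "__main__":
    print(decode_pf_err(0x6))
    print(find_vmr(0x00002fffeb89ce88, VMRS_PROC_44))
```

With the values from the first dump, the error code decodes to a user-mode write, rip is inside region 00 and rsp is inside region 09, but the aux address lands in the gap between regions 04 and 05 rather than in any listed region.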
On 2017-10-16 at 18:09 Dmitry Vyukov notifications@github.com wrote:
I failed to create a C reproducer. If I am reading this correctly, sys_exec is the exec system call. The fuzzer itself does not call exec, so I wonder what calls it. This probably explains why I can't create a standalone repro. Can you see from the crash message which process caused the panic?
It looks like sh, since it was the process running at the time. Do you have a bash script of some sort running to drive syz-executor?
As far as the backtrace goes, you can use:
addr2line -e obj/kern/akaros-kernel-64
Though in this case, it won't help much - I can see the codepath regardless of line numbers. It looks like we're just failing to read a file in generic_file_read().
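For future backtraces, a small Python sketch along these lines could pull the addresses out of a pasted 'bt' and build the matching addr2line command line. It only constructs the argument list and does not run anything; the regex assumes the "#NN [<0x...>] in name" layout shown above, and parse_backtrace and addr2line_argv are hypothetical helper names. The -f flag asks addr2line to print function names as well.

```python
import re

# Matches monitor backtrace lines such as:
#   #04 [<0xffffffffc2044c9b>] in pm_load_page
BT_LINE = re.compile(r"#(\d+) \[<(0x[0-9a-f]+)>\] in (\w+)")

SAMPLE_BT = """\
#04 [<0xffffffffc2044c9b>] in pm_load_page
#05 [<0xffffffffc205bba1>] in generic_file_read
"""


def parse_backtrace(text):
    """Return (frame number, address, symbol) tuples found in text."""
    return [
        (int(m.group(1)), int(m.group(2), 16), m.group(3))
        for m in BT_LINE.finditer(text)
    ]


def addr2line_argv(kernel_path, frames):
    """Build an addr2line argument list for the given frames."""
    return ["addr2line", "-f", "-e", kernel_path] + [
        hex(addr) for _, addr, _ in frames
    ]


if __name__ == "__main__":
    frames = parse_backtrace(SAMPLE_BT)
    print(" ".join(addr2line_argv("obj/kern/akaros-kernel-64", frames)))
```

The resulting list can be passed to subprocess.run on a machine that has the kernel image built with debug info.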
On 2017-10-16 at 18:17 Dmitry Vyukov notifications@github.com wrote:
This can be reproduced by running the whole fuzzer, though.
Thanks, I'll take a look.
What's strange is that all crashes mention RIP=0x0000000000400fff, but that address does not point to the start of any instruction in the binary. It falls inside init_cacheinfo, which is not my function; it gets called from somewhere in pthread. It is also the last byte of the first page of the text section (which is being paged in for the first time?). Maybe it rings a bell for you.
Disassembly of section .text:
0000000000400d90 <init_cacheinfo>:
400d90: 55 push %rbp
400d91: 48 89 e5 mov %rsp,%rbp
400d94: 41 56 push %r14
400d96: 41 55 push %r13
400d98: 41 54 push %r12
400d9a: 45 31 e4 xor %r12d,%r12d
400d9d: 53 push %rbx
400d9e: 44 89 e0 mov %r12d,%eax
400da1: 0f a2 cpuid
400da3: 81 f9 6e 74 65 6c cmp $0x6c65746e,%ecx
400da9: 41 89 c4 mov %eax,%r12d
400dac: 40 0f 94 c6 sete %sil
400db0: 81 fb 47 65 6e 75 cmp $0x756e6547,%ebx
400db6: 0f 94 c0 sete %al
400db9: 40 84 c6 test %al,%sil
400dbc: 74 08 je 400dc6 <init_cacheinfo+0x36>
400dbe: 81 fa 69 6e 65 49 cmp $0x49656e69,%edx
400dc4: 74 2b je 400df1 <init_cacheinfo+0x61>
400dc6: 81 f9 63 41 4d 44 cmp $0x444d4163,%ecx
400dcc: 0f 94 c1 sete %cl
400dcf: 81 fb 41 75 74 68 cmp $0x68747541,%ebx
400dd5: 0f 94 c0 sete %al
400dd8: 84 c1 test %al,%cl
400dda: 74 0c je 400de8 <init_cacheinfo+0x58>
400ddc: 81 fa 65 6e 74 69 cmp $0x69746e65,%edx
400de2: 0f 84 0e 01 00 00 je 400ef6 <init_cacheinfo+0x166>
400de8: 5b pop %rbx
400de9: 41 5c pop %r12
400deb: 41 5d pop %r13
400ded: 41 5e pop %r14
400def: 5d pop %rbp
400df0: c3 retq
400df1: 44 89 e6 mov %r12d,%esi
400df4: bf bc 00 00 00 mov $0xbc,%edi
400df9: e8 d2 39 03 00 callq 4347d0 <handle_intel>
400dfe: bf c2 00 00 00 mov $0xc2,%edi
400e03: 44 89 e6 mov %r12d,%esi
400e06: 49 89 c6 mov %rax,%r14
400e09: e8 c2 39 03 00 callq 4347d0 <handle_intel>
400e0e: 48 85 c0 test %rax,%rax
400e11: 49 89 c5 mov %rax,%r13
400e14: bf 03 00 00 00 mov $0x3,%edi
400e19: 0f 8e f7 01 00 00 jle 401016 <init_cacheinfo+0x286>
400e1f: b8 01 00 00 00 mov $0x1,%eax
400e24: 0f a2 cpuid
400e26: 81 e1 00 02 00 00 and $0x200,%ecx
400e2c: 89 de mov %ebx,%esi
400e2e: 83 f9 01 cmp $0x1,%ecx
400e31: 19 c0 sbb %eax,%eax
400e33: 83 c0 03 add $0x3,%eax
400e36: 41 83 fc 03 cmp $0x3,%r12d
400e3a: 89 05 c8 34 52 00 mov %eax,0x5234c8(%rip) # 924308 <__x86_preferred_memory_instruction>
400e40: 7e 2a jle 400e6c <init_cacheinfo+0xdc>
400e42: 31 c9 xor %ecx,%ecx
400e44: 41 b8 04 00 00 00 mov $0x4,%r8d
400e4a: eb 13 jmp 400e5f <init_cacheinfo+0xcf>
400e4c: 89 c2 mov %eax,%edx
400e4e: 44 89 c9 mov %r9d,%ecx
400e51: c1 ea 05 shr $0x5,%edx
400e54: 83 e2 07 and $0x7,%edx
400e57: 39 fa cmp %edi,%edx
400e59: 0f 84 5e 01 00 00 je 400fbd <init_cacheinfo+0x22d>
400e5f: 44 8d 49 01 lea 0x1(%rcx),%r9d
400e63: 44 89 c0 mov %r8d,%eax
400e66: 0f a2 cpuid
400e68: a8 1f test $0x1f,%al
400e6a: 75 e0 jne 400e4c <init_cacheinfo+0xbc>
400e6c: c1 ee 10 shr $0x10,%esi
400e6f: 40 0f b6 f6 movzbl %sil,%esi
400e73: 85 f6 test %esi,%esi
400e75: 74 10 je 400e87 <init_cacheinfo+0xf7>
400e77: 4d 85 ed test %r13,%r13
400e7a: 7e 0b jle 400e87 <init_cacheinfo+0xf7>
400e7c: 4c 89 e8 mov %r13,%rax
400e7f: 48 99 cqto
400e81: 48 f7 fe idiv %rsi
400e84: 49 89 c5 mov %rax,%r13
400e87: 4d 85 f6 test %r14,%r14
400e8a: 7e 2c jle 400eb8 <init_cacheinfo+0x128>
400e8c: 4c 89 f0 mov %r14,%rax
400e8f: 4c 89 35 9a 41 2b 00 mov %r14,0x2b419a(%rip) # 6b5030 <__x86_raw_data_cache_size>
400e96: 41 80 e6 00 and $0x0,%r14b
400e9a: 48 d1 f8 sar %rax
400e9d: 4c 89 35 9c 41 2b 00 mov %r14,0x2b419c(%rip) # 6b5040 <__x86_data_cache_size>
400ea4: 48 89 05 8d 41 2b 00 mov %rax,0x2b418d(%rip) # 6b5038 <__x86_raw_data_cache_size_half>
400eab: 4c 89 f0 mov %r14,%rax
400eae: 48 d1 f8 sar %rax
400eb1: 48 89 05 90 41 2b 00 mov %rax,0x2b4190(%rip) # 6b5048 <__x86_data_cache_size_half>
400eb8: 4d 85 ed test %r13,%r13
400ebb: 0f 8e 27 ff ff ff jle 400de8 <init_cacheinfo+0x58>
400ec1: 4c 89 e8 mov %r13,%rax
400ec4: 4c 89 2d 45 41 2b 00 mov %r13,0x2b4145(%rip) # 6b5010 <__x86_raw_shared_cache_size>
400ecb: 41 80 e5 00 and $0x0,%r13b
400ecf: 48 d1 f8 sar %rax
400ed2: 4c 89 2d 47 41 2b 00 mov %r13,0x2b4147(%rip) # 6b5020 <__x86_shared_cache_size>
400ed9: 48 89 05 38 41 2b 00 mov %rax,0x2b4138(%rip) # 6b5018 <__x86_raw_shared_cache_size_half>
400ee0: 4c 89 e8 mov %r13,%rax
400ee3: 48 d1 f8 sar %rax
400ee6: 5b pop %rbx
400ee7: 48 89 05 3a 41 2b 00 mov %rax,0x2b413a(%rip) # 6b5028 <__x86_shared_cache_size_half>
400eee: 41 5c pop %r12
400ef0: 41 5d pop %r13
400ef2: 41 5e pop %r14
400ef4: 5d pop %rbp
400ef5: c3 retq
400ef6: bf bc 00 00 00 mov $0xbc,%edi
400efb: e8 f0 39 03 00 callq 4348f0 <handle_amd>
400f00: bf bf 00 00 00 mov $0xbf,%edi
400f05: 49 89 c6 mov %rax,%r14
400f08: e8 e3 39 03 00 callq 4348f0 <handle_amd>
400f0d: bf c2 00 00 00 mov $0xc2,%edi
400f12: 49 89 c5 mov %rax,%r13
400f15: e8 d6 39 03 00 callq 4348f0 <handle_amd>
400f1a: 41 b8 01 00 00 00 mov $0x1,%r8d
400f20: 48 89 c7 mov %rax,%rdi
400f23: be 00 00 00 80 mov $0x80000000,%esi
400f28: 44 89 c0 mov %r8d,%eax
400f2b: 0f a2 cpuid
400f2d: c1 e1 16 shl $0x16,%ecx
400f30: 89 f0 mov %esi,%eax
400f32: c1 f9 1f sar $0x1f,%ecx
400f35: 83 e1 03 and $0x3,%ecx
400f38: 89 0d ca 33 52 00 mov %ecx,0x5233ca(%rip) # 924308 <__x86_preferred_memory_instruction>
400f3e: 0f a2 cpuid
400f40: 48 85 ff test %rdi,%rdi
400f43: 89 c6 mov %eax,%esi
400f45: 7e 2c jle 400f73 <init_cacheinfo+0x1e3>
400f47: 3d 07 00 00 80 cmp $0x80000007,%eax
400f4c: 76 54 jbe 400fa2 <init_cacheinfo+0x212>
400f4e: be 08 00 00 80 mov $0x80000008,%esi
400f53: 89 f0 mov %esi,%eax
400f55: 0f a2 cpuid
400f57: c1 e9 0c shr $0xc,%ecx
400f5a: 89 c6 mov %eax,%esi
400f5c: 83 e1 0f and $0xf,%ecx
400f5f: 41 d3 e0 shl %cl,%r8d
400f62: 44 89 c1 mov %r8d,%ecx
400f65: 48 89 f8 mov %rdi,%rax
400f68: 48 99 cqto
400f6a: 48 f7 f9 idiv %rcx
400f6d: 48 89 c7 mov %rax,%rdi
400f70: 49 01 fd add %rdi,%r13
400f73: 81 fe 00 00 00 80 cmp $0x80000000,%esi
400f79: 0f 86 08 ff ff ff jbe 400e87 <init_cacheinfo+0xf7>
400f7f: b8 01 00 00 80 mov $0x80000001,%eax
400f84: 0f a2 cpuid
400f86: 80 e5 01 and $0x1,%ch
400f89: 75 08 jne 400f93 <init_cacheinfo+0x203>
400f8b: 85 d2 test %edx,%edx
400f8d: 0f 89 f4 fe ff ff jns 400e87 <init_cacheinfo+0xf7>
400f93: c7 05 6f 33 52 00 ff movl $0xffffffff,0x52336f(%rip) # 92430c <__x86_prefetchw>
400f9a: ff ff ff
400f9d: e9 e5 fe ff ff jmpq 400e87 <init_cacheinfo+0xf7>
400fa2: 44 89 c0 mov %r8d,%eax
400fa5: 0f a2 cpuid
400fa7: 81 e2 00 00 00 10 and $0x10000000,%edx
400fad: 89 c6 mov %eax,%esi
400faf: 74 bf je 400f70 <init_cacheinfo+0x1e0>
400fb1: c1 eb 10 shr $0x10,%ebx
400fb4: 0f b6 cb movzbl %bl,%ecx
400fb7: 85 c9 test %ecx,%ecx
400fb9: 74 b5 je 400f70 <init_cacheinfo+0x1e0>
400fbb: eb a8 jmp 400f65 <init_cacheinfo+0x1d5>
400fbd: c1 e8 0e shr $0xe,%eax
400fc0: 25 ff 03 00 00 and $0x3ff,%eax
400fc5: 89 c6 mov %eax,%esi
400fc7: 74 45 je 40100e <init_cacheinfo+0x27e>
400fc9: 41 83 fc 0a cmp $0xa,%r12d
400fcd: 7e 3f jle 40100e <init_cacheinfo+0x27e>
400fcf: 31 d2 xor %edx,%edx
400fd1: 41 b8 0b 00 00 00 mov $0xb,%r8d
400fd7: 8d 7a 01 lea 0x1(%rdx),%edi
400fda: 44 89 c0 mov %r8d,%eax
400fdd: 89 d1 mov %edx,%ecx
400fdf: 0f a2 cpuid
400fe1: 81 e1 f0 0f 00 00 and $0xff0,%ecx
400fe7: 0f b6 db movzbl %bl,%ebx
400fea: 74 22 je 40100e <init_cacheinfo+0x27e>
400fec: 85 db test %ebx,%ebx
400fee: 74 1e je 40100e <init_cacheinfo+0x27e>
400ff0: 81 f9 00 02 00 00 cmp $0x200,%ecx
400ff6: 89 fa mov %edi,%edx
400ff8: 75 dd jne 400fd7 <init_cacheinfo+0x247>
400ffa: 0f bd f6 bsr %esi,%esi
400ffd: 8d 4e 01 lea 0x1(%rsi),%ecx
401000: 83 c8 ff or $0xffffffff,%eax
401003: 83 eb 01 sub $0x1,%ebx
401006: d3 e0 shl %cl,%eax
401008: 89 c6 mov %eax,%esi
40100a: f7 d6 not %esi
40100c: 21 de and %ebx,%esi
40100e: 83 c6 01 add $0x1,%esi
401011: e9 5d fe ff ff jmpq 400e73 <init_cacheinfo+0xe3>
401016: 40 b7 bf mov $0xbf,%dil
401019: 44 89 e6 mov %r12d,%esi
40101c: e8 af 37 03 00 callq 4347d0 <handle_intel>
401021: bf 02 00 00 00 mov $0x2,%edi
401026: 49 89 c5 mov %rax,%r13
401029: e9 f1 fd ff ff jmpq 400e1f <init_cacheinfo+0x8f>
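The two observations about RIP=0x400fff can be checked mechanically against the listing above. This is a minimal Python sketch: the (address, length) pairs are read off the objdump lines around the page boundary, the 4096-byte page size is an assumption (it matches the alignment of the VM regions in the logs), and the helper names are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

# (start address, length in bytes) taken from the objdump lines near 0x401000.
INSNS = [
    (0x400ff8, 2),  # jne 400fd7
    (0x400ffa, 3),  # bsr %esi,%esi
    (0x400ffd, 3),  # lea 0x1(%rsi),%ecx
    (0x401000, 3),  # or $0xffffffff,%eax
]


def is_last_byte_of_page(addr):
    """True if addr is the final byte of its page."""
    return addr % PAGE_SIZE == PAGE_SIZE - 1


def containing_insn(addr, insns):
    """Return (instruction start, byte offset into it) or None."""
    for start, length in insns:
        if start <= addr < start + length:
            return start, addr - start
    return None


if __name__ == "__main__":
    rip = 0x400fff
    print(is_last_byte_of_page(rip))
    start, offset = containing_insn(rip, INSNS)
    print(hex(start), offset)
```

For 0x400fff this reports the last byte of a page and an offset of 2 into the lea at 0x400ffd, so the reported RIP is not an instruction boundary in the binary as built.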
Hi -
I was able to recreate the bug. I haven't solved it yet, but I have a bunch of leads.
One minor thing: I had to change the CC variable in your Makefile for the executor to build it in my setup. This way should work for everyone:
diff --git a/Makefile b/Makefile
index 4e7e7ddca99b..e58cc4985ed8 100644
--- a/Makefile
+++ b/Makefile
@@ -94,11 +94,8 @@ ifeq ("$(TARGETOS)", "fuchsia")
endif
ifeq ("$(TARGETOS)", "akaros")
- # SOURCEDIR should point to bootstrapped akaros checkout.
# There is no up-to-date Go for akaros, so building Go will fail.
- CC = $(SOURCEDIR)/install/x86_64-ucb-akaros-gcc/bin/x86_64-ucb-akaros-g++
- # Most likely this is incorrect (why doesn't it know own sysroot?), but worked for me.
- ADDCFLAGS = -I $(SOURCEDIR)/tools/compilers/gcc-glibc/x86_64-ucb-akaros-gcc-stage3-builddir/x86_64-ucb-akaros/libstdc++-v3/include/x86_64-ucb-akaros -I $(SOURCEDIR)/tools/compilers/gcc-glibc/x86_64-ucb-akaros-gcc-stage3-builddir/x86_64-ucb-akaros/libstdc++-v3/include -I $(SOURCEDIR)/tools/compilers/gcc-glibc/gcc-4.9.2/libstdc++-v3/libsupc++ -L $(SOURCEDIR)/tools/compilers/gcc-glibc/x86_64-ucb-akaros-gcc-stage3-builddir/x86_64-ucb-akaros/libstdc++-v3/src/.libs
+ CC = $(AKAROS_XCC_ROOT)/bin/x86_64-ucb-akaros-g++
endif
ifeq ("$(TARGETOS)", "windows")
$AKAROS_XCC_ROOT is an environment variable that everyone should use to point to the toolchain installation.
I didn't need the ADDCFLAGS either, though maybe that's a peculiarity of my setup. The only other thing I do is put $(AKAROS_XCC_ROOT)/bin/ in my $PATH, though I don't see why that would help.
Anyway, thanks for the bug - I'll post more when I solve it.
Hmm... If I remove ADDCFLAGS, make executor TARGETOS=akaros SOURCEDIR=/src/akaros fails with:
/src/akaros/install/x86_64-ucb-akaros-gcc/bin/x86_64-ucb-akaros-g++ -o ./bin/akaros_amd64/syz-executor executor/executor_akaros.cc \
-pthread -Wall -Wframe-larger-than=8192 -Wparentheses -Werror -O1 \
-static -DGOOS=\"akaros\" -DGIT_REVISION=\"5b3a76c9f8b55281f244b8f81e48c5b0b935ccc3+\"
In file included from executor/executor_akaros.cc:11:0:
executor/executor.h:4:21: fatal error: algorithm: No such file or directory
#include <algorithm>
I've bootstrapped the toolchain using these commands:
(cd $AKAROS_ROOT && make ARCH=x86 defconfig)
(cd $AKAROS_ROOT && make xcc-upgrade-from-scratch)
How is your setup different? I would definitely like to remove that ADDCFLAGS mess, but I want it to work for me as well :)
Re SOURCEDIR vs AKAROS_XCC_ROOT:
SOURCEDIR is the env var name that we use for different things (e.g. also for extracting values of constants from OS headers), and it's a common name across multiple OSes (it is also used for linux, fuchsia, freebsd, etc).
E.g. you do make TARGETOS=foo SOURCEDIR=/path/to/foo/checkout and the Makefile and other tools figure out all other locations from SOURCEDIR.
Is AKAROS_ROOT also a standard env var name (I've seen it in some instructions)? If yes, then it would be reasonable to provide a default value for SOURCEDIR if AKAROS_ROOT is set. E.g. SOURCEDIR ?= AKAROS_ROOT (or whatever the syntax for this is in Makefiles).
We have two standard env variables. AKAROS_ROOT is the git repo you've downloaded. AKAROS_XCC_ROOT is the location of the toolchain installation - basically everything that gcc/binutils/glibc creates, plus our kernel headers and user libraries. SOURCEDIR sounds like AKAROS_ROOT, though things like kernel headers are also available in AKAROS_XCC_ROOT.
So far, we mostly use AKAROS_XCC_ROOT to find the cross compiler. AKAROS_ROOT is often used for installing cross-compiled binaries into the kernel file system (e.g. $AKAROS_ROOT/kern/kfs/bin, which will end up in /bin).
It seems pretty odd that your cross compiler doesn't know where to look for header files - or at least some of them?
Did you move the toolchain after building it? (That seems unlikely.)
You should have set the env variable X86_64_INSTDIR during toolchain installation too, usually in tools/compilers/gcc-glibc/Makelocal. If you skipped that, the build should have given you an error. But if not, that could mess things up too. (SYSROOT and install locations derive from that).
One option would be to strace -e open,access your build command, and then we can see where the compiler was looking.
It looks at multiple locations under /src/akaros/install/x86_64-ucb-akaros-gcc/ but I don't have an algorithm file anywhere in that path. I suspect that the gcc install failed somewhere in the middle but had already installed the binaries by that point. That would explain why I have the headers in the build dir but not in the install dir.
I guess I need to try to rebootstrap everything from scratch before we spend more time on this.
Don't you have AKAROS_XCC_ROOT point to "$AKAROS_ROOT/install/x86_64-ucb-akaros-gcc"? If yes, then the current Makefile should work as well (provided that you run it as make TARGETOS=akaros SOURCEDIR=$AKAROS_ROOT).
That sounds right - you should have those files in the toolchain. e.g. $ find $AKAROS_XCC_ROOT -name algorithm should have something (like x86_64-ucb-akaros/include/c++/4.9.2/algorithm).
If you look in tools/compilers/gcc-glibc/build_logs/, you might find what died. If a fresh rebuild doesn't help, then you can email them to me or something and I can look for a problem.
I don't have my AKAROS_XCC_ROOT pointing to a location inside my AKAROS_ROOT. I have it set up like this:
AKAROS_ROOT -> $HOME/akaros/ros-kernel/
AKAROS_XCC_ROOT -> $HOME/ros-gcc-glibc/install-x86_64-ros-gcc/
Actually, since I have the bin directory of XCC_ROOT in my PATH, I can run make executor just like this:
ifeq ("$(TARGETOS)", "akaros")
CC = x86_64-ucb-akaros-g++
endif
btw, I just tracked down the bug(s). The main one was that under rare conditions (races with page faults on the syz-executor binary), the mm code would free a page that was in the page cache. The page would eventually get reused, which is why syz-executor would go crazy - a chunk of .text was garbage. When the page was reused, various refcnts/flags would be wrong too, which was ultimately responsible for the panic you found. (Short version: it was improperly decreffed, then when we increffed it we had a refcnt of 0, which was the panic).
Anyway, I'll have a patch out later today. With it, the stress tester ran without crashing Akaros.
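To make the short version above concrete, here is a toy Python model of the sequence: a cached page gets one decref too many, is treated as free, and the next incref finds a refcnt of 0. This is only an illustration of the failure pattern described in this comment, not the Akaros mm or page cache code; the Page class and the page_incref, page_decref and simulate_bug names are all hypothetical.

```python
class Page:
    """Toy stand-in for a physical page tracked by a page cache."""

    def __init__(self):
        self.refcnt = 0
        self.in_page_cache = False


def page_decref(page):
    """Drop one reference; return True if the page is now considered free."""
    page.refcnt -= 1
    return page.refcnt == 0


def page_incref(page):
    """Take a reference; a cached page is expected to already hold one."""
    if page.refcnt <= 0:
        raise RuntimeError("incref on a page with refcnt 0")
    page.refcnt += 1


def simulate_bug():
    page = Page()
    page.refcnt = 1              # the page cache's own reference
    page.in_page_cache = True

    freed = page_decref(page)    # the improper decref: the page looks free
    # The cache still points at the page, so a later lookup tries to use it.
    try:
        page_incref(page)
    except RuntimeError as err:
        return freed, page.in_page_cache, str(err)
    return freed, page.in_page_cache, None


if __name__ == "__main__":
    print(simulate_bug())
```

In the model the page is reported free while still marked as cached, which is the window in which a real allocator could hand it out again, and the later incref trips the sanity check.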
I am getting the following crashes. Is it a known issue? If not and you don't see why it happens right away, I can try to create a reproducer. Checkout is on 6344ed04e307ba30df879d1d407b10a1b3236784.