dtrace4linux / linux

dtrace for linux - kernel driver and userland tools
http://crtags.blogspot.com

syscall::fork:entry has a NULL pointer dereference #75

Open ryao opened 10 years ago

ryao commented 10 years ago

The following is from a QEMU virtual machine where I ran a fork from another virtual terminal. list *systrace_assembler_dummy+0xb/0x1c shows that /root/dtrace/build-3.13.0/driver/systrace.c:450 is the last thing on the stack. It looks like we have a NULL pointer dereference in systrace_part1_sys_clone().

# dtrace -n 'syscall:::entry { @num[execname] = count(); }'
dtrace: description 'syscall:::entry ' matched 661 probes
[   66.394407] BUG: unable to handle kernel NULL pointer dereference at           (null)
[   66.397102] IP: [<          (null)>]           (null)
[   66.398791] PGD 3b19a067 PUD 3b1c1067 PMD 0 
[   66.400036] Oops: 0010 [#1] SMP
[   66.400036] Modules linked in: dtracedrv(PO) zfs(PO) zunicode(PO) zavl(PO) zcommon(PO) znvpair(PO) spl(O)
[   66.400036] CPU: 0 PID: 2060 Comm: bash Tainted: P           O 3.13.0 #8
[   66.400036] Hardware name: Apple Inc. iMac8,1/Mac-F227BEC8, BIOS IM81.88Z.00C1.B00.0802091538 01/01/2011
[   66.400036] task: ffff88003d815da0 ti: ffff88003cd54000 task.ti: ffff88003cd54000
[   66.400036] RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
[   66.400036] RSP: 0018:ffff88003cd55f48  EFLAGS: 00010286
[   66.400036] RAX: 0000000000000038 RBX: 00007fffba3140a0 RCX: 00007fa8b65139d0
[   66.400036] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[   66.400036] RBP: 00007fffba3140e0 R08: 0000000000000000 R09: 00007fa8b6513700
[   66.400036] R10: 00007fa8b65139d0 R11: 0000000000000246 R12: 000000000000080c
[   66.400036] R13: 0000000000000000 R14: 0000000001a6ccc0 R15: 0000000001a7b5a0
[   66.400036] FS:  00007fa8b6513700(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[   66.400036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   66.400036] CR2: 0000000000000000 CR3: 000000003c01c000 CR4: 00000000000407b0
[   66.400036] Stack:
[   66.400036]  ffffffffa01d579b ffffffff8149a248 00000000006b200c 00007fa8b5c54698
[   66.400036]  ffffc900001e9417 0000000000002710 0000000000000001 ffffffff8149a77d
[   66.400036]  0000000000000246 00007fa8b65139d0 00007fa8b6513700 0000000000000000
[   66.400036] Call Trace:
[   66.400036]  [<ffffffffa01d579b>] ? systrace_assembler_dummy+0xb/0x1c [dtracedrv]
[   66.400036]  [<ffffffff8149a248>] ? page_fault+0x28/0x30
[   66.400036]  [<ffffffff8149a77d>] ? system_call_fastpath+0x1a/0x1f
[   66.400036] Code:  Bad RIP value.
[   66.400036] RIP  [<          (null)>]           (null)
[   66.400036]  RSP <ffff88003cd55f48>
[   66.400036] CR2: 0000000000000000
[   66.456421] ---[ end trace c2c2c96bf00196a8 ]---
[   66.473307] BUG: unable to handle kernel NULL pointer dereference at           (null)
[   66.476065] IP: [<          (null)>]           (null)
[   66.477786] PGD 3d89d067 PUD 3d690067 PMD 0 
[   66.479311] Oops: 0010 [#2] SMP 
[   66.480010] Modules linked in: dtracedrv(PO) zfs(PO) zunicode(PO) zavl(PO) zcommon(PO) znvpair(PO) spl(O)
[   66.480010] CPU: 0 PID: 1 Comm: init Tainted: P      D    O 3.13.0 #8
[   66.480010] Hardware name: Apple Inc. iMac8,1/Mac-F227BEC8, BIOS IM81.88Z.00C1.B00.0802091538 01/01/2011
[   66.480010] task: ffff88003e0a0000 ti: ffff88003e03e000 task.ti: ffff88003e03e000
[   66.480010] RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
[   66.480010] RSP: 0018:ffff88003e03ff48  EFLAGS: 00010286
[   66.480010] RAX: 0000000000000038 RBX: 00007fff49cb8b10 RCX: 00007f55a39fa9d0
[   66.480010] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[   66.480010] RBP: 00007fff49cb8b50 R08: 0000000000000000 R09: 00007f55a39fa700
[   66.480010] R10: 00007f55a39fa9d0 R11: 0000000000000246 R12: 0000000000000001
[   66.480010] R13: 0000000000000001 R14: 0000000000407657 R15: ffffffffffffffff
[   66.480010] FS:  00007f55a39fa700(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[   66.480010] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   66.480010] CR2: 0000000000000000 CR3: 000000003d680000 CR4: 00000000000407b0
[   66.480010] Stack:
[   66.480010]  ffffffffa01d579b ffffffff8149aa3a 0000000000000000 000000000000080a
[   66.480010]  ffffc900001e92a7 00007fff49cb8cf1 000000000000006e ffffffff8149a77d
[   66.480010]  0000000000000246 00007f55a39fa9d0 00007f55a39fa700 0000000000000000
[   66.480010] Call Trace:
[   66.480010]  [<ffffffffa01d579b>] ? systrace_assembler_dummy+0xb/0x1c [dtracedrv]
[   66.480010]  [<ffffffff8149aa3a>] ? int_signal+0x12/0x17
[   66.480010]  [<ffffffff8149a77d>] ? system_call_fastpath+0x1a/0x1f
[   66.480010] Code:  Bad RIP value.
[   66.480010] RIP  [<          (null)>]           (null)
[   66.480010]  RSP <ffff88003e03ff48>
[   66.480010] CR2: 0000000000000000
[   66.530963] ---[ end trace c2c2c96bf00196a9 ]---
[   66.533284] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
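
For readers decoding the two oopses above: the value after "Oops:" is the x86 page-fault error code in hexadecimal. The sketch below decodes it, assuming the standard x86 bit layout (bit 0 protection violation, bit 1 write, bit 2 user mode, bit 3 reserved bit, bit 4 instruction fetch). The struct and function names are made up for illustration and come from neither the kernel nor dtrace4linux.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the decoded bits of an x86 page-fault error code. */
struct pf_error {
    bool protection;   /* bit 0: set = protection violation, clear = page not present */
    bool write;        /* bit 1: set = write access, clear = read */
    bool user;         /* bit 2: set = fault happened in user mode */
    bool reserved;     /* bit 3: set = reserved bit set in a page-table entry */
    bool instr_fetch;  /* bit 4: set = fault was an instruction fetch */
};

/* Decode the hexadecimal number printed after "Oops:" in the log. */
struct pf_error decode_pf_error(unsigned long code)
{
    struct pf_error e;

    e.protection  = (code & 0x01UL) != 0;
    e.write       = (code & 0x02UL) != 0;
    e.user        = (code & 0x04UL) != 0;
    e.reserved    = (code & 0x08UL) != 0;
    e.instr_fetch = (code & 0x10UL) != 0;
    return e;
}
```

For 0x0010 only the instruction-fetch bit is set: a kernel-mode fetch from a page that is not present. Combined with RIP and CR2 both being zero, that is what a call through a NULL function pointer looks like.
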
ryao commented 10 years ago

It looks like save_rest_ptr is NULL. SAVE_REST is a macro in arch/x86/include/asm/calling.h, but the code tries to call it as a function. The comments suggest that this worked, but it is not apparent how.
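
A minimal userspace sketch of this failure mode, assuming save_rest_ptr is a function pointer filled in by a by-name symbol lookup (an assumption; the driver source is not quoted in this thread). An assembler macro expands inline and leaves no symbol behind, so a lookup for it finds nothing, and calling the resulting NULL pointer jumps to address 0. All names below (hook_fn, symtab, lookup_hook, call_hook) are hypothetical and only illustrate checking the lookup result before the call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook type standing in for a function resolved by name. */
typedef long (*hook_fn)(long);

long real_hook(long x) { return x + 1; }

/* Hypothetical symbol table: only real functions have entries.
 * A macro such as SAVE_REST never appears here because it has no address. */
struct sym { const char *name; hook_fn fn; };
const struct sym symtab[] = { { "real_hook", real_hook } };

/* Return the function registered under name, or NULL if there is none. */
hook_fn lookup_hook(const char *name)
{
    for (size_t i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
        if (strcmp(symtab[i].name, name) == 0)
            return symtab[i].fn;
    return NULL;
}

/* Call the named hook; return 0 on success, -1 if the name did not resolve.
 * Without the NULL check, an unresolved name would be a jump to address 0. */
int call_hook(const char *name, long arg, long *out)
{
    hook_fn fn = lookup_hook(name);

    assert(out != NULL);
    if (fn == NULL)
        return -1;
    *out = fn(arg);
    return 0;
}
```

In this sketch call_hook("SAVE_REST", ...) returns -1 instead of crashing. A check of that kind at module load would turn the panic into a refusal to enable the affected probes.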

dtrace4linux commented 10 years ago

What kernel are you using? The code is brittle but works on older kernels.

On 29 Jul 2014 00:59, "Richard Yao" notifications@github.com wrote:

It looks like save_rest_ptr is NULL. After looking at this in more detail, I find it a wonder that it ever worked. SAVE_REST is a macro in arch/x86/include/asm/calling.h. It shouldn't be accessible in the manner that it is used here.


ryao commented 10 years ago

The original report used 3.13.0. My test VM is currently using 3.12.21-gentoo-r1, which is a very lightly patched 3.12.21. The syscall code was definitely not touched. The latter kernel was compiled with GCC 4.8.2. Note that the Gentoo version string describes only the kernel sources; the .config is supplied by the user. In case you asked hoping to look up the configuration from the version, here is a pastebin of my VM's .config:

http://bpaste.net/show/521394/

On a related note, Linux 3.7 changed several syscalls to stop using ptregs. torvalds/linux@6bf9adfc90370b695cb111116e15fdc0e1906270 changed sys_sigaltstack while torvalds/linux@1d4b4b2994b5fc208963c0b795291f8c1f18becf changed sys_fork/sys_vfork/sys_clone. This code is definitely broken as of Linux 3.7. It is not clear to me that it should work as expected in earlier kernels because SAVE_REST is not a function, but I believe that it somehow did not crash.
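
One way to express that cut-off in code is a version gate. The sketch below is a userspace illustration only: KVER mirrors the packing of the kernel's KERNEL_VERSION() macro for components below 256, and the table holds just the four syscalls named above. The function name takes_ptregs and the table name are hypothetical, and the sketch makes no claim about syscalls that are not in the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same packing as the kernel's KERNEL_VERSION() for components below 256;
 * redefined here so the sketch compiles outside the kernel tree. */
#define KVER(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Syscalls named in this thread as having stopped using pt_regs in Linux 3.7. */
const char *const ptregs_until_3_7[] = { "fork", "vfork", "clone", "sigaltstack" };

/* Return true if the named syscall is one of the four above and the kernel
 * predates 3.7. Unknown names return false, which is a limit of this sketch
 * and not a statement about the kernel. */
bool takes_ptregs(const char *syscall, unsigned int kver)
{
    assert(syscall != NULL);
    if (kver >= KVER(3, 7, 0))
        return false;
    for (size_t i = 0; i < sizeof(ptregs_until_3_7) / sizeof(ptregs_until_3_7[0]); i++)
        if (strcmp(ptregs_until_3_7[i], syscall) == 0)
            return true;
    return false;
}
```

With this gate, both kernels mentioned in this report (3.13.0 and 3.12.21) fall on the side where the pt_regs path should not be used for clone.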

ryao commented 10 years ago

sys_sigsuspend seems to have also been changed in Linux 3.7:

torvalds/linux@0a0e8cdf734ce723bfc4ca6032ffbc03ce17c642

dtrace4linux commented 10 years ago

Interesting - I thought I had verified up to the 3.11 kernel, but I don't have a VM containing that, so now I have something to keep me busy for a few days. If the pt_regs stuff is gone, that should make life simpler. The code in my dtrace is horribly complex and needs simplification. It could be time to split the file, freeze the legacy kernels, and make life easier for everything else.

Strangely, it compiles nicely, but as you say, with save_rest having gone AWOL, it will just panic if you invoke it. I will go for the 3.13 kernel (current Ubuntu) to reset the baseline.

thanks


dtrace4linux commented 10 years ago

I now have a 3.13 kernel in a VM and can see that dtrace works, except for the ptregs syscalls. Hopefully not too difficult to fix in the next few days.
