ClangBuiltLinux / continuous-integration

Continuous integration of latest Linux kernel with daily build of Clang & LLVM tools
https://travis-ci.com/ClangBuiltLinux/continuous-integration
Apache License 2.0

bash[1] trap invalid opcode #120

Closed shenki closed 5 years ago

shenki commented 5 years ago

I used the container and driver.sh to build locally on my amd64 laptop, and then ran qemu manually with init=/bin/bash added to the command line. Unexpectedly it crashed:

qemu-system-x86_64 -m 512m -drive file=images/x86_64/rootfs.ext4,format=raw,if=ide \
 -append "console=ttyS0 root=/dev/sda init=/bin/bash" -nographic \
 -kernel linux/arch/x86_64/boot/bzImage
[   13.093363] Run /bin/bash as init process
[   13.257633] traps: bash[1] trap invalid opcode ip:47ed19 sp:7ffcf94e9680 error:0 in bash[400000+d0000]
[   13.266388] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[   13.266761] CPU: 0 PID: 1 Comm: bash Not tainted 5.0.0-rc4+ #1
[   13.266905] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[   13.267230] Call Trace:
[   13.267510]  dump_stack+0xa7/0x10b
[   13.267672]  panic+0x104/0x2ed
[   13.267843]  ? do_exit+0x888/0x900
[   13.267998]  do_exit+0x900/0x900
[   13.268102]  do_group_exit+0xa0/0xc0
[   13.268218]  get_signal+0x638/0x8a0
[   13.268348]  ? rcu_read_lock_sched_held+0x86/0xc0
[   13.268490]  ? invalid_op+0xa/0x20
[   13.268603]  do_signal+0x2f/0x590
[   13.268710]  ? force_sig_info+0x153/0x160
[   13.268841]  ? invalid_op+0xa/0x20
[   13.268950]  prepare_exit_to_usermode+0xc7/0x160
[   13.269099]  retint_user+0x8/0x18
[   13.269318] RIP: 0033:0x47ed19
[   13.269506] Code: Bad RIP value.
[   13.269603] RSP: 002b:00007ffcf94e9680 EFLAGS: 00000202
[   13.269757] RAX: 00000000004a8277 RBX: 0000000000000001 RCX: 0000000000000000
[   13.269917] RDX: 00000000004e16e8 RSI: 0000000000000000 RDI: 00000000004e1b10
[   13.270070] RBP: 00000000004a1d11 R08: 0000000000000000 R09: 0000000000000003
[   13.270250] R10: 0000000007ab8a40 R11: 00007feba9f2ccd1 R12: 0000000000000001
[   13.270419] R13: 00000000ffffffff R14: 0000000000000000 R15: 00007ffcf94e99d8
[   13.271223] Kernel Offset: 0x25000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   13.271765] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 ]---

shenki commented 5 years ago

Using /bin/sh (which is busybox) works fine. Running bash from the shell crashes:

/ # bash 
[   31.910485] traps: bash[1186] trap invalid opcode ip:47ed19 sp:7ffcf2da20b0 error:0 in bash[400000+d0000]
[   31.916172] bash (1186) used greatest stack depth: 12432 bytes left
Illegal instruction

Perhaps we should do a re-build of the rootfs?

nathanchance commented 5 years ago

Whoa... I don't know why I turned on bash. I wonder whose bug this is (Buildroot, QEMU, or the kernel)? I don't think there's any point in using a shell other than the busybox one, since we're almost never going to be in it.

shenki commented 5 years ago

There I was assuming bash was in there for something special. Let's remove bash and we can close this one.

shenki commented 5 years ago

Although, we should try executing a non-trivial program under a clang-built kernel. I will build a gcc kernel and see if the same thing happens.

nickdesaulniers commented 5 years ago

Userspace should never be able to panic the kernel. On Android, that's considered a DoS.

shenki commented 5 years ago

We only panic as it's PID 1. When running it from a shell we get the "usual" SIGILL behaviour.
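
For reference, the exitcode in the panic line can be decoded with a little shell arithmetic. This is only a sketch: it assumes the value follows the usual layout in which the low 7 bits hold the number of the signal that killed the process, and the variable names are made up for illustration.

```shell
# exitcode value copied from the "Attempted to kill init!" panic line
exitcode=0x00000004

# Assumption: the low 7 bits carry the terminating signal number
signum=$(( $exitcode & 0x7f ))

# kill -l <number> prints the signal name without the SIG prefix
signame=$(kill -l "$signum")

echo "signal $signum (SIG$signame)"
```

On Linux this should report signal 4, SIGILL, which matches the "Illegal instruction" message seen when bash is started from the busybox shell.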

nickdesaulniers commented 5 years ago

bash is being run as pid 1? isn't init or w/e supposed to be pid 1?

nathanchance commented 5 years ago

Well isn't that what init= means?

https://github.com/ClangBuiltLinux/linux/blob/b23b0ea3708c3dec599966fc856836aca48835b9/Documentation/admin-guide/kernel-parameters.txt#L1643-L1646

nickdesaulniers commented 5 years ago

How would bash contain an invalid opcode? Unless it was compiled with extensions like avx and the qemu machine variant you're running doesn't support those ISA extensions? Or maybe it jumped to the middle of the instruction stream.
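
One way to narrow this down is to look at the instruction at the faulting address. The sketch below takes the numbers from the trap line (ip:47ed19, bash[400000+d0000]), checks that the ip falls inside the bash mapping, and computes its offset. The objdump path is hypothetical, so that line is left as a comment.

```shell
# Values copied from: ip:47ed19 ... in bash[400000+d0000]
ip=0x47ed19
base=0x400000
size=0xd0000

# 1 if the faulting ip lies inside the reported mapping, 0 otherwise
inside=$(( $ip >= $base && $ip < $base + $size ))

# Offset of the faulting instruction from the start of the mapping
offset=$(( $ip - $base ))

printf 'inside mapping: %d, offset: 0x%x\n' "$inside" "$offset"

# With an unstripped copy of the same binary (path is hypothetical),
# the faulting instruction could then be disassembled:
# objdump -d --start-address=0x47ed19 --stop-address=0x47ed29 path/to/bash
```

If the disassembly shows a well-formed instruction from an ISA extension at that address, the "compiled with extensions" theory is the likelier one. If the address lands in the middle of an instruction, a wild jump is more plausible.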

nathanchance commented 5 years ago

Unless it was compiled with extensions like avx and the qemu machine variant you're running doesn't support those ISA extensions?

Exactly what happened. When I initially created these images for myself, I selected BR2_x86_corei7=y (as that's what I was using at the time), which selects a bunch of ISA extensions: https://github.com/buildroot/buildroot/blob/master/arch/Config.in.x86

config BR2_x86_corei7
    bool "corei7"
    select BR2_X86_CPU_HAS_MMX
    select BR2_X86_CPU_HAS_SSE
    select BR2_X86_CPU_HAS_SSE2
    select BR2_X86_CPU_HAS_SSE3
    select BR2_X86_CPU_HAS_SSSE3
    select BR2_X86_CPU_HAS_SSE4
    select BR2_X86_CPU_HAS_SSE42
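
To check whether a given guest CPU actually provides these extensions, the flags line in /proc/cpuinfo can be compared against the list above. This is a rough sketch: missing_flags is a hypothetical helper, and the flag names are the ones Linux reports (SSE3 shows up as pni, SSE4.1 and SSE4.2 as sse4_1 and sse4_2).

```shell
# Print the required flags that are absent from a flags string.
# $1 is the space-separated flags string, remaining args are required flags.
missing_flags() {
    have=" $1 "
    shift
    missing=""
    for f in "$@"; do
        case "$have" in
            *" $f "*) ;;                      # flag is present
            *) missing="$missing $f" ;;       # flag is absent
        esac
    done
    echo "${missing# }"
}

# Inside the guest: read the first flags line and check the corei7 set
if [ -r /proc/cpuinfo ]; then
    flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    missing_flags "$flags" mmx sse sse2 pni ssse3 sse4_1 sse4_2
fi
```

An empty result means every listed extension is available. Anything printed is an extension the rootfs may use but the CPU does not offer.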

I never noticed a problem because I always use the -cpu host -enable-kvm options, which give the guest the host's CPU features. QEMU's default CPU model, qemu64, only supports up to SSE3, which is also what Buildroot's default target variant, nocona, targets. I just rebuilt the images with that variant and now Joel's command succeeds, with and without -cpu host -enable-kvm. I'm running it through CI, then I'll open a pull request: https://github.com/nathanchance/continuous-integration/commit/3c30d0f30865e56928a722a5fceed021459c87bd
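
For anyone reproducing this without KVM, the CPU model can be made explicit on the command line from the first comment. The sketch below only prints the command. CPU_MODEL is a hypothetical environment variable, and using Nehalem as the corei7-class model is an assumption (qemu-system-x86_64 -cpu help lists what a given QEMU build offers).

```shell
# CPU model to emulate. qemu64 is the default discussed in this thread;
# CPU_MODEL=Nehalem is an assumed choice for a corei7-class rootfs.
cpu_model=${CPU_MODEL:-qemu64}

# Same invocation as in the first comment, with -cpu spelled out
set -- qemu-system-x86_64 -cpu "$cpu_model" -m 512m \
    -drive file=images/x86_64/rootfs.ext4,format=raw,if=ide \
    -append "console=ttyS0 root=/dev/sda init=/bin/bash" -nographic \
    -kernel linux/arch/x86_64/boot/bzImage

# Print the command for review; replace this line with "$@" to boot
echo "$*"
```

Spelling out the model makes it obvious which instruction set the rootfs is being tested against, instead of depending on whatever the host happens to support.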