openbmc / linux

OpenBMC Linux kernel source tree

Crash and panic on Qemu #104

Closed shenki closed 7 years ago

shenki commented 8 years ago

@legoater's qemu branch: v2.7.0-rc3-44-g55a516509b33
Current tip of dev-4.7: v4.7.2-56-g88af72b8ecff

# ifconfig eth0 up
# [   13.990000] ftgmac100 1e660000.ethernet eth0: NCSI interface up

# dhc[   14.990000] ------------[ cut here ]------------
[   14.990000] WARNING: CPU: 0 PID: 118 at net/ipv6/ip6_fib.c:1457 fib6_del+0x70/0x414
[   14.990000] Modules linked in:
[   14.990000] CPU: 0 PID: 118 Comm: kworker/0:1 Not tainted 4.7.2 #2
[   14.990000] Hardware name: ASpeed SoC
[   14.990000] Workqueue: ipv6_addrconf addrconf_dad_work
[   14.990000] [<c01077b4>] (unwind_backtrace) from [<c010539c>] (show_stack+0x10/0x14)
[   14.990000] [<c010539c>] (show_stack) from [<c010f484>] (__warn+0xdc/0xf8)
[   14.990000] [<c010f484>] (__warn) from [<c010f594>] (warn_slowpath_null+0x1c/0x24)
[   14.990000] [<c010f594>] (warn_slowpath_null) from [<c038fd28>] (fib6_del+0x70/0x414)
[   14.990000] [<c038fd28>] (fib6_del) from [<c03901b0>] (fib6_clean_node+0xe4/0x15c)
[   14.990000] [<c03901b0>] (fib6_clean_node) from [<c038e4b4>] (fib6_walk_continue+0xe8/0x164)
[   14.990000] [<c038e4b4>] (fib6_walk_continue) from [<c038eb30>] (fib6_walk+0x4c/0x68)
[   14.990000] [<c038eb30>] (fib6_walk) from [<c038eb9c>] (fib6_prune_clones+0x50/0x78)
[   14.990000] [<c038eb9c>] (fib6_prune_clones) from [<c038f784>] (fib6_add+0x588/0x8d0)
[   14.990000] [<c038f784>] (fib6_add) from [<c03898f4>] (__ip6_ins_rt+0x34/0x48)
[   14.990000] [<c03898f4>] (__ip6_ins_rt) from [<c038cb4c>] (ip6_route_add+0x5c/0xc4)
[   14.990000] [<c038cb4c>] (ip6_route_add) from [<c038d3e4>] (rt6_add_dflt_router+0x6c/0xa0)
[   14.990000] [<c038d3e4>] (rt6_add_dflt_router) from [<c0394800>] (ndisc_rcv+0x7b4/0xe38)
[   14.990000] [<c0394800>] (ndisc_rcv) from [<c039aec0>] (icmpv6_rcv+0x2e8/0x464)
[   14.990000] [<c039aec0>] (icmpv6_rcv) from [<c037f9c8>] (ip6_input+0x1cc/0x520)
[   14.990000] [<c037f9c8>] (ip6_input) from [<c037fe18>] (ip6_mc_input+0xfc/0x118)
[   14.990000] [<c037fe18>] (ip6_mc_input) from [<c02ffd5c>] (__netif_receive_skb_core+0x48c/0x720)
[   14.990000] [<c02ffd5c>] (__netif_receive_skb_core) from [<c03019ec>] (netif_receive_skb_internal+0x84/0xcc)
[   14.990000] [<c03019ec>] (netif_receive_skb_internal) from [<c0302590>] (napi_gro_receive+0x94/0xc4)
[   14.990000] [<c0302590>] (napi_gro_receive) from [<c02ca900>] (ftgmac100_poll+0x464/0x5bc)
[   14.990000] [<c02ca900>] (ftgmac100_poll) from [<c0302a80>] (net_rx_action+0xfc/0x2c0)
[   14.990000] [<c0302a80>] (net_rx_action) from [<c0111fd4>] (__do_softirq+0x184/0x1f0)
[   14.990000] [<c0111fd4>] (__do_softirq) from [<c01120d8>] (do_softirq+0x44/0x54)
[   14.990000] [<c01120d8>] (do_softirq) from [<c0112184>] (__local_bh_enable_ip+0x9c/0xc8)
[   14.990000] [<c0112184>] (__local_bh_enable_ip) from [<c037af14>] (ip6_finish_output2+0x4a0/0x5a0)
[   14.990000] [<c037af14>] (ip6_finish_output2) from [<c0392a48>] (ndisc_send_skb+0x2e4/0x3c0)
[   14.990000] [<c0392a48>] (ndisc_send_skb) from [<c03857bc>] (addrconf_dad_completed+0xf8/0x1c4)
[   14.990000] [<c03857bc>] (addrconf_dad_completed) from [<c0385974>] (addrconf_dad_work+0xec/0x2e0)
[   14.990000] [<c0385974>] (addrconf_dad_work) from [<c0122a50>] (process_one_work+0x228/0x404)
[   14.990000] [<c0122a50>] (process_one_work) from [<c012385c>] (worker_thread+0x290/0x3e0)
[   14.990000] [<c012385c>] (worker_thread) from [<c0127d34>] (kthread+0xd0/0xe4)
[   14.990000] [<c0127d34>] (kthread) from [<c01024b0>] (ret_from_fork+0x14/0x24)
[   14.990000] ---[ end trace c37b130bbf095d2e ]---
p[   43.920000] Unable to handle kernel paging request at virtual address 12005456
[   43.920000] pgd = c0004000
[   43.920000] [12005456] *pgd=00000000
[   43.920000] Internal error: Oops: 5 [#1] ARM
[   43.920000] Modules linked in:
[   43.920000] CPU: 0 PID: 0 Comm: swapper Tainted: G        W       4.7.2 #2
[   43.920000] Hardware name: ASpeed SoC
[   43.920000] task: c0904700 ti: c0900000 task.ti: c0900000
[   43.920000] PC is at fib6_walk_continue+0x74/0x164
[   43.920000] LR is at fib6_clean_node+0xcc/0x15c
[   43.920000] pc : [<c038e440>]    lr : [<c0390198>]    psr: 80000113
[   43.920000] sp : c0901d90  ip : 80200001  fp : cedba280
[   43.920000] r10: c03900cc  r9 : c038e548  r8 : 00000000
[   43.920000] r7 : c0901e34  r6 : 00000633  r5 : 00000002  r4 : c0901dc8
[   43.920000] r3 : 12005452  r2 : 00000000  r1 : cedbcd60  r0 : 00000000
[   43.920000] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   43.920000] Control: 00093177  Table: 4ee50000  DAC: 00000051
[   43.920000] Process swapper (pid: 0, stack limit = 0xc0900190)
[   43.920000] Stack: (0xc0901d90 to 0xc0902000)
[   43.920000] 1d80:                                     c0901dc8 c091aa08 c091ae28 c038eb30
[   43.920000] 1da0: 00000000 c0902028 c091aa08 c038eec4 00000000 00000000 c038ee20 c0135e1c
[   43.920000] 1dc0: c0904a88 cedba28c c091add4 c091add4 cedba2a8 12005452 00000000 00000000
[   43.920000] 1de0: c0904a00 00000000 00000001 c03900cc c091ef50 c091aa08 c038e548 00000000
[   43.920000] 1e00: c0901e34 dc8ba502 c038ca98 c091aa08 00000bb8 c0902028 c091ada0 c091ae38
[   43.920000] 1e20: c090c2a0 c091aa08 c091ada0 c0390290 00000000 00000bb8 00000000 dc8ba502
[   43.920000] 1e40: 00000000 c0901e74 ffffe000 c0902028 00000100 00000001 c0390318 c0146bec
[   43.920000] 1e60: 00000001 00000000 c0146b78 60000193 c0904a88 c0ea9254 c0a68c10 00000000
[   43.920000] 1e80: c04d2448 dc8ba502 c090ba60 c090c2a0 c090be9c 00000000 c090ba60 c0390318
[   43.920000] 1ea0: c091aa08 c0147394 00000001 c0902028 00000000 dc8ba502 00000000 c0900000
[   43.920000] 1ec0: 40000001 00000002 c0925980 00200000 0000000a c09259a4 00000100 c0111fd4
[   43.920000] 1ee0: c0904700 cf807100 cf80715c ffff9bf9 00000002 c090c2a0 c09259a0 00000001
[   43.920000] 1f00: c0912de4 00000000 c0912de4 00000000 cf802200 00000001 cfffcea0 c061ea48
[   43.920000] 1f20: 00000000 c011229c 00000000 c013dd98 cf805020 00000001 c0901f60 00000000
[   43.920000] 1f40: 00000020 c01014a8 c0102d0c 60000013 ffffffff c0901f94 c0902020 c0105e10
[   43.920000] 1f60: 00000000 00093177 00092177 60000013 c0900000 c0925260 ffffffff c0902028
[   43.920000] 1f80: c0902020 cfffcea0 c061ea48 00000000 600000d3 c0901fb0 c0102d14 c0102d0c
[   43.920000] 1fa0: 60000013 ffffffff 00000053 c01564e0 c0900000 c0133208 c09252ac c0600c54
[   43.920000] 1fc0: ffffffff ffffffff 00000000 c06006a4 c061ea48 00000000 c09253d4 c090203c
[   43.920000] 1fe0: c061ea44 c0905d2c 40004000 41069265 4061cbac 40008048 00000000 00000000
[   43.920000] [<c038e440>] (fib6_walk_continue) from [<c038eb30>] (fib6_walk+0x4c/0x68)
[   43.920000] [<c038eb30>] (fib6_walk) from [<c038eec4>] (__fib6_clean_all+0xa4/0xfc)
[   43.920000] [<c038eec4>] (__fib6_clean_all) from [<c0390290>] (fib6_run_gc+0x5c/0xe4)
[   43.920000] [<c0390290>] (fib6_run_gc) from [<c0146bec>] (call_timer_fn+0x74/0x118)
[   43.920000] [<c0146bec>] (call_timer_fn) from [<c0147394>] (run_timer_softirq+0x1cc/0x1f4)
[   43.920000] [<c0147394>] (run_timer_softirq) from [<c0111fd4>] (__do_softirq+0x184/0x1f0)
[   43.920000] [<c0111fd4>] (__do_softirq) from [<c011229c>] (irq_exit+0x84/0xe8)
[   43.920000] [<c011229c>] (irq_exit) from [<c013dd98>] (__handle_domain_irq+0x74/0xa0)
[   43.920000] [<c013dd98>] (__handle_domain_irq) from [<c01014a8>] (avic_handle_irq+0x60/0x70)
[   43.920000] [<c01014a8>] (avic_handle_irq) from [<c0105e10>] (__irq_svc+0x50/0x64)
[   43.920000] Exception stack(0xc0901f60 to 0xc0901fa8)
[   43.920000] 1f60: 00000000 00093177 00092177 60000013 c0900000 c0925260 ffffffff c0902028
[   43.920000] 1f80: c0902020 cfffcea0 c061ea48 00000000 600000d3 c0901fb0 c0102d14 c0102d0c
[   43.920000] 1fa0: 60000013 ffffffff
[   43.920000] [<c0105e10>] (__irq_svc) from [<c0102d0c>] (arch_cpu_idle+0x24/0x34)
[   43.920000] [<c0102d0c>] (arch_cpu_idle) from [<c0133208>] (cpu_startup_entry+0xa0/0xe0)
[   43.920000] [<c0133208>] (cpu_startup_entry) from [<c0600c54>] (start_kernel+0x35c/0x3e8)
[   43.920000] Code: c038e440 c038e464 c038e47c c038e4d4 (e5932004) 
[   43.920000] ---[ end trace c37b130bbf095d2f ]---
[   43.920000] Kernel panic - not syncing: Fatal exception in interrupt
[   43.920000] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
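
The oops above can be cross-checked with a little arithmetic. The word in parentheses on the `Code:` line, `e5932004`, is the faulting instruction. Decoded as an A32 load it is `ldr r2, [r3, #4]`, and `r3` holds `12005452`, so the access lands on `12005456`, the address reported in the "Unable to handle kernel paging request" line. Unlike the other pointers in the dump (`c0...`, `ce...`), that value does not look like a kernel address, which would be consistent with a stale or corrupted pointer, though the trace alone does not prove it. A minimal sketch of the decode follows; the helper name `decode_ldr_imm` is hypothetical and it handles only the plain immediate-offset form.

```python
def decode_ldr_imm(word):
    """Decode an A32 LDR/STR with a plain immediate offset (P=1, W=0).

    Hypothetical helper for illustration; other addressing modes are rejected.
    """
    if (word >> 26) & 0b11 != 0b01:
        raise ValueError("not a single data transfer instruction")
    if (word >> 25) & 1:
        raise ValueError("register offset form is not handled here")
    if not (word >> 24) & 1 or (word >> 21) & 1:
        raise ValueError("only the offset form without writeback is handled")
    up = (word >> 23) & 1              # U bit: add (1) or subtract (0) the offset
    imm = word & 0xFFF                 # 12-bit immediate
    return {
        "load": bool((word >> 20) & 1),  # L bit: load (1) or store (0)
        "rn": (word >> 16) & 0xF,        # base register
        "rd": (word >> 12) & 0xF,        # destination register
        "offset": imm if up else -imm,
    }


insn = decode_ldr_imm(0xE5932004)      # value from the "Code:" line
r3 = 0x12005452                        # value from the register dump
fault = (r3 + insn["offset"]) & 0xFFFFFFFF
print(f"{fault:08x}")                  # prints 12005456
```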

amboar commented 8 years ago

@shenki probably related to #101, #102 and #103?

legoater commented 8 years ago

I suppose #104 and #101 are the same. #102 and #103 seem different.

In net/ipv6/ip6_fib.c, fib6_walk_continue starts with this comment:

 * Certainly, it is not interrupt safe

So, how are you starting the guest? I suppose with:

-net nic -net user,hostfwd=:127.0.0.1:2222-:22,hostname=qemu

as I see:

ftgmac100 1e660000.ethernet eth0: NCSI interface up

which means that the fake NCSI backend is active.

Are you booting with U-Boot, or directly from QEMU using -kernel and a custom kernel?

amboar commented 8 years ago

@legoater I was booting using -kernel and a custom kernel for the issues I reported.

shenki commented 8 years ago

/arm-softmmu/qemu-system-arm -m 256M -M palmetto-bmc -nographic -nodefaults -net nic -net user,hostfwd=:127.0.0.1:2222-:22,hostname=qemu -serial stdio -drive file=/srv/tftp/flash-palmetto,format=raw,if=mtd -drive file=/srv/tftp/palmetto.pnor,format=raw,if=mtd -kernel ~/dev/kernels/aspeed/arch/arm/boot/uImage

legoater commented 8 years ago

So I reproduced it once after 20 to 25 reboots. This is going to be complex to track. Do you have some kind of scenario?
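
Since the failure shows up only once in a few dozen boots, it may help to capture the serial console of each run and scan it automatically. The following is a minimal sketch, assuming the console output of one boot is available as a string; the name `classify_console` and the `SIGNATURES` table are hypothetical, and the patterns are taken only from the log lines in this report.

```python
import re

# Patterns copied from the console output in this issue; other failures
# would need their own entries.
SIGNATURES = {
    "warning": re.compile(r"WARNING: CPU: \d+ PID: \d+ at (\S+) (\S+)"),
    "oops": re.compile(
        r"Unable to handle kernel paging request at virtual address ([0-9a-f]+)"
    ),
    "panic": re.compile(r"Kernel panic - not syncing: (.+?)\s*$"),
}


def classify_console(text):
    """Return the first match of each known signature in one boot's console log."""
    found = {}
    for line in text.splitlines():
        for name, pattern in SIGNATURES.items():
            match = pattern.search(line)
            if match and name not in found:
                found[name] = match.groups()
    return found


sample = """\
[   14.990000] WARNING: CPU: 0 PID: 118 at net/ipv6/ip6_fib.c:1457 fib6_del+0x70/0x414
p[   43.920000] Unable to handle kernel paging request at virtual address 12005456
[   43.920000] Kernel panic - not syncing: Fatal exception in interrupt
"""
result = classify_console(sample)
print(sorted(result))
```

A boot whose result is an empty dict can be discarded; one that contains `"warning"` but not `"panic"` would be worth keeping too, since in the log above the warning precedes the crash by about 29 seconds.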

legoater commented 7 years ago

Moved to openbmc/qemu as openbmc/qemu#11.