sfjro / aufs-standalone

[crash] Unable to handle kernel paging request at virtual address #25

Open ccoager opened 1 year ago

ccoager commented 1 year ago

I'm seeing kernel crashes on 5.15, 6.0 and 6.1. I'm using Raspberry Pis on NFS netboot with AUFS storage on CRI-O/Kubernetes.

[Thu Dec  8 19:12:39 2022] Unable to handle kernel paging request at virtual address 0001010201019004
[Thu Dec  8 19:12:39 2022] Mem abort info:
[Thu Dec  8 19:12:39 2022]   ESR = 0x0000000096000004
[Thu Dec  8 19:12:39 2022]   EC = 0x25: DABT (current EL), IL = 32 bits
[Thu Dec  8 19:12:39 2022]   SET = 0, FnV = 0
[Thu Dec  8 19:12:39 2022]   EA = 0, S1PTW = 0
[Thu Dec  8 19:12:39 2022]   FSC = 0x04: level 0 translation fault
[Thu Dec  8 19:12:39 2022] Data abort info:
[Thu Dec  8 19:12:39 2022]   ISV = 0, ISS = 0x00000004
[Thu Dec  8 19:12:39 2022]   CM = 0, WnR = 0
[Thu Dec  8 19:12:39 2022] [0001010201019004] address between user and kernel address ranges
[Thu Dec  8 19:12:39 2022] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[Tue Jan 17 10:00:53 2023] rcu: INFO: rcu_preempt self-detected stall on CPU
[Tue Jan 17 10:00:53 2023] rcu:         2-....: (5249 ticks this GP) idle=dbfc/1/0x4000000000000000 softirq=20995/20996 fqs=2624
[Tue Jan 17 10:00:54 2023]      (t=5250 jiffies g=33425 q=23090 ncpus=4)
[Tue Jan 17 10:00:54 2023] CPU: 2 PID: 39 Comm: kworker/u8:1 Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 10:00:54 2023] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[Tue Jan 17 10:00:54 2023] Workqueue: writeback wb_workfn (flush-0:28)
[Tue Jan 17 10:00:54 2023] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[Tue Jan 17 10:00:54 2023] pc : queued_spin_lock_slowpath+0x78/0x490
[Tue Jan 17 10:00:54 2023] lr : _raw_spin_lock+0x84/0x94
[Tue Jan 17 10:00:54 2023] sp : ffffffc00857b8a0
[Tue Jan 17 10:00:54 2023] x29: ffffffc00857b8a0 x28: ffffff8100b0e460 x27: 0000000000000400
[Tue Jan 17 10:00:54 2023] x26: ffffff816df5f778 x25: 0000000000000000 x24: 000000007fffffff
[Tue Jan 17 10:00:54 2023] x23: ffffff816df5f660 x22: ffffff816df5f664 x21: 0000000000000002
[Tue Jan 17 10:00:54 2023] x20: ffffff81003b9ec0 x19: ffffff816df5f660 x18: 0000000000000000
[Tue Jan 17 10:00:54 2023] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[Tue Jan 17 10:00:54 2023] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[Tue Jan 17 10:00:54 2023] x11: 00000000000000ef x10: 0000000000001a60 x9 : ffffffe00c3d83c4
[Tue Jan 17 10:00:54 2023] x8 : ffffffc00857b758 x7 : 0000000000000000 x6 : 000000205c04e000
[Tue Jan 17 10:00:54 2023] x5 : ffffffe00cbb9000 x4 : ffffffe00cbb90f0 x3 : 0000000000000006
[Tue Jan 17 10:00:54 2023] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000106
[Tue Jan 17 10:00:54 2023] Call trace:
[Tue Jan 17 10:00:54 2023]  queued_spin_lock_slowpath+0x78/0x490
[Tue Jan 17 10:00:54 2023]  _raw_spin_lock+0x84/0x94
[Tue Jan 17 10:00:54 2023]  __mutex_lock.constprop.0+0xa8/0x604
[Tue Jan 17 10:00:54 2023]  __mutex_lock_slowpath+0x1c/0x30
[Tue Jan 17 10:00:54 2023]  mutex_lock+0x60/0x70
[Tue Jan 17 10:00:54 2023]  nfs_scan_commit.part.0.isra.0+0x2c/0xc4
[Tue Jan 17 10:00:54 2023]  __nfs_commit_inode+0xa8/0x1a0
[Tue Jan 17 10:00:54 2023]  nfs_write_inode+0x44/0x9c
[Tue Jan 17 10:00:54 2023]  nfs4_write_inode+0x24/0x60
[Tue Jan 17 10:00:54 2023]  __writeback_single_inode+0x368/0x4b0
[Tue Jan 17 10:00:54 2023]  writeback_sb_inodes+0x214/0x49c
[Tue Jan 17 10:00:54 2023]  __writeback_inodes_wb+0x58/0x110
[Tue Jan 17 10:00:54 2023]  wb_writeback+0x2d4/0x3b0
[Tue Jan 17 10:00:54 2023]  wb_workfn+0x364/0x5b4
[Tue Jan 17 10:00:54 2023]  process_one_work+0x1dc/0x450
[Tue Jan 17 10:00:54 2023]  worker_thread+0x154/0x450
[Tue Jan 17 10:00:54 2023]  kthread+0x104/0x110
[Tue Jan 17 10:00:54 2023]  ret_from_fork+0x10/0x20
[Tue Jan 17 10:00:58 2023] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-.... } 5595 jiffies s: 725 root: 0x4/.
[Tue Jan 17 10:00:58 2023] rcu: blocking rcu_node structures (internal RCU debug):
[Tue Jan 17 10:00:58 2023] Task dump for CPU 2:
[Tue Jan 17 10:00:58 2023] task:kworker/u8:1    state:R  running task     stack:0     pid:39    ppid:2      flags:0x0000000a
[Tue Jan 17 10:00:58 2023] Workqueue: writeback wb_workfn (flush-0:28)
[Tue Jan 17 10:00:58 2023] Call trace:
[Tue Jan 17 10:00:58 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 10:00:58 2023]  0xffffffc00857b8a0
[Tue Jan 17 10:01:56 2023] rcu: INFO: rcu_preempt self-detected stall on CPU
[Tue Jan 17 10:01:57 2023] rcu:         2-....: (20999 ticks this GP) idle=dbfc/1/0x4000000000000000 softirq=20995/20996 fqs=10494
[Tue Jan 17 10:01:57 2023]      (t=21003 jiffies g=33425 q=136963 ncpus=4)
[Tue Jan 17 10:01:57 2023] CPU: 2 PID: 39 Comm: kworker/u8:1 Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 10:01:57 2023] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[Tue Jan 17 10:01:57 2023] Workqueue: writeback wb_workfn (flush-0:28)
[Tue Jan 17 10:01:57 2023] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[Tue Jan 17 10:01:57 2023] pc : queued_spin_lock_slowpath+0x78/0x490
[Tue Jan 17 10:01:57 2023] lr : _raw_spin_lock+0x84/0x94
[Tue Jan 17 10:01:57 2023] sp : ffffffc00857b8a0
[Tue Jan 17 10:01:57 2023] x29: ffffffc00857b8a0 x28: ffffff8100b0e460 x27: 0000000000000400
[Tue Jan 17 10:01:57 2023] x26: ffffff816df5f778 x25: 0000000000000000 x24: 000000007fffffff
[Tue Jan 17 10:01:57 2023] x23: ffffff816df5f660 x22: ffffff816df5f664 x21: 0000000000000002
[Tue Jan 17 10:01:57 2023] x20: ffffff81003b9ec0 x19: ffffff816df5f660 x18: 0000000000000000
[Tue Jan 17 10:01:57 2023] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[Tue Jan 17 10:01:57 2023] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[Tue Jan 17 10:01:57 2023] x11: 00000000000000ef x10: 0000000000001a60 x9 : ffffffe00c3d83c4
[Tue Jan 17 10:01:57 2023] x8 : ffffffc00857b758 x7 : 0000000000000000 x6 : 000000205c04e000
[Tue Jan 17 10:01:57 2023] x5 : ffffffe00cbb9000 x4 : ffffffe00cbb90f0 x3 : 0000000000000006
[Tue Jan 17 10:01:57 2023] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000106
[Tue Jan 17 10:01:57 2023] Call trace:
[Tue Jan 17 10:01:57 2023]  queued_spin_lock_slowpath+0x78/0x490
[Tue Jan 17 10:01:57 2023]  _raw_spin_lock+0x84/0x94
[Tue Jan 17 10:01:57 2023]  __mutex_lock.constprop.0+0xa8/0x604
[Tue Jan 17 10:01:57 2023]  __mutex_lock_slowpath+0x1c/0x30
[Tue Jan 17 10:01:57 2023]  mutex_lock+0x60/0x70
[Tue Jan 17 10:01:57 2023]  nfs_scan_commit.part.0.isra.0+0x2c/0xc4
[Tue Jan 17 10:01:57 2023]  __nfs_commit_inode+0xa8/0x1a0
[Tue Jan 17 10:01:57 2023]  nfs_write_inode+0x44/0x9c
[Tue Jan 17 10:01:57 2023]  nfs4_write_inode+0x24/0x60
[Tue Jan 17 10:01:57 2023]  __writeback_single_inode+0x368/0x4b0
[Tue Jan 17 10:01:57 2023]  writeback_sb_inodes+0x214/0x49c
[Tue Jan 17 10:01:57 2023]  __writeback_inodes_wb+0x58/0x110
[Tue Jan 17 10:01:57 2023]  wb_writeback+0x2d4/0x3b0
[Tue Jan 17 10:01:57 2023]  wb_workfn+0x364/0x5b4
[Tue Jan 17 10:01:57 2023]  process_one_work+0x1dc/0x450
[Tue Jan 17 10:01:57 2023]  worker_thread+0x154/0x450
[Tue Jan 17 10:01:57 2023]  kthread+0x104/0x110
[Tue Jan 17 10:01:57 2023]  ret_from_fork+0x10/0x20
[Tue Jan 17 10:02:02 2023] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-.... } 21467 jiffies s: 725 root: 0x4/.
[Tue Jan 17 10:02:02 2023] rcu: blocking rcu_node structures (internal RCU debug):
[Tue Jan 17 10:02:02 2023] Task dump for CPU 2:
[Tue Jan 17 10:02:02 2023] task:kworker/u8:1    state:R  running task     stack:0     pid:39    ppid:2      flags:0x0000000a
[Tue Jan 17 10:02:02 2023] Workqueue: writeback wb_workfn (flush-0:28)
[Tue Jan 17 10:02:02 2023] Call trace:
[Tue Jan 17 10:02:02 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 10:02:02 2023]  0xffffffc00857b8a0
[Tue Jan 17 10:03:00 2023] rcu: INFO: rcu_preempt self-detected stall on CPU
[Tue Jan 17 10:03:00 2023] rcu:         2-....: (36749 ticks this GP) idle=dbfc/1/0x4000000000000000 softirq=20995/20996 fqs=18364
[Tue Jan 17 10:03:00 2023]      (t=36756 jiffies g=33425 q=247698 ncpus=4)
[Tue Jan 17 10:03:00 2023] CPU: 2 PID: 39 Comm: kworker/u8:1 Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 10:03:00 2023] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[Tue Jan 17 10:03:00 2023] Workqueue: writeback wb_workfn (flush-0:28)
[Tue Jan 17 10:03:00 2023] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[Tue Jan 17 10:03:00 2023] pc : queued_spin_lock_slowpath+0x78/0x490
[Tue Jan 17 10:03:00 2023] lr : _raw_spin_lock+0x84/0x94
[Tue Jan 17 10:03:00 2023] sp : ffffffc00857b8a0
[Tue Jan 17 10:03:00 2023] x29: ffffffc00857b8a0 x28: ffffff8100b0e460 x27: 0000000000000400
[Tue Jan 17 10:03:00 2023] x26: ffffff816df5f778 x25: 0000000000000000 x24: 000000007fffffff
[Tue Jan 17 10:03:00 2023] x23: ffffff816df5f660 x22: ffffff816df5f664 x21: 0000000000000002
[Tue Jan 17 10:03:00 2023] x20: ffffff81003b9ec0 x19: ffffff816df5f660 x18: 0000000000000000
[Tue Jan 17 10:03:00 2023] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[Tue Jan 17 10:03:00 2023] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[Tue Jan 17 10:03:00 2023] x11: 00000000000000ef x10: 0000000000001a60 x9 : ffffffe00c3d83c4
[Tue Jan 17 10:03:00 2023] x8 : ffffffc00857b758 x7 : 0000000000000000 x6 : 000000205c04e000
[Tue Jan 17 10:03:00 2023] x5 : ffffffe00cbb9000 x4 : ffffffe00cbb90f0 x3 : 0000000000000006
[Tue Jan 17 10:03:00 2023] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000106
[Tue Jan 17 10:03:00 2023] Call trace:
[Tue Jan 17 10:03:00 2023]  queued_spin_lock_slowpath+0x78/0x490
[Tue Jan 17 10:03:00 2023]  _raw_spin_lock+0x84/0x94
[Tue Jan 17 10:03:00 2023]  __mutex_lock.constprop.0+0xa8/0x604
[Tue Jan 17 10:03:00 2023]  __mutex_lock_slowpath+0x1c/0x30
[Tue Jan 17 10:03:00 2023]  mutex_lock+0x60/0x70
[Tue Jan 17 10:03:00 2023]  nfs_scan_commit.part.0.isra.0+0x2c/0xc4
[Tue Jan 17 10:03:00 2023]  __nfs_commit_inode+0xa8/0x1a0
[Tue Jan 17 10:03:00 2023]  nfs_write_inode+0x44/0x9c
[Tue Jan 17 10:03:00 2023]  nfs4_write_inode+0x24/0x60
[Tue Jan 17 10:03:00 2023]  __writeback_single_inode+0x368/0x4b0
[Tue Jan 17 10:03:00 2023]  writeback_sb_inodes+0x214/0x49c
[Tue Jan 17 10:03:00 2023]  __writeback_inodes_wb+0x58/0x110
[Tue Jan 17 10:03:00 2023]  wb_writeback+0x2d4/0x3b0
[Tue Jan 17 10:03:00 2023]  wb_workfn+0x364/0x5b4
[Tue Jan 17 10:03:00 2023]  process_one_work+0x1dc/0x450
[Tue Jan 17 10:03:00 2023]  worker_thread+0x154/0x450
[Tue Jan 17 10:03:00 2023]  kthread+0x104/0x110
[Tue Jan 17 10:03:00 2023]  ret_from_fork+0x10/0x20
[Tue Jan 17 10:03:05 2023] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-.... } 37339 jiffies s: 725 root: 0x4/.
[Tue Jan 17 10:03:05 2023] rcu: blocking rcu_node structures (internal RCU debug):
[Tue Jan 17 10:03:05 2023] Task dump for CPU 2:
[Tue Jan 17 10:03:05 2023] task:kworker/u8:1    state:R  running task     stack:0     pid:39    ppid:2      flags:0x0000000a
[Tue Jan 17 10:03:05 2023] Workqueue: writeback wb_workfn (flush-0:28)
[Tue Jan 17 10:03:05 2023] Call trace:
[Tue Jan 17 10:03:05 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 10:03:05 2023]  0xffffffc00857b8a0
[Tue Jan 17 10:04:03 2023] rcu: INFO: rcu_preempt self-detected stall on CPU
[Tue Jan 17 10:04:03 2023] rcu:         2-....: (52499 ticks this GP) idle=dbfc/1/0x4000000000000000 softirq=20995/20996 fqs=26237
[Tue Jan 17 10:04:03 2023]      (t=52509 jiffies g=33425 q=361509 ncpus=4)
[Tue Jan 17 10:04:03 2023] CPU: 2 PID: 39 Comm: kworker/u8:1 Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 10:04:03 2023] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[Tue Jan 17 10:04:03 2023] Workqueue: writeback wb_workfn (flush-0:28)
[Tue Jan 17 10:04:03 2023] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[Tue Jan 17 10:04:03 2023] pc : queued_spin_lock_slowpath+0x78/0x490
[Tue Jan 17 10:04:03 2023] lr : _raw_spin_lock+0x84/0x94
[Tue Jan 17 10:04:03 2023] sp : ffffffc00857b8a0
[Tue Jan 17 10:04:03 2023] x29: ffffffc00857b8a0 x28: ffffff8100b0e460 x27: 0000000000000400
[Tue Jan 17 10:04:03 2023] x26: ffffff816df5f778 x25: 0000000000000000 x24: 000000007fffffff
[Tue Jan 17 10:04:03 2023] x23: ffffff816df5f660 x22: ffffff816df5f664 x21: 0000000000000002
[Tue Jan 17 10:04:03 2023] x20: ffffff81003b9ec0 x19: ffffff816df5f660 x18: 0000000000000000
[Tue Jan 17 10:04:03 2023] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[Tue Jan 17 10:04:03 2023] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[Tue Jan 17 10:04:03 2023] x11: 00000000000000ef x10: 0000000000001a60 x9 : ffffffe00c3d83c4
[Tue Jan 17 10:04:03 2023] x8 : ffffffc00857b758 x7 : 0000000000000000 x6 : 000000205c04e000
[Tue Jan 17 10:04:03 2023] x5 : ffffffe00cbb9000 x4 : ffffffe00cbb90f0 x3 : 0000000000000006
[Tue Jan 17 10:04:03 2023] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000106
[Tue Jan 17 10:04:03 2023] Call trace:
[Tue Jan 17 10:04:03 2023]  queued_spin_lock_slowpath+0x78/0x490
[Tue Jan 17 10:04:03 2023]  _raw_spin_lock+0x84/0x94
[Tue Jan 17 10:04:03 2023]  __mutex_lock.constprop.0+0xa8/0x604
[Tue Jan 17 10:04:03 2023]  __mutex_lock_slowpath+0x1c/0x30
[Tue Jan 17 10:04:03 2023]  mutex_lock+0x60/0x70
[Tue Jan 17 10:04:03 2023]  nfs_scan_commit.part.0.isra.0+0x2c/0xc4
[Tue Jan 17 10:04:03 2023]  __nfs_commit_inode+0xa8/0x1a0
[Tue Jan 17 10:04:03 2023]  nfs_write_inode+0x44/0x9c
[Tue Jan 17 10:04:03 2023]  nfs4_write_inode+0x24/0x60
[Tue Jan 17 10:04:03 2023]  __writeback_single_inode+0x368/0x4b0
[Tue Jan 17 10:04:03 2023]  writeback_sb_inodes+0x214/0x49c
[Tue Jan 17 10:04:03 2023]  __writeback_inodes_wb+0x58/0x110
[Tue Jan 17 10:04:03 2023]  wb_writeback+0x2d4/0x3b0
[Tue Jan 17 10:04:03 2023]  wb_workfn+0x364/0x5b4
[Tue Jan 17 10:04:03 2023]  process_one_work+0x1dc/0x450
[Tue Jan 17 10:04:03 2023]  worker_thread+0x154/0x450
[Tue Jan 17 10:04:03 2023]  kthread+0x104/0x110
[Tue Jan 17 10:04:03 2023]  ret_from_fork+0x10/0x20
[Tue Jan 17 10:04:09 2023] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-.... } 53211 jiffies s: 725 root: 0x4/.
[Tue Jan 17 10:04:09 2023] rcu: blocking rcu_node structures (internal RCU debug):
[Tue Jan 17 10:04:09 2023] Task dump for CPU 2:
[Tue Jan 17 10:04:09 2023] task:kworker/u8:1    state:R  running task     stack:0     pid:39    ppid:2      flags:0x0000000a
[Tue Jan 17 10:04:09 2023] Workqueue: writeback wb_workfn (flush-0:28)
[Tue Jan 17 10:04:09 2023] Call trace:
[Tue Jan 17 10:04:09 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 10:04:09 2023]  0xffffffc00857b8a0
[Tue Jan 17 10:04:17 2023] INFO: task systemd:1 blocked for more than 120 seconds.
[Tue Jan 17 10:04:17 2023]       Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 10:04:17 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jan 17 10:04:17 2023] task:systemd         state:D stack:0     pid:1     ppid:0      flags:0x00000004
[Tue Jan 17 10:04:17 2023] Call trace:
[Tue Jan 17 10:04:17 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 10:04:17 2023]  __schedule+0x2d8/0x9d0
[Tue Jan 17 10:04:17 2023]  schedule+0x60/0x100
[Tue Jan 17 10:04:17 2023]  schedule_timeout+0x17c/0x1c0
[Tue Jan 17 10:04:17 2023]  wait_for_completion+0x7c/0x160
[Tue Jan 17 10:04:17 2023]  __wait_rcu_gp+0x15c/0x190
[Tue Jan 17 10:04:17 2023]  synchronize_rcu+0x8c/0x94
[Tue Jan 17 10:04:17 2023]  rcu_sync_enter+0x60/0x100
[Tue Jan 17 10:04:17 2023]  percpu_down_write+0x34/0x1f4
[Tue Jan 17 10:04:17 2023]  cpuset_write_resmask+0x7c/0xa80
[Tue Jan 17 10:04:17 2023]  cgroup_file_write+0xac/0x1b0
[Tue Jan 17 10:04:17 2023]  kernfs_fop_write_iter+0x120/0x1b0
[Tue Jan 17 10:04:17 2023]  vfs_write+0x2b8/0x35c
[Tue Jan 17 10:04:17 2023]  ksys_write+0x70/0x100
[Tue Jan 17 10:04:17 2023]  __arm64_sys_write+0x24/0x30
[Tue Jan 17 10:04:17 2023]  invoke_syscall+0x50/0x120
[Tue Jan 17 10:04:17 2023]  el0_svc_common.constprop.0+0x68/0x124
[Tue Jan 17 10:04:17 2023]  do_el0_svc+0x38/0xdc
[Tue Jan 17 10:04:17 2023]  el0_svc+0x30/0x94
[Tue Jan 17 10:04:17 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 17 10:04:17 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 17 10:04:17 2023] INFO: task exe:2255 blocked for more than 120 seconds.
[Tue Jan 17 10:04:17 2023]       Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 10:04:17 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jan 17 10:04:17 2023] task:exe             state:D stack:0     pid:2255  ppid:2252   flags:0x0000000c
[Tue Jan 17 10:04:17 2023] Call trace:
[Tue Jan 17 10:04:17 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 10:04:17 2023]  __schedule+0x2d8/0x9d0
[Tue Jan 17 10:04:17 2023]  schedule+0x60/0x100
[Tue Jan 17 10:04:17 2023]  synchronize_rcu_expedited+0x420/0x620
[Tue Jan 17 10:04:17 2023]  namespace_unlock+0xc8/0x170
[Tue Jan 17 10:04:17 2023]  path_umount+0x218/0x4e0
[Tue Jan 17 10:04:17 2023]  __arm64_sys_umount+0x8c/0xa0
[Tue Jan 17 10:04:17 2023]  invoke_syscall+0x50/0x120
[Tue Jan 17 10:04:17 2023]  el0_svc_common.constprop.0+0x68/0x124
[Tue Jan 17 10:04:17 2023]  do_el0_svc+0x38/0xdc
[Tue Jan 17 10:04:17 2023]  el0_svc+0x30/0x94
[Tue Jan 17 10:04:17 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 17 10:04:17 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 17 10:05:06 2023] rcu: INFO: rcu_preempt self-detected stall on CPU
[Tue Jan 17 10:05:06 2023] rcu:         2-....: (68249 ticks this GP) idle=dbfc/1/0x4000000000000000 softirq=20995/20996 fqs=34109
[Tue Jan 17 10:05:06 2023]      (t=68262 jiffies g=33425 q=456460 ncpus=4)
[Tue Jan 17 10:05:06 2023] CPU: 2 PID: 39 Comm: kworker/u8:1 Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 10:05:06 2023] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[Tue Jan 17 10:05:06 2023] Workqueue: writeback wb_workfn (flush-0:28)
[Tue Jan 17 10:05:06 2023] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[Tue Jan 17 10:05:06 2023] pc : queued_spin_lock_slowpath+0x78/0x490
[Tue Jan 17 10:05:06 2023] lr : _raw_spin_lock+0x84/0x94
[Tue Jan 17 10:05:06 2023] sp : ffffffc00857b8a0
[Tue Jan 17 10:05:06 2023] x29: ffffffc00857b8a0 x28: ffffff8100b0e460 x27: 0000000000000400
[Tue Jan 17 10:05:06 2023] x26: ffffff816df5f778 x25: 0000000000000000 x24: 000000007fffffff
[Tue Jan 17 10:05:06 2023] x23: ffffff816df5f660 x22: ffffff816df5f664 x21: 0000000000000002
[Tue Jan 17 10:05:06 2023] x20: ffffff81003b9ec0 x19: ffffff816df5f660 x18: 0000000000000000
[Tue Jan 17 10:05:06 2023] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[Tue Jan 17 10:05:06 2023] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[Tue Jan 17 10:05:06 2023] x11: 00000000000000ef x10: 0000000000001a60 x9 : ffffffe00c3d83c4
[Tue Jan 17 10:05:06 2023] x8 : ffffffc00857b758 x7 : 0000000000000000 x6 : 000000205c04e000
[Tue Jan 17 10:05:06 2023] x5 : ffffffe00cbb9000 x4 : ffffffe00cbb90f0 x3 : 0000000000000006
[Tue Jan 17 10:05:06 2023] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000106
[Tue Jan 17 10:05:06 2023] Call trace:
[Tue Jan 17 10:05:06 2023]  queued_spin_lock_slowpath+0x78/0x490
[Tue Jan 17 10:05:06 2023]  _raw_spin_lock+0x84/0x94
[Tue Jan 17 10:05:06 2023]  __mutex_lock.constprop.0+0xa8/0x604
[Tue Jan 17 10:05:06 2023]  __mutex_lock_slowpath+0x1c/0x30
[Tue Jan 17 10:05:06 2023]  mutex_lock+0x60/0x70
[Tue Jan 17 10:05:06 2023]  nfs_scan_commit.part.0.isra.0+0x2c/0xc4
[Tue Jan 17 10:05:06 2023]  __nfs_commit_inode+0xa8/0x1a0
[Tue Jan 17 10:05:06 2023]  nfs_write_inode+0x44/0x9c
[Tue Jan 17 10:05:06 2023]  nfs4_write_inode+0x24/0x60
[Tue Jan 17 10:05:06 2023]  __writeback_single_inode+0x368/0x4b0
[Tue Jan 17 10:05:06 2023]  writeback_sb_inodes+0x214/0x49c
[Tue Jan 17 10:05:06 2023]  __writeback_inodes_wb+0x58/0x110
[Tue Jan 17 10:05:06 2023]  wb_writeback+0x2d4/0x3b0
[Tue Jan 17 10:05:06 2023]  wb_workfn+0x364/0x5b4
[Tue Jan 17 10:05:06 2023]  process_one_work+0x1dc/0x450
[Tue Jan 17 10:05:06 2023]  worker_thread+0x154/0x450
[Tue Jan 17 10:05:06 2023]  kthread+0x104/0x110
[Tue Jan 17 10:05:06 2023]  ret_from_fork+0x10/0x20
[Tue Jan 17 10:05:12 2023] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-.... } 69083 jiffies s: 725 root: 0x4/.
[Tue Jan 17 10:05:12 2023] rcu: blocking rcu_node structures (internal RCU debug):
[Tue Jan 17 10:05:12 2023] Task dump for CPU 2:
[Tue Jan 17 10:05:12 2023] task:kworker/u8:1    state:R  running task     stack:0     pid:39    ppid:2      flags:0x0000000a
[Tue Jan 17 10:05:12 2023] Workqueue: writeback wb_workfn (flush-0:28)
[Tue Jan 17 10:05:12 2023] Call trace:
[Tue Jan 17 10:05:12 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 10:05:12 2023]  0xffffffc00857b8a0
[Tue Jan 17 15:43:53 2023] INFO: task crio:706 blocked for more than 120 seconds.
[Tue Jan 17 15:43:53 2023]       Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 15:43:53 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jan 17 15:43:53 2023] task:crio            state:D stack:0     pid:706   ppid:1      flags:0x0000000c
[Tue Jan 17 15:43:53 2023] Call trace:
[Tue Jan 17 15:43:53 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 15:43:53 2023]  __schedule+0x2d8/0x9d0
[Tue Jan 17 15:43:53 2023]  schedule+0x60/0x100
[Tue Jan 17 15:43:53 2023]  wb_wait_for_completion+0x9c/0xd0
[Tue Jan 17 15:43:53 2023]  sync_inodes_sb+0xb8/0x230
[Tue Jan 17 15:43:53 2023]  sync_filesystem+0x7c/0xc0
[Tue Jan 17 15:43:53 2023]  vfsub_sync_filesystem+0x30/0x50 [aufs]
[Tue Jan 17 15:43:53 2023]  aufs_sync_fs+0x78/0x120 [aufs]
[Tue Jan 17 15:43:53 2023]  sync_filesystem+0x64/0xc0
[Tue Jan 17 15:43:53 2023]  generic_shutdown_super+0x34/0x12c
[Tue Jan 17 15:43:53 2023]  kill_anon_super+0x20/0x40
[Tue Jan 17 15:43:53 2023]  aufs_kill_sb+0x12c/0x230 [aufs]
[Tue Jan 17 15:43:53 2023]  deactivate_locked_super+0x4c/0xec
[Tue Jan 17 15:43:53 2023]  deactivate_super+0x90/0xb0
[Tue Jan 17 15:43:53 2023]  cleanup_mnt+0xa0/0x130
[Tue Jan 17 15:43:53 2023]  __cleanup_mnt+0x1c/0x2c
[Tue Jan 17 15:43:53 2023]  task_work_run+0x88/0xe0
[Tue Jan 17 15:43:53 2023]  do_notify_resume+0x210/0x1260
[Tue Jan 17 15:43:53 2023]  el0_svc+0x80/0x94
[Tue Jan 17 15:43:53 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 17 15:43:53 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 17 15:45:54 2023] INFO: task crio:706 blocked for more than 241 seconds.
[Tue Jan 17 15:45:54 2023]       Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 15:45:54 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jan 17 15:45:54 2023] task:crio            state:D stack:0     pid:706   ppid:1      flags:0x0000000c
[Tue Jan 17 15:45:54 2023] Call trace:
[Tue Jan 17 15:45:54 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 15:45:54 2023]  __schedule+0x2d8/0x9d0
[Tue Jan 17 15:45:54 2023]  schedule+0x60/0x100
[Tue Jan 17 15:45:54 2023]  wb_wait_for_completion+0x9c/0xd0
[Tue Jan 17 15:45:54 2023]  sync_inodes_sb+0xb8/0x230
[Tue Jan 17 15:45:54 2023]  sync_filesystem+0x7c/0xc0
[Tue Jan 17 15:45:54 2023]  vfsub_sync_filesystem+0x30/0x50 [aufs]
[Tue Jan 17 15:45:54 2023]  aufs_sync_fs+0x78/0x120 [aufs]
[Tue Jan 17 15:45:54 2023]  sync_filesystem+0x64/0xc0
[Tue Jan 17 15:45:54 2023]  generic_shutdown_super+0x34/0x12c
[Tue Jan 17 15:45:54 2023]  kill_anon_super+0x20/0x40
[Tue Jan 17 15:45:54 2023]  aufs_kill_sb+0x12c/0x230 [aufs]
[Tue Jan 17 15:45:54 2023]  deactivate_locked_super+0x4c/0xec
[Tue Jan 17 15:45:54 2023]  deactivate_super+0x90/0xb0
[Tue Jan 17 15:45:54 2023]  cleanup_mnt+0xa0/0x130
[Tue Jan 17 15:45:54 2023]  __cleanup_mnt+0x1c/0x2c
[Tue Jan 17 15:45:54 2023]  task_work_run+0x88/0xe0
[Tue Jan 17 15:45:54 2023]  do_notify_resume+0x210/0x1260
[Tue Jan 17 15:45:54 2023]  el0_svc+0x80/0x94
[Tue Jan 17 15:45:54 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 17 15:45:54 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 17 15:47:55 2023] INFO: task crio:706 blocked for more than 362 seconds.
[Tue Jan 17 15:47:55 2023]       Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 15:47:55 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jan 17 15:47:55 2023] task:crio            state:D stack:0     pid:706   ppid:1      flags:0x0000000c
[Tue Jan 17 15:47:55 2023] Call trace:
[Tue Jan 17 15:47:55 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 15:47:55 2023]  __schedule+0x2d8/0x9d0
[Tue Jan 17 15:47:55 2023]  schedule+0x60/0x100
[Tue Jan 17 15:47:55 2023]  wb_wait_for_completion+0x9c/0xd0
[Tue Jan 17 15:47:55 2023]  sync_inodes_sb+0xb8/0x230
[Tue Jan 17 15:47:55 2023]  sync_filesystem+0x7c/0xc0
[Tue Jan 17 15:47:55 2023]  vfsub_sync_filesystem+0x30/0x50 [aufs]
[Tue Jan 17 15:47:55 2023]  aufs_sync_fs+0x78/0x120 [aufs]
[Tue Jan 17 15:47:55 2023]  sync_filesystem+0x64/0xc0
[Tue Jan 17 15:47:55 2023]  generic_shutdown_super+0x34/0x12c
[Tue Jan 17 15:47:55 2023]  kill_anon_super+0x20/0x40
[Tue Jan 17 15:47:55 2023]  aufs_kill_sb+0x12c/0x230 [aufs]
[Tue Jan 17 15:47:55 2023]  deactivate_locked_super+0x4c/0xec
[Tue Jan 17 15:47:55 2023]  deactivate_super+0x90/0xb0
[Tue Jan 17 15:47:55 2023]  cleanup_mnt+0xa0/0x130
[Tue Jan 17 15:47:55 2023]  __cleanup_mnt+0x1c/0x2c
[Tue Jan 17 15:47:55 2023]  task_work_run+0x88/0xe0
[Tue Jan 17 15:47:55 2023]  do_notify_resume+0x210/0x1260
[Tue Jan 17 15:47:55 2023]  el0_svc+0x80/0x94
[Tue Jan 17 15:47:55 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 17 15:47:55 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 17 15:49:56 2023] INFO: task crio:706 blocked for more than 483 seconds.
[Tue Jan 17 15:49:56 2023]       Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 15:49:56 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jan 17 15:49:56 2023] task:crio            state:D stack:0     pid:706   ppid:1      flags:0x0000000c
[Tue Jan 17 15:49:56 2023] Call trace:
[Tue Jan 17 15:49:56 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 15:49:56 2023]  __schedule+0x2d8/0x9d0
[Tue Jan 17 15:49:56 2023]  schedule+0x60/0x100
[Tue Jan 17 15:49:56 2023]  wb_wait_for_completion+0x9c/0xd0
[Tue Jan 17 15:49:56 2023]  sync_inodes_sb+0xb8/0x230
[Tue Jan 17 15:49:56 2023]  sync_filesystem+0x7c/0xc0
[Tue Jan 17 15:49:56 2023]  vfsub_sync_filesystem+0x30/0x50 [aufs]
[Tue Jan 17 15:49:56 2023]  aufs_sync_fs+0x78/0x120 [aufs]
[Tue Jan 17 15:49:56 2023]  sync_filesystem+0x64/0xc0
[Tue Jan 17 15:49:56 2023]  generic_shutdown_super+0x34/0x12c
[Tue Jan 17 15:49:56 2023]  kill_anon_super+0x20/0x40
[Tue Jan 17 15:49:56 2023]  aufs_kill_sb+0x12c/0x230 [aufs]
[Tue Jan 17 15:49:56 2023]  deactivate_locked_super+0x4c/0xec
[Tue Jan 17 15:49:56 2023]  deactivate_super+0x90/0xb0
[Tue Jan 17 15:49:56 2023]  cleanup_mnt+0xa0/0x130
[Tue Jan 17 15:49:56 2023]  __cleanup_mnt+0x1c/0x2c
[Tue Jan 17 15:49:56 2023]  task_work_run+0x88/0xe0
[Tue Jan 17 15:49:56 2023]  do_notify_resume+0x210/0x1260
[Tue Jan 17 15:49:56 2023]  el0_svc+0x80/0x94
[Tue Jan 17 15:49:56 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 17 15:49:56 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 17 15:51:56 2023] INFO: task crio:706 blocked for more than 604 seconds.
[Tue Jan 17 15:51:56 2023]       Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 15:51:57 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jan 17 15:51:57 2023] task:crio            state:D stack:0     pid:706   ppid:1      flags:0x0000000c
[Tue Jan 17 15:51:57 2023] Call trace:
[Tue Jan 17 15:51:57 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 15:51:57 2023]  __schedule+0x2d8/0x9d0
[Tue Jan 17 15:51:57 2023]  schedule+0x60/0x100
[Tue Jan 17 15:51:57 2023]  wb_wait_for_completion+0x9c/0xd0
[Tue Jan 17 15:51:57 2023]  sync_inodes_sb+0xb8/0x230
[Tue Jan 17 15:51:57 2023]  sync_filesystem+0x7c/0xc0
[Tue Jan 17 15:51:57 2023]  vfsub_sync_filesystem+0x30/0x50 [aufs]
[Tue Jan 17 15:51:57 2023]  aufs_sync_fs+0x78/0x120 [aufs]
[Tue Jan 17 15:51:57 2023]  sync_filesystem+0x64/0xc0
[Tue Jan 17 15:51:57 2023]  generic_shutdown_super+0x34/0x12c
[Tue Jan 17 15:51:57 2023]  kill_anon_super+0x20/0x40
[Tue Jan 17 15:51:57 2023]  aufs_kill_sb+0x12c/0x230 [aufs]
[Tue Jan 17 15:51:57 2023]  deactivate_locked_super+0x4c/0xec
[Tue Jan 17 15:51:57 2023]  deactivate_super+0x90/0xb0
[Tue Jan 17 15:51:57 2023]  cleanup_mnt+0xa0/0x130
[Tue Jan 17 15:51:57 2023]  __cleanup_mnt+0x1c/0x2c
[Tue Jan 17 15:51:57 2023]  task_work_run+0x88/0xe0
[Tue Jan 17 15:51:57 2023]  do_notify_resume+0x210/0x1260
[Tue Jan 17 15:51:57 2023]  el0_svc+0x80/0x94
[Tue Jan 17 15:51:57 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 17 15:51:57 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 17 15:53:57 2023] INFO: task crio:706 blocked for more than 724 seconds.
[Tue Jan 17 15:53:57 2023]       Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 15:53:57 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jan 17 15:53:57 2023] task:crio            state:D stack:0     pid:706   ppid:1      flags:0x0000000c
[Tue Jan 17 15:53:57 2023] Call trace:
[Tue Jan 17 15:53:57 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 15:53:57 2023]  __schedule+0x2d8/0x9d0
[Tue Jan 17 15:53:57 2023]  schedule+0x60/0x100
[Tue Jan 17 15:53:57 2023]  wb_wait_for_completion+0x9c/0xd0
[Tue Jan 17 15:53:57 2023]  sync_inodes_sb+0xb8/0x230
[Tue Jan 17 15:53:57 2023]  sync_filesystem+0x7c/0xc0
[Tue Jan 17 15:53:57 2023]  vfsub_sync_filesystem+0x30/0x50 [aufs]
[Tue Jan 17 15:53:57 2023]  aufs_sync_fs+0x78/0x120 [aufs]
[Tue Jan 17 15:53:57 2023]  sync_filesystem+0x64/0xc0
[Tue Jan 17 15:53:57 2023]  generic_shutdown_super+0x34/0x12c
[Tue Jan 17 15:53:57 2023]  kill_anon_super+0x20/0x40
[Tue Jan 17 15:53:57 2023]  aufs_kill_sb+0x12c/0x230 [aufs]
[Tue Jan 17 15:53:57 2023]  deactivate_locked_super+0x4c/0xec
[Tue Jan 17 15:53:57 2023]  deactivate_super+0x90/0xb0
[Tue Jan 17 15:53:57 2023]  cleanup_mnt+0xa0/0x130
[Tue Jan 17 15:53:57 2023]  __cleanup_mnt+0x1c/0x2c
[Tue Jan 17 15:53:57 2023]  task_work_run+0x88/0xe0
[Tue Jan 17 15:53:57 2023]  do_notify_resume+0x210/0x1260
[Tue Jan 17 15:53:57 2023]  el0_svc+0x80/0x94
[Tue Jan 17 15:53:57 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 17 15:53:57 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 17 15:55:58 2023] INFO: task crio:706 blocked for more than 845 seconds.
[Tue Jan 17 15:55:58 2023]       Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 15:55:58 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jan 17 15:55:58 2023] task:crio            state:D stack:0     pid:706   ppid:1      flags:0x0000000c
[Tue Jan 17 15:55:58 2023] Call trace:
[Tue Jan 17 15:55:58 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 15:55:58 2023]  __schedule+0x2d8/0x9d0
[Tue Jan 17 15:55:58 2023]  schedule+0x60/0x100
[Tue Jan 17 15:55:58 2023]  wb_wait_for_completion+0x9c/0xd0
[Tue Jan 17 15:55:58 2023]  sync_inodes_sb+0xb8/0x230
[Tue Jan 17 15:55:58 2023]  sync_filesystem+0x7c/0xc0
[Tue Jan 17 15:55:58 2023]  vfsub_sync_filesystem+0x30/0x50 [aufs]
[Tue Jan 17 15:55:58 2023]  aufs_sync_fs+0x78/0x120 [aufs]
[Tue Jan 17 15:55:58 2023]  sync_filesystem+0x64/0xc0
[Tue Jan 17 15:55:58 2023]  generic_shutdown_super+0x34/0x12c
[Tue Jan 17 15:55:58 2023]  kill_anon_super+0x20/0x40
[Tue Jan 17 15:55:58 2023]  aufs_kill_sb+0x12c/0x230 [aufs]
[Tue Jan 17 15:55:58 2023]  deactivate_locked_super+0x4c/0xec
[Tue Jan 17 15:55:58 2023]  deactivate_super+0x90/0xb0
[Tue Jan 17 15:55:58 2023]  cleanup_mnt+0xa0/0x130
[Tue Jan 17 15:55:58 2023]  __cleanup_mnt+0x1c/0x2c
[Tue Jan 17 15:55:58 2023]  task_work_run+0x88/0xe0
[Tue Jan 17 15:55:58 2023]  do_notify_resume+0x210/0x1260
[Tue Jan 17 15:55:58 2023]  el0_svc+0x80/0x94
[Tue Jan 17 15:55:58 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 17 15:55:58 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 17 15:57:59 2023] INFO: task crio:706 blocked for more than 966 seconds.
[Tue Jan 17 15:57:59 2023]       Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 15:57:59 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jan 17 15:57:59 2023] task:crio            state:D stack:0     pid:706   ppid:1      flags:0x0000000c
[Tue Jan 17 15:57:59 2023] Call trace:
[Tue Jan 17 15:57:59 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 15:57:59 2023]  __schedule+0x2d8/0x9d0
[Tue Jan 17 15:57:59 2023]  schedule+0x60/0x100
[Tue Jan 17 15:57:59 2023]  wb_wait_for_completion+0x9c/0xd0
[Tue Jan 17 15:57:59 2023]  sync_inodes_sb+0xb8/0x230
[Tue Jan 17 15:57:59 2023]  sync_filesystem+0x7c/0xc0
[Tue Jan 17 15:57:59 2023]  vfsub_sync_filesystem+0x30/0x50 [aufs]
[Tue Jan 17 15:57:59 2023]  aufs_sync_fs+0x78/0x120 [aufs]
[Tue Jan 17 15:57:59 2023]  sync_filesystem+0x64/0xc0
[Tue Jan 17 15:57:59 2023]  generic_shutdown_super+0x34/0x12c
[Tue Jan 17 15:57:59 2023]  kill_anon_super+0x20/0x40
[Tue Jan 17 15:57:59 2023]  aufs_kill_sb+0x12c/0x230 [aufs]
[Tue Jan 17 15:57:59 2023]  deactivate_locked_super+0x4c/0xec
[Tue Jan 17 15:57:59 2023]  deactivate_super+0x90/0xb0
[Tue Jan 17 15:57:59 2023]  cleanup_mnt+0xa0/0x130
[Tue Jan 17 15:57:59 2023]  __cleanup_mnt+0x1c/0x2c
[Tue Jan 17 15:57:59 2023]  task_work_run+0x88/0xe0
[Tue Jan 17 15:57:59 2023]  do_notify_resume+0x210/0x1260
[Tue Jan 17 15:57:59 2023]  el0_svc+0x80/0x94
[Tue Jan 17 15:57:59 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 17 15:57:59 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 17 16:00:00 2023] INFO: task crio:706 blocked for more than 1087 seconds.
[Tue Jan 17 16:00:00 2023]       Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 16:00:00 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jan 17 16:00:00 2023] task:crio            state:D stack:0     pid:706   ppid:1      flags:0x0000000c
[Tue Jan 17 16:00:00 2023] Call trace:
[Tue Jan 17 16:00:00 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 16:00:00 2023]  __schedule+0x2d8/0x9d0
[Tue Jan 17 16:00:00 2023]  schedule+0x60/0x100
[Tue Jan 17 16:00:00 2023]  wb_wait_for_completion+0x9c/0xd0
[Tue Jan 17 16:00:00 2023]  sync_inodes_sb+0xb8/0x230
[Tue Jan 17 16:00:00 2023]  sync_filesystem+0x7c/0xc0
[Tue Jan 17 16:00:00 2023]  vfsub_sync_filesystem+0x30/0x50 [aufs]
[Tue Jan 17 16:00:00 2023]  aufs_sync_fs+0x78/0x120 [aufs]
[Tue Jan 17 16:00:00 2023]  sync_filesystem+0x64/0xc0
[Tue Jan 17 16:00:00 2023]  generic_shutdown_super+0x34/0x12c
[Tue Jan 17 16:00:00 2023]  kill_anon_super+0x20/0x40
[Tue Jan 17 16:00:00 2023]  aufs_kill_sb+0x12c/0x230 [aufs]
[Tue Jan 17 16:00:00 2023]  deactivate_locked_super+0x4c/0xec
[Tue Jan 17 16:00:00 2023]  deactivate_super+0x90/0xb0
[Tue Jan 17 16:00:00 2023]  cleanup_mnt+0xa0/0x130
[Tue Jan 17 16:00:00 2023]  __cleanup_mnt+0x1c/0x2c
[Tue Jan 17 16:00:00 2023]  task_work_run+0x88/0xe0
[Tue Jan 17 16:00:00 2023]  do_notify_resume+0x210/0x1260
[Tue Jan 17 16:00:00 2023]  el0_svc+0x80/0x94
[Tue Jan 17 16:00:00 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 17 16:00:00 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 17 16:02:01 2023] INFO: task crio:706 blocked for more than 1208 seconds.
[Tue Jan 17 16:02:01 2023]       Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 17 16:02:01 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jan 17 16:02:01 2023] task:crio            state:D stack:0     pid:706   ppid:1      flags:0x0000000c
[Tue Jan 17 16:02:01 2023] Call trace:
[Tue Jan 17 16:02:01 2023]  __switch_to+0xf4/0x170
[Tue Jan 17 16:02:01 2023]  __schedule+0x2d8/0x9d0
[Tue Jan 17 16:02:01 2023]  schedule+0x60/0x100
[Tue Jan 17 16:02:01 2023]  wb_wait_for_completion+0x9c/0xd0
[Tue Jan 17 16:02:01 2023]  sync_inodes_sb+0xb8/0x230
[Tue Jan 17 16:02:01 2023]  sync_filesystem+0x7c/0xc0
[Tue Jan 17 16:02:01 2023]  vfsub_sync_filesystem+0x30/0x50 [aufs]
[Tue Jan 17 16:02:01 2023]  aufs_sync_fs+0x78/0x120 [aufs]
[Tue Jan 17 16:02:01 2023]  sync_filesystem+0x64/0xc0
[Tue Jan 17 16:02:01 2023]  generic_shutdown_super+0x34/0x12c
[Tue Jan 17 16:02:01 2023]  kill_anon_super+0x20/0x40
[Tue Jan 17 16:02:01 2023]  aufs_kill_sb+0x12c/0x230 [aufs]
[Tue Jan 17 16:02:01 2023]  deactivate_locked_super+0x4c/0xec
[Tue Jan 17 16:02:01 2023]  deactivate_super+0x90/0xb0
[Tue Jan 17 16:02:01 2023]  cleanup_mnt+0xa0/0x130
[Tue Jan 17 16:02:01 2023]  __cleanup_mnt+0x1c/0x2c
[Tue Jan 17 16:02:01 2023]  task_work_run+0x88/0xe0
[Tue Jan 17 16:02:01 2023]  do_notify_resume+0x210/0x1260
[Tue Jan 17 16:02:01 2023]  el0_svc+0x80/0x94
[Tue Jan 17 16:02:01 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 17 16:02:01 2023]  el0t_64_sync+0x18c/0x190
# cat /proc/290/stack
[<0>] wait_on_commit+0x88/0xcc
[<0>] __nfs_commit_inode+0x12c/0x1a0
[<0>] nfs_write_inode+0x80/0x9c
[<0>] nfs4_write_inode+0x24/0x60
[<0>] __writeback_single_inode+0x368/0x4b0
[<0>] writeback_sb_inodes+0x214/0x49c
[<0>] wb_writeback+0xf4/0x3b0
[<0>] wb_workfn+0xec/0x5b4
[<0>] process_one_work+0x1dc/0x450
[<0>] worker_thread+0x154/0x450
[<0>] kthread+0x104/0x110
[<0>] ret_from_fork+0x10/0x20

# ps -e -opid,lstart,s,comm,cmd | sort -k7,7 | head
    PID                  STARTED S COMMAND         CMD
    290 Tue Jan 17 15:41:36 2023 D kworker/u8:8+fl [kworker/u8:8+flush-0:28]
      3 Tue Jan 17 15:41:24 2023 I rcu_gp          [rcu_gp]
      4 Tue Jan 17 15:41:24 2023 I rcu_par_gp      [rcu_par_gp]
      5 Tue Jan 17 15:41:24 2023 I slub_flushwq    [slub_flushwq]
      6 Tue Jan 17 15:41:24 2023 I netns           [netns]
      8 Tue Jan 17 15:41:24 2023 I kworker/0:0H-ev [kworker/0:0H-events_highpri]
      9 Tue Jan 17 15:41:24 2023 I kworker/u8:0-nf [kworker/u8:0-nfsiod]
     10 Tue Jan 17 15:41:24 2023 I mm_percpu_wq    [mm_percpu_wq]
     11 Tue Jan 17 15:41:24 2023 I rcu_tasks_kthre [rcu_tasks_kthread]

Let me know what other information you need from me.

sfjro commented 1 year ago

Hello Cory,

Thanx for the report.

Cory Coager:

I'm seeing kernel crashes on 5.15, 6.0 and 6.1. I'm using Raspberry Pi's on NFS netboot with AUFS storage on Crio/Kubernetes.

These logs show that nfs4 is stuck waiting on a mutex during writeback.

So the point is why nfs4 got stuck on that mutex. I see mutex_lock(&NFS_I(cinfo->inode)->commit_mutex); in v6.1 fs/nfs/write.c:nfs_scan_commit(), and I guess this is the point.

I don't think anything other than nfs4 touches this commit_mutex, and I believe nfs4 is the thing to investigate first. If you can, try writing a small program to issue 'syncfs("/path/to/your/nfs4/branch")' and see what will happen.
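
For reference, a minimal sketch of such a test program (my own illustration, not code from this thread): syncfs(2) takes a file descriptor, so the branch directory is opened first, and the default path below is only a placeholder.

/* syncfs-test.c - open an nfs4 branch directory and call syncfs(2) on it */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    /* hypothetical placeholder path; pass the real branch as the first argument */
    const char *path = argc > 1 ? argv[1] : "/path/to/your/nfs4/branch";
    int fd = open(path, O_RDONLY | O_DIRECTORY);

    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }
    /* syncfs() flushes the whole filesystem containing the opened directory */
    if (syncfs(fd) < 0) {
        perror("syncfs");
        close(fd);
        return EXIT_FAILURE;
    }
    close(fd);
    puts("syncfs done");
    return 0;
}

Building it with something like 'gcc -Wall -o syncfs-test syncfs-test.c' and running it against the nfs4 branch should show whether syncfs alone reproduces the stall, independent of aufs.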

Let me know what other information you need from me.

I want these.

(from aufs README file) When you have any problems or strange behaviour in aufs, please let me know with:

Also it MAY help a little to press MagicSysRq + d, l, and t. It is just a little help since you already gathered the stack traces, but 'd' may be able to help us.
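
As a sketch (assuming SysRq is enabled in your kernel), the same keys can also be triggered through /proc/sysrq-trigger from a shell instead of the keyboard; 'd' shows held locks (only available with lockdep enabled), 'l' backtraces all active CPUs, and 't' dumps task states:

echo 1 > /proc/sys/kernel/sysrq      # enable all SysRq functions
echo d > /proc/sysrq-trigger         # show all held locks (requires lockdep)
echo l > /proc/sysrq-trigger         # backtrace of all active CPUs
echo t > /proc/sysrq-trigger         # dump task states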

J. R. Okajima

sfjro commented 1 year ago

Cory Coager:

I'm seeing kernel crashes on 5.15, 6.0 and 6.1. I'm using Raspberry Pi's on NFS netboot with AUFS storage on Crio/Kubernetes.

I might not get the point correctly. Do you mean the first bad thing is this, and all stack traces showed up much later?

[Thu Dec  8 19:12:39 2022] Unable to handle kernel paging request at virtual address 0001010201019004
[Thu Dec  8 19:12:39 2022] Mem abort info:
[Thu Dec  8 19:12:39 2022]   ESR = 0x0000000096000004
[Thu Dec  8 19:12:39 2022]   EC = 0x25: DABT (current EL), IL = 32 bits
[Thu Dec  8 19:12:39 2022]   SET = 0, FnV = 0
[Thu Dec  8 19:12:39 2022]   EA = 0, S1PTW = 0
[Thu Dec  8 19:12:39 2022]   FSC = 0x04: level 0 translation fault
[Thu Dec  8 19:12:39 2022] Data abort info:
[Thu Dec  8 19:12:39 2022]   ISV = 0, ISS = 0x00000004
[Thu Dec  8 19:12:39 2022]   CM = 0, WnR = 0
[Thu Dec  8 19:12:39 2022] [0001010201019004] address between user and kernel address ranges
[Thu Dec  8 19:12:39 2022] Internal error: Oops: 96000004 [#1] PREEMPT SMP

Then the important thing is this paging issue rather than the stack traces. Who (I mean which process or what operation) triggered this log?

J. R. Okajima

ccoager commented 1 year ago

Attachments: config-6.1.5-v8+.txt, proc-mounts.txt, sys-module-aufs.txt, sys-fs-aufs.txt

ccoager commented 1 year ago

Caught a crash tonight...

[Tue Jan 24 19:21:10 2023] Unable to handle kernel paging request at virtual address 000000a2b9400394
[Tue Jan 24 19:21:10 2023] Mem abort info:
[Tue Jan 24 19:21:10 2023]   ESR = 0x0000000096000004
[Tue Jan 24 19:21:10 2023]   EC = 0x25: DABT (current EL), IL = 32 bits
[Tue Jan 24 19:21:10 2023]   SET = 0, FnV = 0
[Tue Jan 24 19:21:10 2023]   EA = 0, S1PTW = 0
[Tue Jan 24 19:21:10 2023]   FSC = 0x04: level 0 translation fault
[Tue Jan 24 19:21:10 2023] Data abort info:
[Tue Jan 24 19:21:10 2023]   ISV = 0, ISS = 0x00000004
[Tue Jan 24 19:21:10 2023]   CM = 0, WnR = 0
[Tue Jan 24 19:21:10 2023] [000000a2b9400394] address between user and kernel address ranges
[Tue Jan 24 19:21:10 2023] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[Tue Jan 24 19:21:10 2023] Modules linked in: xt_multiport xt_set ipt_rpfilter ip_set_hash_ip ip_set_hash_net ipip tunnel4 ip_tunnel wireguard libchacha20poly1305 chacha_neon poly1305_neon ip6_udp_tunnel udp_tunnel libcurve25519_generic libchacha veth nf_conntrack_netlink xt_statistic xt_nat xt_addrtype ipt_REJECT nf_reject_ipv4 xt_tcpudp ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs rpcsec_gss_krb5 xt_MASQUERADE nft_chain_nat nf_nat xt_mark xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 binfmt_misc xt_comment nft_compat nf_tables nfnetlink aufs vc4 snd_soc_hdmi_codec drm_display_helper cec drm_dma_helper joydev drm_kms_helper snd_soc_core brcmfmac brcmutil snd_compress snd_bcm2835(C) snd_pcm_dmaengine snd_pcm cfg80211 snd_timer rfkill rpivid_hevc(C) bcm2835_codec(C) bcm2835_isp(C) bcm2835_v4l2(C) bcm2835_mmal_vchiq(C) v4l2_mem2mem v3d videobuf2_dma_contig snd videobuf2_vmalloc videobuf2_memops pwm_raspberrypi_poe videobuf2_v4l2 raspberrypi_hwmon videobuf2_common vc_sm_cma(C) syscopyarea gpu_sched
[Tue Jan 24 19:21:10 2023]  sysfillrect videodev sysimgblt mc fb_sys_fops drm_shmem_helper uio_pdrv_genirq nvmem_rmem uio pwm_fan sch_fq_codel br_netfilter bridge stp llc drm fuse drm_panel_orientation_quirks backlight ip_tables x_tables ipv6 spidev i2c_bcm2835 spi_bcm2835
[Tue Jan 24 19:21:10 2023] CPU: 3 PID: 47 Comm: kworker/u8:2 Tainted: G         C         6.1.5-v8+ #8
[Tue Jan 24 19:21:10 2023] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[Tue Jan 24 19:21:10 2023] Workqueue: writeback wb_workfn (flush-0:28)
[Tue Jan 24 19:21:10 2023] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[Tue Jan 24 19:21:10 2023] pc : __mutex_lock.constprop.0+0x80/0x604
[Tue Jan 24 19:21:10 2023] lr : __mutex_lock.constprop.0+0x48/0x604
[Tue Jan 24 19:21:10 2023] sp : ffffffc0085bb970
[Tue Jan 24 19:21:10 2023] x29: ffffffc0085bb970 x28: ffffff81026c4460 x27: 7fffffffffffffff
[Tue Jan 24 19:21:10 2023] x26: ffffff8164674108 x25: 0000000000000000 x24: 000000007fffffff
[Tue Jan 24 19:21:10 2023] x23: 0000000000000000 x22: ffffff8164674080 x21: 0000000000000002
[Tue Jan 24 19:21:10 2023] x20: ffffffc0085bba70 x19: ffffff8164673fe8 x18: 00000000dc6ea76a
[Tue Jan 24 19:21:10 2023] x17: 0000000000000001 x16: 0000a666386d735e x15: 022bf2782e921fd2
[Tue Jan 24 19:21:10 2023] x14: ffffffdce780bbe8 x13: 000000000204f835 x12: 0000000000000000
[Tue Jan 24 19:21:10 2023] x11: ffffffffffffffc0 x10: 0000000000000001 x9 : ffffffdce77d232c
[Tue Jan 24 19:21:10 2023] x8 : ffffffc0085bba48 x7 : 0000000000000000 x6 : 0000000000000000
[Tue Jan 24 19:21:10 2023] x5 : ffffffc0085bba70 x4 : ffffff8100b70000 x3 : 350000a2b9400360
[Tue Jan 24 19:21:10 2023] x2 : ffffff8100b70000 x1 : 350000a2b9400362 x0 : 350000a2b9400360
[Tue Jan 24 19:21:10 2023] Call trace:
[Tue Jan 24 19:21:10 2023]  __mutex_lock.constprop.0+0x80/0x604
[Tue Jan 24 19:21:10 2023]  __mutex_lock_slowpath+0x1c/0x30
[Tue Jan 24 19:21:10 2023]  mutex_lock+0x60/0x70
[Tue Jan 24 19:21:10 2023]  nfs_scan_commit.part.0.isra.0+0x2c/0xc4
[Tue Jan 24 19:21:10 2023]  __nfs_commit_inode+0xa8/0x1a0
[Tue Jan 24 19:21:10 2023]  nfs_write_inode+0x44/0x9c
[Tue Jan 24 19:21:10 2023]  nfs4_write_inode+0x24/0x60
[Tue Jan 24 19:21:10 2023]  __writeback_single_inode+0x368/0x4b0
[Tue Jan 24 19:21:10 2023]  writeback_sb_inodes+0x214/0x49c
[Tue Jan 24 19:21:10 2023]  wb_writeback+0xf4/0x3b0
[Tue Jan 24 19:21:10 2023]  wb_workfn+0xec/0x5b4
[Tue Jan 24 19:21:10 2023]  process_one_work+0x1dc/0x450
[Tue Jan 24 19:21:10 2023]  worker_thread+0x154/0x450
[Tue Jan 24 19:21:10 2023]  kthread+0x104/0x110
[Tue Jan 24 19:21:10 2023]  ret_from_fork+0x10/0x20
[Tue Jan 24 19:21:10 2023] Code: 540000c1 f9400260 f27df000 54001f20 (b9403401)
[Tue Jan 24 19:21:10 2023] ---[ end trace 0000000000000000 ]---
[Tue Jan 24 19:21:10 2023] note: kworker/u8:2[47] exited with preempt_count 1
[Tue Jan 24 19:21:10 2023] ------------[ cut here ]------------
[Tue Jan 24 19:21:10 2023] WARNING: CPU: 0 PID: 47 at kernel/exit.c:765 do_exit+0x828/0x9e0
[Tue Jan 24 19:21:10 2023] Modules linked in: xt_multiport xt_set ipt_rpfilter ip_set_hash_ip ip_set_hash_net ipip tunnel4 ip_tunnel wireguard libchacha20poly1305 chacha_neon poly1305_neon ip6_udp_tunnel udp_tunnel libcurve25519_generic libchacha veth nf_conntrack_netlink xt_statistic xt_nat xt_addrtype ipt_REJECT nf_reject_ipv4 xt_tcpudp ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs rpcsec_gss_krb5 xt_MASQUERADE nft_chain_nat nf_nat xt_mark xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 binfmt_misc xt_comment nft_compat nf_tables nfnetlink aufs vc4 snd_soc_hdmi_codec drm_display_helper cec drm_dma_helper joydev drm_kms_helper snd_soc_core brcmfmac brcmutil snd_compress snd_bcm2835(C) snd_pcm_dmaengine snd_pcm cfg80211 snd_timer rfkill rpivid_hevc(C) bcm2835_codec(C) bcm2835_isp(C) bcm2835_v4l2(C) bcm2835_mmal_vchiq(C) v4l2_mem2mem v3d videobuf2_dma_contig snd videobuf2_vmalloc videobuf2_memops pwm_raspberrypi_poe videobuf2_v4l2 raspberrypi_hwmon videobuf2_common vc_sm_cma(C) syscopyarea gpu_sched
[Tue Jan 24 19:21:10 2023]  sysfillrect videodev sysimgblt mc fb_sys_fops drm_shmem_helper uio_pdrv_genirq nvmem_rmem uio pwm_fan sch_fq_codel br_netfilter bridge stp llc drm fuse drm_panel_orientation_quirks backlight ip_tables x_tables ipv6 spidev i2c_bcm2835 spi_bcm2835
[Tue Jan 24 19:21:11 2023] CPU: 0 PID: 47 Comm: kworker/u8:2 Tainted: G      D  C         6.1.5-v8+ #8
[Tue Jan 24 19:21:11 2023] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[Tue Jan 24 19:21:11 2023] Workqueue: writeback wb_workfn (flush-0:28)
[Tue Jan 24 19:21:11 2023] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[Tue Jan 24 19:21:11 2023] pc : do_exit+0x828/0x9e0
[Tue Jan 24 19:21:11 2023] lr : do_exit+0x64/0x9e0
[Tue Jan 24 19:21:11 2023] sp : ffffffc0085bb5b0
[Tue Jan 24 19:21:11 2023] x29: ffffffc0085bb5b0 x28: ffffffc0085bb6c3 x27: ffffffdce7a468a0
[Tue Jan 24 19:21:11 2023] x26: ffffffdce7a46898 x25: 0000000000000001 x24: ffffffdce77d2374
[Tue Jan 24 19:21:11 2023] x23: 0000000000000000 x22: 000000000000000b x21: ffffff8100b78000
[Tue Jan 24 19:21:11 2023] x20: ffffff8100b80000 x19: ffffff8100b70000 x18: 0000000000000000
[Tue Jan 24 19:21:11 2023] x17: 0000000000000000 x16: 0000000000000000 x15: ffffff81fefd10c0
[Tue Jan 24 19:21:11 2023] x14: 0000000000000000 x13: 0000000000000001 x12: 0000000000000000
[Tue Jan 24 19:21:11 2023] x11: 0000000000000000 x10: 0000000000001a60 x9 : ffffffdce77d8018
[Tue Jan 24 19:21:11 2023] x8 : ffffffc0085bb3e8 x7 : 0000000000000000 x6 : 0000000e31c6e800
[Tue Jan 24 19:21:11 2023] x5 : ffffffdce7fb9000 x4 : ffffffdce7fb90f0 x3 : 0000000000000000
[Tue Jan 24 19:21:11 2023] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffffc0085bbca0
[Tue Jan 24 19:21:11 2023] Call trace:
[Tue Jan 24 19:21:11 2023]  do_exit+0x828/0x9e0
[Tue Jan 24 19:21:11 2023]  make_task_dead+0x64/0x110
[Tue Jan 24 19:21:11 2023]  die+0x1dc/0x218
[Tue Jan 24 19:21:11 2023]  die_kernel_fault+0x280/0x338
[Tue Jan 24 19:21:11 2023]  __do_kernel_fault+0x120/0x1c0
[Tue Jan 24 19:21:11 2023]  do_translation_fault+0x58/0xdc
[Tue Jan 24 19:21:11 2023]  do_mem_abort+0x4c/0xa0
[Tue Jan 24 19:21:11 2023]  el1_abort+0x44/0x74
[Tue Jan 24 19:21:11 2023]  el1h_64_sync_handler+0xd8/0xe4
[Tue Jan 24 19:21:11 2023]  el1h_64_sync+0x64/0x68
[Tue Jan 24 19:21:11 2023]  __mutex_lock.constprop.0+0x80/0x604
[Tue Jan 24 19:21:11 2023]  __mutex_lock_slowpath+0x1c/0x30
[Tue Jan 24 19:21:11 2023]  mutex_lock+0x60/0x70
[Tue Jan 24 19:21:11 2023]  nfs_scan_commit.part.0.isra.0+0x2c/0xc4
[Tue Jan 24 19:21:11 2023]  __nfs_commit_inode+0xa8/0x1a0
[Tue Jan 24 19:21:11 2023]  nfs_write_inode+0x44/0x9c
[Tue Jan 24 19:21:11 2023]  nfs4_write_inode+0x24/0x60
[Tue Jan 24 19:21:11 2023]  __writeback_single_inode+0x368/0x4b0
[Tue Jan 24 19:21:11 2023]  writeback_sb_inodes+0x214/0x49c
[Tue Jan 24 19:21:11 2023]  wb_writeback+0xf4/0x3b0
[Tue Jan 24 19:21:11 2023]  wb_workfn+0xec/0x5b4
[Tue Jan 24 19:21:11 2023]  process_one_work+0x1dc/0x450
[Tue Jan 24 19:21:11 2023]  worker_thread+0x154/0x450
[Tue Jan 24 19:21:11 2023]  kthread+0x104/0x110
[Tue Jan 24 19:21:11 2023]  ret_from_fork+0x10/0x20
[Tue Jan 24 19:21:11 2023] ---[ end trace 0000000000000000 ]---

ccoager commented 1 year ago

show-backtrace-all-active-cpus:

[Tue Jan 24 19:33:12 2023] sysrq: Show backtrace of all active CPUs
[Tue Jan 24 19:33:12 2023] BUG: using smp_processor_id() in preemptible [00000000] code: bash/2263
[Tue Jan 24 19:33:12 2023] caller is debug_smp_processor_id+0x20/0x2c
[Tue Jan 24 19:33:12 2023] CPU: 2 PID: 2263 Comm: bash Tainted: G      D WC         6.1.5-v8+ #8
[Tue Jan 24 19:33:12 2023] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[Tue Jan 24 19:33:12 2023] Call trace:
[Tue Jan 24 19:33:12 2023]  dump_backtrace.part.0+0xe8/0xf4
[Tue Jan 24 19:33:12 2023]  show_stack+0x20/0x30
[Tue Jan 24 19:33:12 2023]  dump_stack_lvl+0x8c/0xb8
[Tue Jan 24 19:33:12 2023]  dump_stack+0x18/0x34
[Tue Jan 24 19:33:12 2023]  check_preemption_disabled+0x118/0x124
[Tue Jan 24 19:33:12 2023]  debug_smp_processor_id+0x20/0x2c
[Tue Jan 24 19:33:12 2023]  sysrq_handle_showallcpus+0x28/0xc4
[Tue Jan 24 19:33:12 2023]  __handle_sysrq+0x94/0x1a0
[Tue Jan 24 19:33:12 2023]  write_sysrq_trigger+0x7c/0xa0
[Tue Jan 24 19:33:12 2023]  proc_reg_write+0xac/0x100
[Tue Jan 24 19:33:12 2023]  vfs_write+0xd8/0x35c
[Tue Jan 24 19:33:12 2023]  ksys_write+0x70/0x100
[Tue Jan 24 19:33:12 2023]  __arm64_sys_write+0x24/0x30
[Tue Jan 24 19:33:12 2023]  invoke_syscall+0x50/0x120
[Tue Jan 24 19:33:12 2023]  el0_svc_common.constprop.0+0x68/0x124
[Tue Jan 24 19:33:12 2023]  do_el0_svc+0x38/0xdc
[Tue Jan 24 19:33:12 2023]  el0_svc+0x30/0x94
[Tue Jan 24 19:33:12 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 24 19:33:12 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 24 19:33:12 2023] sysrq: CPU2:
[Tue Jan 24 19:33:12 2023] Call trace:
[Tue Jan 24 19:33:12 2023]  dump_backtrace.part.0+0xe8/0xf4
[Tue Jan 24 19:33:12 2023]  show_stack+0x20/0x30
[Tue Jan 24 19:33:12 2023]  sysrq_handle_showallcpus+0x4c/0xc4
[Tue Jan 24 19:33:12 2023]  __handle_sysrq+0x94/0x1a0
[Tue Jan 24 19:33:12 2023]  write_sysrq_trigger+0x7c/0xa0
[Tue Jan 24 19:33:12 2023]  proc_reg_write+0xac/0x100
[Tue Jan 24 19:33:12 2023]  vfs_write+0xd8/0x35c
[Tue Jan 24 19:33:12 2023]  ksys_write+0x70/0x100
[Tue Jan 24 19:33:12 2023]  __arm64_sys_write+0x24/0x30
[Tue Jan 24 19:33:12 2023]  invoke_syscall+0x50/0x120
[Tue Jan 24 19:33:12 2023]  el0_svc_common.constprop.0+0x68/0x124
[Tue Jan 24 19:33:12 2023]  do_el0_svc+0x38/0xdc
[Tue Jan 24 19:33:12 2023]  el0_svc+0x30/0x94
[Tue Jan 24 19:33:12 2023]  el0t_64_sync_handler+0xbc/0x13c
[Tue Jan 24 19:33:12 2023]  el0t_64_sync+0x18c/0x190
[Tue Jan 24 19:33:12 2023] sysrq: CPU1:
[Tue Jan 24 19:33:12 2023] sysrq: CPU3: backtrace skipped as idling
[Tue Jan 24 19:33:12 2023] Call trace:
[Tue Jan 24 19:33:12 2023]  dump_backtrace.part.0+0xe8/0xf4
[Tue Jan 24 19:33:12 2023]  show_stack+0x20/0x30
[Tue Jan 24 19:33:12 2023]  showacpu+0x60/0x9c
[Tue Jan 24 19:33:12 2023]  __flush_smp_call_function_queue+0xe4/0x260
[Tue Jan 24 19:33:12 2023]  generic_smp_call_function_single_interrupt+0x1c/0x30
[Tue Jan 24 19:33:12 2023]  ipi_handler+0x98/0x300
[Tue Jan 24 19:33:12 2023]  handle_percpu_devid_irq+0xac/0x240
[Tue Jan 24 19:33:12 2023]  generic_handle_domain_irq+0x34/0x50
[Tue Jan 24 19:33:12 2023]  gic_handle_irq+0x4c/0xe0
[Tue Jan 24 19:33:12 2023]  call_on_irq_stack+0x2c/0x60
[Tue Jan 24 19:33:12 2023]  do_interrupt_handler+0xdc/0xe0
[Tue Jan 24 19:33:12 2023]  el0_interrupt+0x50/0x100
[Tue Jan 24 19:33:12 2023]  __el0_irq_handler_common+0x18/0x24
[Tue Jan 24 19:33:12 2023]  el0t_64_irq_handler+0x10/0x20
[Tue Jan 24 19:33:12 2023]  el0t_64_irq+0x18c/0x190
[Tue Jan 24 19:33:12 2023] sysrq: CPU0:
[Tue Jan 24 19:33:12 2023] Call trace:
[Tue Jan 24 19:33:12 2023]  dump_backtrace.part.0+0xe8/0xf4
[Tue Jan 24 19:33:12 2023]  show_stack+0x20/0x30
[Tue Jan 24 19:33:12 2023]  showacpu+0x60/0x9c
[Tue Jan 24 19:33:12 2023]  __flush_smp_call_function_queue+0xe4/0x260
[Tue Jan 24 19:33:12 2023]  generic_smp_call_function_single_interrupt+0x1c/0x30
[Tue Jan 24 19:33:12 2023]  ipi_handler+0x98/0x300
[Tue Jan 24 19:33:12 2023]  handle_percpu_devid_irq+0xac/0x240
[Tue Jan 24 19:33:12 2023]  generic_handle_domain_irq+0x34/0x50
[Tue Jan 24 19:33:12 2023]  gic_handle_irq+0x4c/0xe0
[Tue Jan 24 19:33:12 2023]  call_on_irq_stack+0x2c/0x60
[Tue Jan 24 19:33:12 2023]  do_interrupt_handler+0xdc/0xe0
[Tue Jan 24 19:33:12 2023]  el0_interrupt+0x50/0x100
[Tue Jan 24 19:33:12 2023]  __el0_irq_handler_common+0x18/0x24
[Tue Jan 24 19:33:12 2023]  el0t_64_irq_handler+0x10/0x20
[Tue Jan 24 19:33:12 2023]  el0t_64_irq+0x18c/0x190

ccoager commented 1 year ago

show-task-states: show-task-states.txt

ccoager commented 1 year ago

I might not get the point correctly. Do you mean the first bad thing is this, and all stack traces showed up much later?

The "Unable to handle kernel paging request at virtual address" is a kernel crash and triggers a reboot. The crashes are intermittent but happen often enough that I can reproduce it. I have 6 RPi's and they are all crashing. The crashes seem to happen when the kubelet service first starts and there is a chance that it crashes soon after or doesn't crash at all.

Currently I am using kernel 6.1.5-v8+ built from Git source: https://github.com/raspberrypi/linux

root@artemis:~# uname -a
Linux artemis 6.1.5-v8+ #8 SMP PREEMPT Sat Jan 14 02:27:34 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

root@artemis:~# modinfo aufs
filename:       /lib/modules/6.1.5-v8+/kernel/fs/aufs/aufs.ko.xz
alias:          fs-aufs
version:        6.1-20230109
description:    aufs -- Advanced multi layered unification filesystem
author:         Junjiro R. Okajima <aufs-users@lists.sourceforge.net>
license:        GPL
srcversion:     A880D04257372558DC0FB9B
depends:
intree:         Y
name:           aufs
vermagic:       6.1.5-v8+ SMP preempt modunload modversions aarch64
parm:           brs:use /fs/aufs/si*/brN (int)
parm:           allow_userns:allow unprivileged to mount under userns (bool)

No LSM installed.

I didn't see a Magic Sysrq (d) command.

Let me know if I forgot anything or if you need anything else.

sfjro commented 1 year ago

Cory Coager:

Current I am using kernel 6.1.5-v8+ from Git source: https://github.com/raspberrypi/linux

There are many branches and tags in that repository. Which one are you using?

And thanx for uploading many files.

J. R. Okajima

sfjro commented 1 year ago

Cory Coager:

Let me know if I forgot anything or if you need anything else.

Could you try this?

If you can, try writing a small program to issue 'syncfs("/path/to/your/nfs4/branch")' and see what will happen.

J. R. Okajima

ccoager commented 1 year ago

There are many branches and tags in that repository. Which one are you using?

Branch rpi-6.1.y

sfjro commented 1 year ago

Cory Coager:

Branch rpi-6.1.y

Ok, I've git-pull-ed. The latest rpi-6.1.y is linux-v6.1.8 and yours is v6.1.5. Assuming this difference is not a big deal, I took a glance at nfs4 files and found nothing suspicious.

Unfortunately I cannot use nfs4 on my test environment. I don't know why but nfs4 stopped working since around linux-v5.17.

Honestly speaking, I have nothing left to investigate. If there is one thing left, it is to apply this patch and run 'syncfs("/path/to/your/nfs4/branch")' BEFORE mounting aufs.

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index f41d24b54fd1..f2b42ca1020a 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1073,6 +1073,7 @@ nfs_scan_commit(struct inode *inode, struct list_head *dst,

 	if (!atomic_long_read(&cinfo->mds->ncommit))
 		return 0;
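
To sketch the intended test order (the paths and mount options below are only placeholders, not taken from this thread): rebuild with the patch applied, sync the nfs4 branch, then mount aufs.

# after rebuilding the kernel/nfs module with the patch above applied:
./syncfs-test /path/to/your/nfs4/branch        # the small syncfs() program from earlier
mount -t aufs -o br=/path/to/rw-layer:/path/to/your/nfs4/branch none /mnt/aufs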

J. R. Okajima

ccoager commented 1 year ago

Unfortunately I cannot use nfs4 on my test environment. I don't know why but nfs4 stopped working since around linux-v5.17.

The website says 6.x+ is "supported and fully tested" so I assumed NFSv4 was working. Is NFSv3 supposed to work? Because I am getting the same kernel Oops on NFSv3 (with a slightly different message).

sfjro commented 1 year ago

Cory Coager:

The website says 6.x+ is "supported and fully tested" so I assumed NFSv4 was working. Is NFSv3 supposed to work? Because I am getting the same kernel Oops on NFSv3 (with a slightly different message).

I should have updated it, sorry. As you see, it says "SIGIO and nfs4 don't work expectedly" since v5.18 and some v5.1[05] stable versions. For nfs3, I can test on my side. How does the message differ?

J. R. Okajima

ccoager commented 1 year ago

For nfs3, I can test on my side. How does the message differ?

The message is basically the same but doesn't mention nfs4.

I had the same kernel Oops on kernel 5.15 in my previous testing. Where should I go from here? Try the patch?

sfjro commented 1 year ago

Cory Coager:

I had the same kernel Oops on kernel 5.15 in my previous testing. Where should I go from here? Try the patch?

The patch I sent doesn't fix anything. It prints a debug message if nfs[34] is misbehaving. But the patch plus syncfs(2) on your nfs branch is what I want you to try first.

The writable nfs3 branch always passed my local tests. But I will try rpi-v6.1.y with nfs3 tomorrow.

J. R. Okajima

sfjro commented 1 year ago

"J. R. Okajima":

The writable nfs3 branch always passed my local tests. But I will try rpi-v6.1.y with nfs3 tomorrow.

I tried.

It seems highly possible that rpi-v6.1.y affects the behaviour of nfs, but I couldn't find any suspicious code in rpi-v6.1.y. If you tried syncfs(2) on your nfs branch, the result would be good evidence that you had better ask the nfs and rpi people.

J. R. Okajima

ccoager commented 1 year ago

I recompiled the kernel (6.1.8-v8+) with the latest changes and the NFS patch you gave me. I did a sync before starting the service. Here are the relevant logs. Let me know if you need anything else.

Attachments: kernel-oops-sysrq.txt, config-6.1.8-v8+.txt, proc-mounts.txt, sys-module-aufs.txt, sys-fs-aufs.txt

sfjro commented 1 year ago

Cory Coager:

I recompiled the kernel (6.1.8-v8+) with the latest changes and the NFS patch you gave me. I did a sync before starting the service. Here are the relevant logs. Let me know if you need anything else.

Thanx for the test, but the patch I sent should not be called an NFS patch. It was just to produce a warning when NFS handles an inode incorrectly. And you got the warning:

[Thu Jan 26 09:13:28 2023] ------------[ cut here ]------------
[Thu Jan 26 09:13:28 2023] WARNING: CPU: 3 PID: 216 at fs/nfs/write.c:1076 nfs_scan_commit.part.0.isra.0+0xbc/0xc4

So I'd strongly suggest you consult the NFS or Raspberry Pi people. It looks like aufs is unrelated.

J. R. Okajima