oxidecomputer / propolis

VMM userspace for illumos bhyve
Mozilla Public License 2.0

`rcu_sched detected stalls` after local live migration #376

Open gjcolombo opened 1 year ago

gjcolombo commented 1 year ago

Propolis commit: c455784

Host OS:

$ cat /etc/versions/build
heads/master-0-g717646f711

Guest OS: Debian 11 nocloud, Linux debian 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21) x86_64 GNU/Linux

Repro steps:

Expected: guest is generally happy.
Observed: guest gets dyspepsia after running the timer stress test:

root@debian:~# [  501.250937] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  501.254890] rcu:     1-...!: (0 ticks this GP) idle=7f0/0/0x0 softirq=2364/2364 fqs=1  (false positive?)
[  501.254890]  (detected by 3, t=21009 jiffies, g=1749, q=1077)
[  501.254890] Sending NMI from CPU 3 to CPUs 1:
[  501.266638] NMI backtrace for cpu 1 skipped: idling at native_safe_halt+0xe/0x20
[  501.254890] rcu: rcu_sched kthread starved for 15759 jiffies! g1749 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[  501.254890] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[  501.254890] rcu: RCU grace-period kthread stack dump:
[  501.254890] task:rcu_sched       state:I stack:    0 pid:   12 ppid:     2 flags:0x00004000
[  501.254890] Call Trace:
[  501.254890]  __schedule+0x282/0x870
[  501.254890]  schedule+0x46/0xb0
[  501.254890]  schedule_timeout+0x8b/0x150
[  501.254890]  ? __next_timer_interrupt+0x110/0x110
[  501.254890]  rcu_gp_kthread+0x51b/0xbc0
[  501.254890]  ? rcu_cpu_kthread+0x190/0x190
[  501.254890]  kthread+0x11b/0x140
[  501.254890]  ? __kthread_bind_mask+0x60/0x60
[  501.254890]  ret_from_fork+0x22/0x30

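For a rough sense of scale, the jiffies counts in the stall report above convert to wall time once the kernel's tick rate is known. A minimal sketch, assuming Debian's usual `CONFIG_HZ_250` (an assumption; the report doesn't state HZ, though the round results are consistent with 250):

```python
HZ = 250  # assumed: Debian amd64 kernels are typically built with CONFIG_HZ_250


def jiffies_to_seconds(jiffies, hz=HZ):
    """Convert a jiffies count from an RCU stall report to seconds."""
    return jiffies / hz


# Values from the stall report above
print(jiffies_to_seconds(21009))  # t=21009 in the detector line -> ~84 s
print(jiffies_to_seconds(15759))  # rcu_sched kthread starvation -> ~63 s
```

So by the time the second report fires, CPU 1 believes the grace period has been stuck for well over a minute.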
Other observations:

@jmpesp saw a similar issue in local testing earlier this week, but that was without the bits needed to enable the interrupt state transfer implemented in #367. Unless I've missed something, that should have been enabled here (both the Propolis bits and the necessary bhyve bits were present).

This VM no longer seems to be producing any RCU complaints, but I'll hold it in its current state for now.

jmpesp commented 1 year ago

Note that I also see this on a host at

james@dinnerbone:~$ cat /etc/versions/build
heads/stlouis-0-gce8e8e601e

which includes the illumos#15143 bits:

[    6.552238] EXT4-fs (vda3): resizing filesystem from 472571 to 18298363 blocks
[    7.864090] EXT4-fs (vda3): resized filesystem to 18298363
[    9.598778] device-mapper: uevent: version 1.0.3
[    9.598892] device-mapper: ioctl: 4.43.0-ioctl (2020-10-01) initialised: dm-devel@redhat.com
[   58.242224] audit: type=1400 audit(1682651165.345:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/chronyd" pid=2681 comm="apparmor_parser"

 <<<< MIGRATION >>>>

[  183.116103] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
[  183.116131] clocksource:                       'acpi_pm' wd_now: da93f0 wd_last: bea500 mask: ffffff
[  183.116151] clocksource:                       'tsc' cs_now: 9f8e67f8bf cs_last: 9f23cc4b79 mask: ffffffffffffffff
[  183.116173] tsc: Marking TSC unstable due to clocksource watchdog
[  183.116486] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
[  183.116488] sched_clock: Marking unstable (183118347082, -1856502)<-(183136894483, -20409854)
[  183.116700] clocksource: Checking clocksource tsc synchronization from CPU 1.
[  183.116756] clocksource: Switched to clocksource acpi_pm
[  385.392113] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  385.393282] rcu:     0-...!: (0 ticks this GP) idle=fdc/0/0x0 softirq=11204/11204 fqs=0  (false positive?)
[  385.394237]  (detected by 1, t=5252 jiffies, g=12417, q=82)
[  385.394239] Sending NMI from CPU 1 to CPUs 0:
[  385.394373] NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x20
[  385.395243] rcu: rcu_sched kthread starved for 5252 jiffies! g12417 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[  385.396074] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[  385.396074] rcu: RCU grace-period kthread stack dump:
[  385.396074] task:rcu_sched       state:I stack:    0 pid:   13 ppid:     2 flags:0x00004000
[  385.396074] Call Trace:
[  385.396074]  __schedule+0x282/0x880
[  385.396074]  ? __switch_to_asm+0x3a/0x60
[  385.396074]  schedule+0x46/0xb0
[  385.396074]  schedule_timeout+0x8b/0x150
[  385.396074]  ? __next_timer_interrupt+0x110/0x110
[  385.396074]  rcu_gp_kthread+0x51b/0xbb0
[  385.396074]  ? rcu_cpu_kthread+0x190/0x190
[  385.396074]  kthread+0x11b/0x140
[  385.396074]  ? __kthread_bind_mask+0x60/0x60
[  385.396074]  ret_from_fork+0x22/0x30
[  448.412113] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  448.413389] rcu:     0-...!: (0 ticks this GP) idle=230/0/0x0 softirq=11204/11204 fqs=1  (false positive?)
[  448.415278]  (detected by 1, t=21007 jiffies, g=12417, q=124)
[  448.415280] Sending NMI from CPU 1 to CPUs 0:
[  448.415360] NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x20
[  448.416074] rcu: rcu_sched kthread starved for 15754 jiffies! g12417 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[  448.416074] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[  448.416074] rcu: RCU grace-period kthread stack dump:
[  448.416074] task:rcu_sched       state:I stack:    0 pid:   13 ppid:     2 flags:0x00004000
[  448.416074] Call Trace:
[  448.416074]  __schedule+0x282/0x880
[  448.416074]  schedule+0x46/0xb0
[  448.416074]  schedule_timeout+0x8b/0x150
[  448.416074]  ? __next_timer_interrupt+0x110/0x110
[  448.416074]  rcu_gp_kthread+0x51b/0xbb0
[  448.416074]  ? rcu_cpu_kthread+0x190/0x190
[  448.416074]  kthread+0x11b/0x140
[  448.416074]  ? __kthread_bind_mask+0x60/0x60
[  448.416074]  ret_from_fork+0x22/0x30
[  469.424115] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  469.425395] rcu:     0-...!: (0 ticks this GP) idle=2f4/0/0x0 softirq=11204/11204 fqs=0  (false positive?)
[  469.427291]  (detected by 1, t=5252 jiffies, g=12421, q=160)
[  469.427295] Sending NMI from CPU 1 to CPUs 0:
[  469.427394] NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x20
[  469.428075] rcu: rcu_sched kthread starved for 5252 jiffies! g12421 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[  469.428075] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[  469.428075] rcu: RCU grace-period kthread stack dump:
[  469.428075] task:rcu_sched       state:I stack:    0 pid:   13 ppid:     2 flags:0x00004000
[  469.428075] Call Trace:
[  469.428075]  __schedule+0x282/0x880
[  469.428075]  schedule+0x46/0xb0
[  469.428075]  schedule_timeout+0x8b/0x150
[  469.428075]  ? __next_timer_interrupt+0x110/0x110
[  469.428075]  rcu_gp_kthread+0x51b/0xbb0
[  469.428075]  ? rcu_cpu_kthread+0x190/0x190
[  469.428075]  kthread+0x11b/0x140
[  469.428075]  ? __kthread_bind_mask+0x60/0x60
[  469.428075]  ret_from_fork+0x22/0x30
[  532.444115] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  532.445401] rcu:     0-...!: (0 ticks this GP) idle=530/0/0x0 softirq=11204/11204 fqs=1  (false positive?)
[  532.446358]  (detected by 1, t=21007 jiffies, g=12421, q=435)
[  532.446360] Sending NMI from CPU 1 to CPUs 0:
[  532.446440] NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x20
[  532.447365] rcu: rcu_sched kthread starved for 15753 jiffies! g12421 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[  532.448075] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[  532.448075] rcu: RCU grace-period kthread stack dump:
[  532.448075] task:rcu_sched       state:I stack:    0 pid:   13 ppid:     2 flags:0x00004000
[  532.448075] Call Trace:
[  532.448075]  __schedule+0x282/0x880
[  532.448075]  schedule+0x46/0xb0
[  532.448075]  schedule_timeout+0x8b/0x150
[  532.448075]  ? __next_timer_interrupt+0x110/0x110
[  532.448075]  rcu_gp_kthread+0x51b/0xbb0
[  532.448075]  ? rcu_cpu_kthread+0x190/0x190
[  532.448075]  kthread+0x11b/0x140
[  532.448075]  ? __kthread_bind_mask+0x60/0x60
[  532.448075]  ret_from_fork+0x22/0x30
[  553.456118] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  553.457413] rcu:     0-...!: (0 ticks this GP) idle=5fc/0/0x0 softirq=11204/11204 fqs=0  (false positive?)
[  553.460077]  (detected by 1, t=5252 jiffies, g=12425, q=473)
[  553.460077] Sending NMI from CPU 1 to CPUs 0:
[  553.460394] NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x20
[  553.460663] rcu: rcu_sched kthread starved for 5252 jiffies! g12425 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[  553.461483] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[  553.461483] rcu: RCU grace-period kthread stack dump:
[  553.461483] task:rcu_sched       state:I stack:    0 pid:   13 ppid:     2 flags:0x00004000
[  553.461483] Call Trace:
[  553.461483]  __schedule+0x282/0x880
[  553.461483]  schedule+0x46/0xb0
[  553.461483]  schedule_timeout+0x8b/0x150
[  553.461483]  ? __next_timer_interrupt+0x110/0x110
[  553.461483]  rcu_gp_kthread+0x51b/0xbb0
[  553.461483]  ? rcu_cpu_kthread+0x190/0x190
[  553.461483]  kthread+0x11b/0x140
[  553.461483]  ? __kthread_bind_mask+0x60/0x60
[  553.461483]  ret_from_fork+0x22/0x30
[  616.476113] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  616.478385] rcu:     0-...!: (0 ticks this GP) idle=820/0/0x0 softirq=11204/11204 fqs=1  (false positive?)
[  616.480073]  (detected by 1, t=21007 jiffies, g=12425, q=540)
[  616.480073] Sending NMI from CPU 1 to CPUs 0:
[  616.480404] NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x20
[  616.480604] rcu: rcu_sched kthread starved for 15753 jiffies! g12425 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[  616.480604] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[  616.480604] rcu: RCU grace-period kthread stack dump:
[  616.480604] task:rcu_sched       state:I stack:    0 pid:   13 ppid:     2 flags:0x00004000
[  616.480604] Call Trace:
[  616.480604]  __schedule+0x282/0x880
[  616.480604]  schedule+0x46/0xb0
[  616.480604]  schedule_timeout+0x8b/0x150
[  616.480604]  ? __next_timer_interrupt+0x110/0x110
[  616.480604]  rcu_gp_kthread+0x51b/0xbb0
[  616.480604]  ? rcu_cpu_kthread+0x190/0x190
[  616.480604]  kthread+0x11b/0x140
[  616.480604]  ? __kthread_bind_mask+0x60/0x60
[  616.480604]  ret_from_fork+0x22/0x30
[  637.492114] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  637.494354] rcu:     0-...!: (0 ticks this GP) idle=8e4/0/0x0 softirq=11204/11204 fqs=0  (false positive?)
[  637.496074]  (detected by 1, t=5252 jiffies, g=12429, q=528)
[  637.496074] Sending NMI from CPU 1 to CPUs 0:
[  637.498269] NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x20
[  637.498646] rcu: rcu_sched kthread starved for 5252 jiffies! g12429 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[  637.499229] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[  637.499229] rcu: RCU grace-period kthread stack dump:
[  637.499229] task:rcu_sched       state:I stack:    0 pid:   13 ppid:     2 flags:0x00004000
[  637.499229] Call Trace:
[  637.499229]  __schedule+0x282/0x880
[  637.499229]  schedule+0x46/0xb0
[  637.499229]  schedule_timeout+0x8b/0x150
[  637.499229]  ? __next_timer_interrupt+0x110/0x110
[  637.499229]  rcu_gp_kthread+0x51b/0xbb0
[  637.499229]  ? rcu_cpu_kthread+0x190/0x190
[  637.499229]  kthread+0x11b/0x140
[  637.499229]  ? __kthread_bind_mask+0x60/0x60
[  637.499229]  ret_from_fork+0x22/0x30
[  700.512114] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  700.514386] rcu:     0-...!: (0 ticks this GP) idle=b00/0/0x0 softirq=11204/11204 fqs=1  (false positive?)
[  700.516074]  (detected by 1, t=21007 jiffies, g=12429, q=607)
[  700.516074] Sending NMI from CPU 1 to CPUs 0:
[  700.517048] NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x20
[  700.517598] rcu: rcu_sched kthread starved for 15752 jiffies! g12429 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[  700.517598] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[  700.517598] rcu: RCU grace-period kthread stack dump:
[  700.517598] task:rcu_sched       state:I stack:    0 pid:   13 ppid:     2 flags:0x00004000
[  700.517598] Call Trace:
[  700.517598]  __schedule+0x282/0x880
[  700.517598]  schedule+0x46/0xb0
[  700.517598]  schedule_timeout+0x8b/0x150
[  700.517598]  ? __next_timer_interrupt+0x110/0x110
[  700.517598]  rcu_gp_kthread+0x51b/0xbb0
[  700.517598]  ? rcu_cpu_kthread+0x190/0x190
[  700.517598]  kthread+0x11b/0x140
[  700.517598]  ? __kthread_bind_mask+0x60/0x60
[  700.517598]  ret_from_fork+0x22/0x30
[  721.528113] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  721.530352] rcu:     0-...!: (0 ticks this GP) idle=bd4/0/0x0 softirq=11204/11204 fqs=0  (false positive?)
[  721.532075]  (detected by 1, t=5252 jiffies, g=12433, q=459)
[  721.532075] Sending NMI from CPU 1 to CPUs 0:
[  721.533057] NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x20
[  721.533300] rcu: rcu_sched kthread starved for 5252 jiffies! g12433 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[  721.533300] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[  721.533300] rcu: RCU grace-period kthread stack dump:
[  721.533300] task:rcu_sched       state:I stack:    0 pid:   13 ppid:     2 flags:0x00004000
[  721.533300] Call Trace:
[  721.533300]  __schedule+0x282/0x880
[  721.533300]  schedule+0x46/0xb0
[  721.533300]  schedule_timeout+0x8b/0x150
[  721.533300]  ? __next_timer_interrupt+0x110/0x110
[  721.533300]  rcu_gp_kthread+0x51b/0xbb0
[  721.533300]  ? rcu_cpu_kthread+0x190/0x190
[  721.533300]  kthread+0x11b/0x140
[  721.533300]  ? __kthread_bind_mask+0x60/0x60
[  721.533300]  ret_from_fork+0x22/0x30
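For reference, the skew that made the clocksource watchdog demote the TSC can be decoded from the raw counter values in the log above. The guest's TSC frequency isn't printed, so only the acpi_pm side can be converted to wall time; a minimal sketch (the 3.579545 MHz PM-timer frequency is fixed by the ACPI spec, everything else is copied from the log):

```python
# Decode the clocksource watchdog values from the log above.
ACPI_PM_HZ = 3_579_545  # ACPI PM timer frequency, fixed by the ACPI spec

wd_now, wd_last, wd_mask = 0xda93f0, 0xbea500, 0xffffff
cs_now, cs_last = 0x9f8e67f8bf, 0x9f23cc4b79

# The PM timer is a 24-bit counter, so mask the difference to handle wraparound.
wd_delta = (wd_now - wd_last) & wd_mask
wd_seconds = wd_delta / ACPI_PM_HZ  # elapsed time per the watchdog: ~0.51 s

# The TSC delta nominally covers the same interval. Without the guest's TSC
# frequency we can't convert it to seconds, but if the effective TSC offset
# jumped across the migration, this read appears to span a very different
# amount of time than ~0.51 s, which is exactly what the watchdog flags.
tsc_delta = cs_now - cs_last

print(f"acpi_pm interval: {wd_seconds:.3f} s")
print(f"tsc ticks over the same interval: {tsc_delta:#x} ({tsc_delta})")
```

Once the watchdog declares the TSC unstable, the guest falls back to acpi_pm (as the log shows), and timekeeping on the stalled vCPU degrades from there.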