xen0n closed 7 months ago
The issue is likely related to KFENCE, it seems. I'm rebuilding my 3A6000 kernel with KFENCE_SAMPLE_INTERVAL=0
(disabled by default at run-time) to see if stability is restored.
I've got a backtrace on an A2101 board (3A5000 + 7A1000) from the first time the oops occurred:
[51136.914980] CPU 3 Unable to handle kernel paging request at virtual address ffff800002474000, era == 90000000021f2160, ra == 90000000021f2138
[51136.927629] Oops[#1]:
[51136.929882] CPU: 3 PID: 878 Comm: jbd2/nvme0n1p5- Tainted: G OE 6.6.9-gentoo-dist #1 2d35abbc4d75310e39af2a333fb880e2a8e5939a
[51136.942418] Hardware name: Loongson Loongson-3A5000-7A1000-1w-A2101/Loongson-LS3A5000-7A1000-1w-A2101, BIOS vUDK2018-LoongArch-V4.0.05132-beta10 12/13/202
[51136.956160] pc 90000000021f2160 ra 90000000021f2138 tp 90000001158b4000 sp 90000001001aba40
[51136.964460] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[51136.972763] a4 0000000000000000 a5 0000000000000000 a6 0000000000000000 a7 0000000000000000
[51136.981071] t0 ffff800002473ffe t1 0000000000000000 t2 0000000000000000 t3 0000000002c00063
[51136.989380] t4 00000000000007ff t5 0000000000000000 t6 0000000000000000 t7 0000000000000000
[51136.997682] t8 0000000000000000 u0 0000000000000000 s9 90000001001abca0 s0 9000000003b82990
[51137.005990] s1 9000000003190f60 s2 90000000035b0000 s3 90000001001abac0 s4 90000000035b0000
[51137.014296] s5 9000000003b60000 s6 90000001001abab8 s7 90000001002aba50 s8 ffff8000024c4228
[51137.022597] ra: 90000000021f2138 unwind_next_frame+0xd8/0x740
[51137.028569] ERA: 90000000021f2160 unwind_next_frame+0x100/0x740
[51137.034624] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[51137.040774] PRMD: 00000000 (PPLV0 -PIE -PWE)
[51137.045101] EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[51137.049861] ECFG: 00071814 (LIE=2,4,11-12 VS=7)
[51137.054448] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
[51137.059897] BADV: ffff800002474000
[51137.063357] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
[51137.069064] Modules linked in: la_ow_syscall(OE) snd_seq_dummy snd_hrtimer snd_seq snd_seq_device joydev mousedev hid_multitouch usbhid ch341 tun amdgpu amdxcp drm_exec mfd_core gpu_sched drm_buddy drm_suballoc_helper drm_ttm_helper drm_display_helper cec rc_core spi_loongson_pci spi_loongson_core snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec ipmi_ssif snd_hda_core snd_hwdep acpi_ipmi snd_pcm ipmi_si gpio_loongson_64bit snd_timer gpio_generic i2c_ls2x ipmi_devintf rtc_loongson snd loongson ipmi_msghandler soundcore ttm nls_cp936 vfat fat evdev wireguard libchacha20poly1305 libcurve25519_generic libchacha libpoly1305 cfg80211 rfkill sch_fq_codel loop fuse efi_pstore pstore nfnetlink ext4 mbcache jbd2 nvme nvme_core nvme_common r8169 dwmac_loongson stmmac pcs_xpcs xhci_pci xhci_pci_renesas phylink btrfs xor raid6_pq zlib_deflate dm_mirror dm_region_hash dm_log dm_mod dax pkcs8_key_parser efivarfs
[51137.154539] Process jbd2/nvme0n1p5- (pid: 878, threadinfo=000000003ad9e375, task=00000000e3c0a5e8)
[51137.163444] Stack : 9000000009807d00 0000000000052dba 00000000000529c2 57a60d166999aba8
[51137.171412] 0000000000003000 0000000000000000 90000000022dd364 9000000115a33e40
[51137.179374] 90000001001abb08 90000000035b0000 90000001001abca0 90000000022dd180
[51137.187335] 90000001001abab8 90000000021ef134 00000000000001af 0000000000000001
[51137.195297] 0000000000000002 90000001158b4000 90000001158b8000 0000000000000000
[51137.203258] 9000000115a33e40 0000000000000001 90000001158b7b40 ffff8000024c4228
[51137.211218] ffff8000024c6778 0000000000000000 0000000000000000 0000000000000000
[51137.219179] 90000001001abca0 0000000000000000 0000000000000000 0000000000000000
[51137.227142] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[51137.235100] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[51137.243063] ...
[51137.245487] Call Trace:
[51137.245488] [<90000000021f2160>] unwind_next_frame+0x100/0x740
[51137.253704] [<90000000021ef134>] arch_stack_walk+0xd4/0x1a0
[51137.259239] [<90000000022dd364>] stack_trace_save+0x44/0xa0
[51137.264774] [<90000000024e25d0>] metadata_update_state+0xf0/0x140
[51137.270828] [<90000000024e39c4>] kfence_guarded_free+0x124/0x3a0
[51137.276795] [<ffff8000025ac7e4>] ext4_end_bio+0x44/0x1c0 [ext4]
[51137.282721] [<9000000002745e64>] blk_mq_end_request_batch+0x3e4/0x720
[51137.289123] [<ffff800002462e10>] nvme_irq+0x90/0xae0 [nvme]
[51137.294664] [<90000000022a02d0>] __handle_irq_event_percpu+0x50/0x160
[51137.301065] [<90000000022a04a4>] handle_irq_event+0x44/0x100
[51137.306687] [<90000000022a8138>] handle_edge_irq+0xf8/0x3a0
[51137.312221] [<900000000229eba8>] generic_handle_domain_irq+0x28/0x60
[51137.318534] [<900000000289c608>] eiointc_irq_dispatch+0xa8/0x1c0
[51137.324502] [<900000000229eba8>] generic_handle_domain_irq+0x28/0x60
[51137.330814] [<900000000289b7b8>] handle_cpu_irq+0x78/0xc0
[51137.336177] [<90000000030d9730>] handle_loongarch_irq+0x30/0x60
[51137.342057] [<90000000030d97ec>] do_vint+0x8c/0x100
[51137.346901] [<ffff8000024c4228>] __kstrtabns_jbd2_journal_put_journal_head+0x529c2/0x52dba [jbd2]
[51137.355737] CPU 3 Unable to handle kernel paging request at virtual address ffff800002474000, era == 90000000021f2160, ra == 90000000021f2138
[51160.505705] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[51160.511606] rcu: 3-...0: (9 ticks this GP) idle=b574/1/0x4000000000000000 softirq=1625364/1625366 fqs=2877
[51160.521291] rcu: (detected by 1, t=21020 jiffies, g=2940689, q=838 ncpus=4)
[51160.528294] Sending NMI from CPU 1 to CPUs 3:
[51170.532823] rcu: rcu_sched kthread starved for 16905 jiffies! g2940689 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[51170.543110] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[51170.552012] rcu: RCU grace-period kthread stack dump:
[51170.557027] task:rcu_sched state:R running task stack:0 pid:16 ppid:2 flags:0x00000800
[51170.566886] Stack : 0000000000000000 00002e817b633bc2 90000001002eade8 90000000030e54c8
[51170.574850] 0000000000000003 9000000003183d00 9000000003698c90 0000000000000000
[51170.582812] 9000000003a23a98 90000000035b0000 90000000035b0000 000000010307f74a
[51170.590776] 0000000000000001 0000000000000000 9000000003191e20 0000000000000000
[51170.598740] 0000000000000002 57a60d166999aba8 000000010307f74b 57a60d166999aba8
[51170.606700] 9000000003698c90 0000000000000000 9000000003a23a98 90000001003ebd90
[51170.614664] 90000000035a6000 90000001003ebd18 90000000035b0000 90000001002ea780
[51170.622625] 0000000000000003 90000000030e54c8 000000010307f74a 90000000030ed234
[51170.630590] 9000000003a23dc0 0000000000000122 0000000000000000 000000010307f74a
[51170.638549] 90000000022df940 0000000002c00001 90000001002ea780 57a60d166999aba8
[51170.646510] ...
[51170.648934] Call Trace:
[51170.648935] [<90000000030e48f0>] __schedule+0xab0/0x1620
[51170.656637] [<90000000030e54c8>] schedule+0x68/0xe0
[51170.661482] [<90000000030ed234>] schedule_timeout+0x94/0x160
[51170.667104] [<90000000022c16d4>] rcu_gp_fqs_loop+0x114/0x5c0
[51170.672727] [<90000000022c3a64>] rcu_gp_kthread+0x164/0x1a0
[51170.678262] [<900000000223b320>] kthread+0x100/0x120
[51170.683194] [<90000000021e15e8>] ret_from_kernel_thread+0xc/0xa4
[51170.689160]
[51170.690631] rcu: Stack dump where RCU GP kthread last ran:
[51170.696077] Sending NMI from CPU 1 to CPUs 0:
[51170.700402] NMI backtrace for cpu 0
[51170.703869] CPU: 0 PID: 1716 Comm: node_exporter Tainted: G OE 6.6.9-gentoo-dist #1 2d35abbc4d75310e39af2a333fb880e2a8e5939a
[51170.716318] Hardware name: Loongson Loongson-3A5000-7A1000-1w-A2101/Loongson-LS3A5000-7A1000-1w-A2101, BIOS vUDK2018-LoongArch-V4.0.05132-beta10 12/13/202
[51170.730059] pc 9000000002301d20 ra 9000000002301eb0 tp 900000011fe90000 sp 900000011fe93a50
[51170.738363] a0 0000000000000000 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[51170.746664] a4 0000000000000000 a5 0000000000000000 a6 0000000000000000 a7 0000000000000000
[51170.754966] t0 0000000000000001 t1 900000000980db40 t2 0000000000000000 t3 0000000000000003
[51170.763272] t4 90000000035aff58 t5 0000000000000001 t6 0000000000000040 t7 0000000000000000
[51170.771572] t8 0000000000000000 u0 0000000000000001 s9 9000000008008d40 s0 00000000000000b0
[51170.779878] s1 900000011fe93b00 s2 900000011fe93c28 s3 900000011fe93c50 s4 000000c0004c4000
[51170.788182] s5 0000000000000000 s6 90000000035b0000 s7 9000000110755e80 s8 000000c000800000
[51170.796488] ra: 9000000002301eb0 smp_call_function_many_cond+0x2d0/0x440
[51170.803407] ERA: 9000000002301d20 smp_call_function_many_cond+0x140/0x440
[51170.810325] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[51170.816473] PRMD: 00000004 (PPLV0 +PIE -PWE)
[51170.820799] EUEN: 00000001 (+FPE -SXE -ASXE -BTE)
[51170.825558] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[51170.830317] ESTAT: 00001000 [INT] (IS=12 ECode=0 EsubCode=0)
[51170.835939] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
[51170.841645] CPU: 0 PID: 1716 Comm: node_exporter Tainted: G OE 6.6.9-gentoo-dist #1 2d35abbc4d75310e39af2a333fb880e2a8e5939a
[51170.854091] Hardware name: Loongson Loongson-3A5000-7A1000-1w-A2101/Loongson-LS3A5000-7A1000-1w-A2101, BIOS vUDK2018-LoongArch-V4.0.05132-beta10 12/13/202
[51170.867832] Stack : 0000000000000004 0000000000000000 90000000021e39a4 900000011fe90000
[51170.875789] 900000010019fcb0 900000010019fcb8 0000000000000000 900000010019fdf8
[51170.883753] 900000010019fdf0 900000010019fdf0 0000000000000000 0000000000000000
[51170.891714] 0000000000000000 900000010019fcb8 57a60d166999aba8 0000000000000000
[51170.899672] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[51170.907634] 0000000000000000 0000000000000000 0000000004e84000 9000000008008d40
[51170.915597] 0000000000000000 0000000000000000 9000000003424298 90000000035b0000
[51170.923559] 90000000035b8668 900000011fe93910 0000000000000000 0000000000000000
[51170.931518] 000000c000800000 0000000000000000 90000000021e39c4 000000c000508000
[51170.939480] 00000000000000b0 0000000000000004 0000000000000001 0000000000071c1d
[51170.947442] ...
[51170.949865] Call Trace:
[51170.949866] [<90000000021e39c4>] show_stack+0x64/0x1c0
[51170.957389] [<90000000030d9308>] dump_stack_lvl+0x78/0xb0
[51170.962751] [<90000000030a9948>] nmi_cpu_backtrace+0x188/0x1a0
[51170.968547] [<90000000021e3fd0>] handle_backtrace+0x10/0x60
[51170.974081] [<90000000023016cc>] __flush_smp_call_function_queue+0x10c/0x360
[51170.981083] [<90000000021f0454>] loongson_ipi_interrupt+0x94/0x100
[51170.987225] [<90000000022a02d0>] __handle_irq_event_percpu+0x50/0x160
[51170.993626] [<90000000022a03f8>] handle_irq_event_percpu+0x18/0x80
[51170.999765] [<90000000022a9098>] handle_percpu_irq+0x58/0xc0
[51171.005385] [<900000000229eba8>] generic_handle_domain_irq+0x28/0x60
[51171.011697] [<900000000289b7b8>] handle_cpu_irq+0x78/0xc0
[51171.017062] [<90000000030d9730>] handle_loongarch_irq+0x30/0x60
[51171.022942] [<90000000030d97ec>] do_vint+0x8c/0x100
[51171.027785] [<9000000002301d20>] smp_call_function_many_cond+0x140/0x440
[51171.034443] [<90000000023020bc>] on_each_cpu_cond_mask+0x1c/0x40
[51171.040408] [<90000000021f0e78>] flush_tlb_range+0x78/0x180
[51171.045943] [<9000000002478038>] tlb_flush+0x58/0xc0
[51171.050872] [<90000000024786c8>] tlb_finish_mmu+0xe8/0x160
[51171.056319] [<90000000024639b8>] zap_page_range_single+0x138/0x240
[51171.062460] [<90000000024a1ea4>] madvise_vma_behavior+0x624/0xac0
[51171.068514] [<900000000249f438>] madvise_walk_vmas+0xb8/0x1e0
[51171.074221] [<90000000024a2588>] do_madvise+0x148/0x200
[51171.079410] [<90000000024a2980>] sys_madvise+0x20/0x40
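When triaging logs like the one above, it can help to check quickly whether a call trace passes through KFENCE at all. The following is a minimal sketch, assuming the trace lines look like the ones in this log (`] symbol+0xoff/0xsize`); `oops_involves_kfence` is a hypothetical helper name, not an existing tool, and it only matches symbols that start with `kfence_`.

```shell
# Read kernel log text on stdin and succeed (exit status 0) if any
# call-trace line names a function whose symbol starts with "kfence_".
# oops_involves_kfence is a hypothetical helper, not part of any tool.
oops_involves_kfence() {
    # In a basic regular expression "+" is a literal plus sign, so this
    # matches lines such as:
    #   [<90000000024e39c4>] kfence_guarded_free+0x124/0x3a0
    grep -q '] kfence_[A-Za-z0-9_]*+0x'
}
```

Possible usage: `dmesg | oops_involves_kfence && echo "KFENCE frame present"`. A match only shows that KFENCE code was on the stack when the oops happened; it does not prove KFENCE is the root cause.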
Hmm, "percpu" reminds me of the extreme code model. Could it be the notorious extreme code model issue?
Is it time to add some validation for extreme relocations in the kernel?
Perhaps not. My 3A6000 machine has been working perfectly for 2 days after I booted with kfence.sample_interval=0
on the kernel cmdline. Previously it would barely survive for 1 hr, so it's 99% the KFENCE implementation's fault.
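To confirm that the machine really booted with KFENCE sampling disabled, the command line can be checked for the parameter. This is a minimal sketch; `kfence_interval_from_cmdline` is a hypothetical helper name, and keeping the last occurrence when the parameter is repeated is an assumption about how duplicates are resolved.

```shell
# Print the value of kfence.sample_interval found in a kernel command line
# string, or print nothing if the parameter is absent.
# kfence_interval_from_cmdline is a hypothetical helper name.
kfence_interval_from_cmdline() {
    value=""
    # Intentionally unquoted: split the command line on whitespace.
    for arg in $1; do
        case "$arg" in
            kfence.sample_interval=*)
                # Assumption: keep the last occurrence if it is repeated.
                value=${arg#kfence.sample_interval=}
                ;;
        esac
    done
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    fi
}
```

Possible usage: `kfence_interval_from_cmdline "$(cat /proc/cmdline)"`. On a kernel built with KFENCE, the value currently in effect should also be readable from `/sys/module/kfence/parameters/sample_interval`; a value of 0 means sampling is disabled.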
I mean, maybe kfence is using some per-cpu variable and then blows up?
Hmm, then I'll have to check the asm later...
Guenter Roeck identified some issues caused by KFENCE: https://lore.kernel.org/loongarch/c352829b-ed75-4ffd-af6e-0ea754e1bf3d@roeck-us.net/
Not sure if it's exactly the same issue though.
How to reproduce quickly:
Kconfig:
CONFIG_KFENCE=y
CONFIG_KFENCE_SAMPLE_INTERVAL=1
CONFIG_KFENCE_NUM_OBJECTS=65535
WARNING: The following steps will result in data loss.
while true; do
blkdiscard -f /dev/nvme0n1p1 2>/dev/null
done
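The loop above runs forever. A bounded variant can be handy when the goal is to see whether the oops appears within a fixed number of iterations. The sketch below assumes nothing beyond POSIX sh; `run_bounded` is a hypothetical helper name, and like the original loop it discards stderr and ignores failures of the command it runs.

```shell
# Run a command a fixed number of times, ignoring its failures and
# discarding its stderr, then print how many iterations were executed.
# Usage: run_bounded MAX_ITERS COMMAND [ARGS...]
# run_bounded is a hypothetical helper name.
run_bounded() {
    max_iters=$1
    shift
    i=0
    while [ "$i" -lt "$max_iters" ]; do
        # Same error handling as the original loop: keep going on failure.
        "$@" 2>/dev/null || true
        i=$((i + 1))
    done
    printf '%s\n' "$i"
}
```

Possible usage, with the same data-loss warning as above: `run_bounded 100000 blkdiscard -f /dev/nvme0n1p1`. The iteration count of 100000 is an arbitrary illustration, not a number known to trigger the bug.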
Resolved in Linux v6.9-rc4.
Pending investigation.