Closed andrew64k closed 2 months ago
Also reported here https://forum.opnsense.org/index.php?topic=41757.msg205214#msg205214
Problem actually appears to be xen(4) only.
Could be related to 979bb7ac144
Kernel to try:
# opnsense-update -zkr 24.7_7
Don't forget to reboot.
Cheers, Franco
Still crashes after 24.7_7 kernel update.
Similarly, even after updating to 24.7_7, you still get a kernel panic with HVM on XCP-ng.
Stack trace please to confirm. I'm relatively certain this is a FreeBSD issue.
Dump header from device: /dev/ada1s1
Architecture: amd64
Architecture Version: 4
Dump Length: 72704
Blocksize: 512
Compression: none
Dumptime: 2024-07-29 09:27:55 -0400
Hostname: OPNsense-Test1.localdomain
Magic: FreeBSD Text Dump
Version String: FreeBSD 14.1-RELEASE-p2 stable/24.7-n267765-b269437501d8 SMP
Panic String: page fault
Dump Parity: 4225216080
Bounds: 3
Dump Status: good
593.140339 [1167] generic_netmap_attach Emulated adapter for xn0 created (prev was NULL)
593.149593 [1072] generic_netmap_dtor Emulated netmap adapter for xn0 destroyed
593.158630 [1167] generic_netmap_attach Emulated adapter for xn0 created (prev was NULL)
593.167763 [1072] generic_netmap_dtor Emulated netmap adapter for xn0 destroyed
593.176779 [1167] generic_netmap_attach Emulated adapter for xn0 created (prev was NULL)
593.195082 [1072] generic_netmap_dtor Emulated netmap adapter for xn0 destroyed
593.206795 [1167] generic_netmap_attach Emulated adapter for xn0 created (prev was NULL)
593.373322 [ 319] generic_netmap_register Emulated adapter for xn0 activated
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0x30
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80a0f08f
stack pointer = 0x28:0xfffffe007a8db8e0
frame pointer = 0x28:0xfffffe007a8db970
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 34371 (W#01-xn0^)
rdi: fffff80001fce000 rsi: fffff80001f02e00 rdx: fffff80001f02e00
rcx: fffff8000154b000 r8: 00000000000000e6 r9: 0000000000000800
rax: 00000000000000ff rbx: fffffe0067fab000 rbp: fffffe007a8db970
r10: 0000000000000301 r11: fffff800687f9520 r12: 0000000000000000
r13: fffff800015c3000 r14: fffffe007a8db944 r15: fffff80001f02e00
trap number = 12
panic: page fault
cpuid = 1
time = 1722259675
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe007a8db5d0
vpanic() at vpanic+0x131/frame 0xfffffe007a8db700
panic() at panic+0x43/frame 0xfffffe007a8db760
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe007a8db7c0
trap_pfault() at trap_pfault+0x46/frame 0xfffffe007a8db810
calltrap() at calltrap+0x8/frame 0xfffffe007a8db810
--- trap 0xc, rip = 0xffffffff80a0f08f, rsp = 0xfffffe007a8db8e0, rbp = 0xfffffe007a8db970 ---
xn_txq_mq_start_locked() at xn_txq_mq_start_locked+0xdf/frame 0xfffffe007a8db970
xn_txq_mq_start() at xn_txq_mq_start+0x76/frame 0xfffffe007a8db9a0
nm_os_generic_xmit_frame() at nm_os_generic_xmit_frame+0xa0/frame 0xfffffe007a8db9f0
generic_netmap_txsync() at generic_netmap_txsync+0x3a2/frame 0xfffffe007a8dbae0
netmap_ioctl() at netmap_ioctl+0x1a7/frame 0xfffffe007a8dbbb0
freebsd_netmap_ioctl() at freebsd_netmap_ioctl+0x79/frame 0xfffffe007a8dbbf0
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfffffe007a8dbc40
vn_ioctl() at vn_ioctl+0xce/frame 0xfffffe007a8dbcb0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe007a8dbcd0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe007a8dbd40
sys_ioctl() at sys_ioctl+0xff/frame 0xfffffe007a8dbe00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe007a8dbf30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe007a8dbf30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x82e0505fa, rsp = 0x8320d2df8, rbp = 0x8320d2e20 ---
KDB: enter: panic
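For anyone reading the dump above, here is a minimal Python sketch that pulls out the call path between the `--- trap` and `--- syscall` markers and converts the `time = ...` epoch value to local time. The helper names (`faulting_call_path`, `panic_time`) and the regular expression are illustrative assumptions, not part of any OPNsense or FreeBSD tooling, and the sample is a shortened copy of the trace in this thread.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches KDB frame lines such as
# "vpanic() at vpanic+0x131/frame 0xfffffe007a8db700".
# The pattern is an assumption based on the trace in this thread.
FRAME_RE = re.compile(r"^(\w+)\(\) at \w+\+0x[0-9a-f]+/frame 0x[0-9a-f]+$")


def faulting_call_path(backtrace):
    """Return the function names found between the '--- trap' and
    '--- syscall' markers, innermost frame first."""
    frames = []
    after_trap = False
    for raw in backtrace.splitlines():
        line = raw.strip()
        if line.startswith("--- trap"):
            after_trap = True
        elif line.startswith("--- syscall"):
            break
        elif after_trap:
            match = FRAME_RE.match(line)
            if match:
                frames.append(match.group(1))
    return frames


def panic_time(epoch_seconds, utc_offset_hours):
    """Render the 'time = ...' epoch value at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_seconds, tz=tz).isoformat()


# Excerpt copied from the trace above; the frames between
# generic_netmap_txsync and the syscall marker are omitted for brevity.
SAMPLE = """\
calltrap() at calltrap+0x8/frame 0xfffffe007a8db810
--- trap 0xc, rip = 0xffffffff80a0f08f, rsp = 0xfffffe007a8db8e0, rbp = 0xfffffe007a8db970 ---
xn_txq_mq_start_locked() at xn_txq_mq_start_locked+0xdf/frame 0xfffffe007a8db970
xn_txq_mq_start() at xn_txq_mq_start+0x76/frame 0xfffffe007a8db9a0
nm_os_generic_xmit_frame() at nm_os_generic_xmit_frame+0xa0/frame 0xfffffe007a8db9f0
generic_netmap_txsync() at generic_netmap_txsync+0x3a2/frame 0xfffffe007a8dbae0
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x82e0505fa, rsp = 0x8320d2df8, rbp = 0x8320d2e20 ---
"""

if __name__ == "__main__":
    print(" <- ".join(faulting_call_path(SAMPLE)))
    print(panic_time(1722259675, -4))
```

On the excerpt, the first frame returned is `xn_txq_mq_start_locked`, reached through the emulated netmap transmit path (`generic_netmap_txsync`), and the epoch value renders as `2024-07-29T09:27:55-04:00`, which matches the `Dumptime` in the dump header.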
Test patch 3f1850fd8 via @markjdb
Test kernel installs as follows:
REDACTED
Feedback is highly appreciated :)
Cheers, Franco
Update installed. Rebooted. Way worse... as soon as it finishes booting, it crashes. I can't even log in. It does not even have time to write a crash dump.
Thanks. Revoked kernel for now and will pass this along.
617c782a35a was missing from the previous kernel; new kernel here:
REDACTED
Cheers, Franco
24.7-xen2 Installed, rebooted. It's working for me. No crashes during an hour of testing.
Thanks!
One more kernel, now following a simpler final commit already in FreeBSD https://cgit.freebsd.org/src/commit/?id=2e4781cb12a
# opnsense-update -zkr 24.7-xen3
Unless there is bad feedback about this 3rd iteration I'll ship this particular fix in 24.7.1 and close the issue.
Cheers, Franco
Updated. Reboot. Ran traffic for an hour and still working.
I ran opnsense-update -zkr 24.7-xen3. It has been up for 3 hours now and is working with zenarmor and suricata without restarting.
@A2sti awesome, thanks!
No kernel panic occurs with 24.7-xen3. Thank you.
@kotashiratsuka also in 24.7.1 now :)
Describe the bug Kernel crash with OPNsense 24.7_5 running on XCP 8.2.1 (Xen). Seems to work until AFTER the generic_netmap_ messages.
To Reproduce Boot OPNsense and let it start. Wait until after the generic_netmap messages. Access the management web page (and some sub pages). Crash... reboot... repeat...
Expected behavior It does not crash...
Screenshots
Relevant log files
Crash dump file: textdump.tar.gz
Additional context Thanks for the hard work! Looks like a FreeBSD problem.
Environment Working normally with OPNsense 24.1.10_8. Crashes with OPNsense 24.7_5.
Running as HVM on XCP 8.2.1 (Xen 4.13.5), Intel E5-2680 v2, Xen Virtual Net