opnsense / src

OPNsense operating system on top of FreeBSD
https://opnsense.org/

24.7 - Kernel panic when pinging interface IPv6 address #207

Open belotv opened 3 months ago

belotv commented 3 months ago

Describe the bug

When I ping the IPv6 address of any interface (WAN, LAN, WireGuard), the kernel panics and the router reboots. I restored the same configuration file on 24.1 and the issue does not occur there.

To Reproduce

Steps to reproduce the behavior:

  1. Go to Overview
  2. Check the IPv6 address of the LAN interface
  3. From a computer on the LAN, ping the interface
  4. Router reboots

Expected behavior

The router should respond to the ping and not crash.

Describe alternatives you considered

N/A

Screenshots

N/A

Relevant log files

They have been submitted through the crash reporter.

Additional context

I am using an Intel I226 LAN controller (igc) with VLAN enabled.

Environment

OPNsense 24.7 Beta (amd64). Pinged from a Windows 11 client.

fichtner commented 3 months ago

ddb.txt:
db:0:kdb.enter.default>  run lockinfo
db:1:lockinfo> show locks
No such command; use "help" to list available commands
db:1:lockinfo>  show alllocks
No such command; use "help" to list available commands
db:1:lockinfo>  show lockedvnods
Locked vnodes
db:0:kdb.enter.default>  show pcpu
cpuid        = 3
dynamic pcpu = 0xfffffe00b9e6fc40
curthread    = 0xfffff80001abe000: pid 0 tid 100018 critnest 1 "if_io_tqg_3"
curpcb       = 0xfffff80001abe520
fpcurthread  = none
idlethread   = 0xfffff80001aef000: tid 100006 "idle: cpu3"
self         = 0xffffffff82c13000
curpmap      = 0xffffffff81b81670
tssp         = 0xffffffff82c13384
rsp0         = 0xfffffe0038bd7000
kcr3         = 0x80000000313d1002
ucr3         = 0xffffffffffffffff
scr3         = 0x3efad7c4f
gs32p        = 0xffffffff82c13404
ldt          = 0xffffffff82c13444
tss          = 0xffffffff82c13434
curvnet      = 0xfffff800012a5980
db:0:kdb.enter.default>  bt
Tracing pid 0 tid 100018 td 0xfffff80001abe000
kdb_enter() at kdb_enter+0x33/frame 0xfffffe0038bd6660
panic() at panic+0x43/frame 0xfffffe0038bd66c0
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe0038bd6720
trap_pfault() at trap_pfault+0x46/frame 0xfffffe0038bd6770
calltrap() at calltrap+0x8/frame 0xfffffe0038bd6770
--- trap 0xc, rip = 0xffffffff80ddb5d7, rsp = 0xfffffe0038bd6840, rbp = 0xfffffe0038bd6970 ---
ip6_forward() at ip6_forward+0x2a7/frame 0xfffffe0038bd6970
ip6_input() at ip6_input+0x11f/frame 0xfffffe0038bd6a50
netisr_dispatch_src() at netisr_dispatch_src+0x9e/frame 0xfffffe0038bd6aa0
ether_demux() at ether_demux+0x149/frame 0xfffffe0038bd6ad0
ether_nh_input() at ether_nh_input+0x36a/frame 0xfffffe0038bd6b30
netisr_dispatch_src() at netisr_dispatch_src+0x9e/frame 0xfffffe0038bd6b80
ether_input() at ether_input+0x56/frame 0xfffffe0038bd6bd0
ether_demux() at ether_demux+0x97/frame 0xfffffe0038bd6c00
ether_nh_input() at ether_nh_input+0x36a/frame 0xfffffe0038bd6c60
netisr_dispatch_src() at netisr_dispatch_src+0x9e/frame 0xfffffe0038bd6cb0
ether_input() at ether_input+0x56/frame 0xfffffe0038bd6d00
iflib_rxeof() at iflib_rxeof+0xc0e/frame 0xfffffe0038bd6e00
_task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe0038bd6e40
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x14e/frame 0xfffffe0038bd6ec0
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc2/frame 0xfffffe0038bd6ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe0038bd6f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0038bd6f30
--- trap 0x9d9d3e64, rip = 0xb414471ee4b2474b, rsp = 0x3113c21961b5c24c, rbp = 0x5647a54d06e1a518 ---

fichtner commented 3 months ago

From a different user, a similar panic in a different code path:

ddb.txt:
db:0:kdb.enter.default>  run lockinfo
db:1:lockinfo> show locks
No such command; use "help" to list available commands
db:1:lockinfo>  show alllocks
No such command; use "help" to list available commands
db:1:lockinfo>  show lockedvnods
Locked vnodes
db:0:kdb.enter.default>  show pcpu
cpuid        = 3
dynamic pcpu = 0xfffffe00b6a73c40
curthread    = 0xfffff80001a30000: pid 12 tid 100044 critnest 1 "swi1: netisr 3"
curpcb       = 0xfffff80001a30520
fpcurthread  = none
idlethread   = 0xfffff80001a7d000: tid 100006 "idle: cpu3"
self         = 0xffffffff82c13000
curpmap      = 0xffffffff81b81670
tssp         = 0xffffffff82c13384
rsp0         = 0xfffffe00dfe64000
kcr3         = 0xffffffffffffffff
ucr3         = 0xffffffffffffffff
scr3         = 0x0
gs32p        = 0xffffffff82c13404
ldt          = 0xffffffff82c13444
tss          = 0xffffffff82c13434
curvnet      = 0xfffff80001268980
db:0:kdb.enter.default>  bt
Tracing pid 12 tid 100044 td 0xfffff80001a30000
kdb_enter() at kdb_enter+0x33/frame 0xfffffe00dfe63ac0
panic() at panic+0x43/frame 0xfffffe00dfe63b20
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00dfe63b80
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00dfe63bd0
calltrap() at calltrap+0x8/frame 0xfffffe00dfe63bd0
--- trap 0xc, rip = 0xffffffff80ddaee4, rsp = 0xfffffe00dfe63ca0, rbp = 0xfffffe00dfe63d10 ---
ip6_tryforward() at ip6_tryforward+0x264/frame 0xfffffe00dfe63d10
ip6_input() at ip6_input+0x537/frame 0xfffffe00dfe63df0
swi_net() at swi_net+0x138/frame 0xfffffe00dfe63e60
ithread_loop() at ithread_loop+0x257/frame 0xfffffe00dfe63ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe00dfe63f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00dfe63f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

fichtner commented 3 months ago

And the last one:

ddb.txt:
db:0:kdb.enter.default>  run lockinfo
db:1:lockinfo> show locks
No such command; use "help" to list available commands
db:1:lockinfo>  show alllocks
No such command; use "help" to list available commands
db:1:lockinfo>  show lockedvnods
Locked vnodes
db:0:kdb.enter.default>  show pcpu
cpuid        = 2
dynamic pcpu = 0xfffffe00b6a64c40
curthread    = 0xfffff800b9f04000: pid 8092 tid 100997 critnest 1 "AdGuardHome"
curpcb       = 0xfffff800b9f04520
fpcurthread  = 0xfffff800b9f04000: pid 8092 "AdGuardHome"
idlethread   = 0xfffff80001a7d740: tid 100005 "idle: cpu2"
self         = 0xffffffff82c12000
curpmap      = 0xfffff8001a8acd38
tssp         = 0xffffffff82c12384
rsp0         = 0xfffffe0123531000
kcr3         = 0xffffffffffffffff
ucr3         = 0xffffffffffffffff
scr3         = 0x0
gs32p        = 0xffffffff82c12404
ldt          = 0xffffffff82c12444
tss          = 0xffffffff82c12434
curvnet      = 0xfffff80001268980
db:0:kdb.enter.default>  bt
Tracing pid 8092 tid 100997 td 0xfffff800b9f04000
kdb_enter() at kdb_enter+0x33/frame 0xfffffe01235308d0
panic() at panic+0x43/frame 0xfffffe0123530930
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe0123530990
trap_pfault() at trap_pfault+0x46/frame 0xfffffe01235309e0
calltrap() at calltrap+0x8/frame 0xfffffe01235309e0
--- trap 0xc, rip = 0xffffffff80dd993a, rsp = 0xfffffe0123530ab0, rbp = 0xfffffe0123530be0 ---
in6_selectsrc() at in6_selectsrc+0x65a/frame 0xfffffe0123530be0
in6_selectsrc_socket() at in6_selectsrc_socket+0x41/frame 0xfffffe0123530c20
in6_pcbconnect() at in6_pcbconnect+0x172/frame 0xfffffe0123530ca0
udp6_connect() at udp6_connect+0x2d4/frame 0xfffffe0123530d20
soconnectat() at soconnectat+0xb1/frame 0xfffffe0123530d60
kern_connectat() at kern_connectat+0xe3/frame 0xfffffe0123530dc0
sys_connect() at sys_connect+0x81/frame 0xfffffe0123530e00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe0123530f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0123530f30
--- syscall (98, FreeBSD ELF64, connect), rip = 0x4906bf, rsp = 0x86c130d58, rbp = 0x86c130d58 ---

AdSchellevis commented 3 months ago

@fichtner Not sure if it's the same thing, but ip6_forward() looks suspicious when the traffic was intended for the host itself. Maybe @belotv would be so kind as to add some context to the ticket, such as which features are in use and whether any IPv6 forwarding or NAT is in play here. A (censored) ifconfig output would also be helpful. Just adding an IPv6 address to an interface and pinging it isn't enough to reproduce this (at least not on my end).
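
For anyone preparing a censored ifconfig output for a ticket like this, here is a minimal sketch of one way to do it. It is not an OPNsense tool: the function name `redact_ifconfig` and the placeholder strings are made up for illustration, and the patterns are deliberately simple rather than exhaustive. It keeps interface names, prefix lengths, zone suffixes, and the link-local versus non-link-local distinction, since those are usually what matters when debugging.

```python
import re

# Illustrative patterns only; they do not validate address syntax.
INET6_RE = re.compile(r"(inet6\s+)(\S+)")
MAC_RE = re.compile(r"\b(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}\b")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def _mask_inet6(match):
    """Mask an inet6 address but keep its %zone suffix and link-local marker."""
    addr, sep, zone = match.group(2).partition("%")
    label = "fe80::xxxx" if addr.lower().startswith("fe80:") else "xxxx::xxxx"
    return match.group(1) + label + sep + zone


def redact_ifconfig(text):
    """Return ifconfig-style text with IPv6, MAC, and IPv4 addresses masked."""
    # IPv6 goes first so that address tails are not mistaken for MAC addresses.
    text = INET6_RE.sub(_mask_inet6, text)
    text = MAC_RE.sub("xx:xx:xx:xx:xx:xx", text)
    text = IPV4_RE.sub("x.x.x.x", text)
    return text


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): ifconfig | python3 redact.py
    sys.stdout.write(redact_ifconfig(sys.stdin.read()))
```

Review the result by hand before posting; descriptions, hostnames, or tunnel endpoints in other fields are not touched by this sketch.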

fichtner commented 3 months ago

I collected all submitted panics for the scope of this ticket. One of these might be related to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279653 and https://reviews.freebsd.org/D45690

belotv commented 3 months ago

I am travelling today but will provide the ifconfig output tomorrow. I have reverted to 24.1 for the moment. I am using a WireGuard tunnel with IPv4+IPv6, so I do have outbound NAT. I am using the configuration from https://docs.opnsense.org/manual/how-tos/wireguard-selective-routing.html and exposing the WireGuard connection on VLAN 2. (I tried NPTv6 too and had the same issue on 24.7, but loading the very same config file on 24.1 works fine.) FYI, my WireGuard endpoint is an IPv6 address.

I also have the traffic shaper enabled (queues/pipes).

fichtner commented 2 months ago

First two: ip6_input() has no significant changes. Both panics go through it, one into ip6_forward() and the other into ip6_tryforward(), which is suspicious: two called functions in separate code paths can't really crash for the same reason unless there was already a problem earlier in the ip6_input() path from stock FreeBSD. Commit f257b8d7 isn't on stable/14 either, which makes it a prime candidate, and it will be included in RC2.

Third one: caused by AdGuardHome, which we don't provide, and we have no idea what FreeBSD version it was built against (not 14.1, judging by the timing). There are no changes in FreeBSD in that area either, so wait and see.
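
The comparison above can be reproduced mechanically from the ddb backtraces. The following is a rough sketch, not part of any OPNsense or FreeBSD tooling: the helper name `fault_and_caller` is hypothetical, and it assumes the `bt` output format shown in this ticket, where the frame directly after the first `--- trap ... ---` marker is the function that faulted and the next frame is its caller.

```python
import re

# One stack frame as printed by ddb's "bt", e.g. "ip6_input() at ip6_input+0x11f/frame 0x..."
FRAME_RE = re.compile(r"^(\w+)\(\) at ")
# The marker ddb prints where the trap interrupted normal execution.
TRAP_RE = re.compile(r"^--- trap \S+, rip = ")


def fault_and_caller(bt_text):
    """Return (faulting_function, caller) from a ddb backtrace, or None if not found."""
    lines = [line.strip() for line in bt_text.splitlines()]
    for index, line in enumerate(lines):
        if not TRAP_RE.match(line):
            continue
        names = []
        for following in lines[index + 1:index + 3]:
            frame = FRAME_RE.match(following)
            if frame:
                names.append(frame.group(1))
        return tuple(names) if len(names) == 2 else None
    return None


# Excerpt copied from the first backtrace in this ticket.
FIRST = """\
calltrap() at calltrap+0x8/frame 0xfffffe0038bd6770
--- trap 0xc, rip = 0xffffffff80ddb5d7, rsp = 0xfffffe0038bd6840, rbp = 0xfffffe0038bd6970 ---
ip6_forward() at ip6_forward+0x2a7/frame 0xfffffe0038bd6970
ip6_input() at ip6_input+0x11f/frame 0xfffffe0038bd6a50
"""

if __name__ == "__main__":
    print(fault_and_caller(FIRST))
```

Run against the three dumps in this ticket, the first two share ip6_input() as the caller while the third faults in in6_selectsrc() under in6_selectsrc_socket(), which is the split described above.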

fichtner commented 2 months ago

The ip6_input() one is still happening. I'll issue a debug kernel in RC2 to get to the bottom of it.

fichtner commented 2 months ago

ip6_forward(): see commit 9cb6d71f6a41d. The ip6_tryforward() panic is still unclear.

fichtner commented 2 months ago

ip6_tryforward() seems to be buggy upstream, judging by https://redmine.pfsense.org/issues/15640