Closed lelemka0 closed 7 months ago
Hi,
Looks like an issue in the vtnet(4) driver during packet receive. Not much else can be said here for now. Perhaps https://bugs.freebsd.org has some clues about this...
Cheers, Franco
This problem keeps coming up, and I've been looking into it and experimenting over the past while. I found that it only occurs when both WAN ports are in use; the system ran stably for a long time no matter which one I turned off.
Before each panic, there is a large amount of repeated similar content in the log, as follows:
<7>cannot forward src fe80:3::1, dst <my vtnet2 port's expired ipv6 address>, nxt 58, rcvif vtnet2, outif pppoe0
Among them, vtnet2 and pppoe0 are the two WAN ports.
I tried capturing packets on the vtnet2 interface, but I never found a packet with the source address fe80:3::1. Based on nxt 58 and the timing, I think this packet is an ICMPv6 type 135 (Neighbor Solicitation) from an ISP device with link-local address fe80::1, and the requested target is the last expired IPv6 address on vtnet2.
It appears that since the address does not exist on the router, the request is forwarded to the default gateway and fails.
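To make the reasoning concrete (this is my own sketch, not code from the thread): in an IPv6 packet, byte 6 of the header is the Next Header field (58 = ICMPv6), and the first byte after the 40-byte IPv6 header is the ICMPv6 type (135 = Neighbor Solicitation). A minimal hand-built packet, with simplified placeholder addresses:

```python
import struct

# Hand-built IPv6 header + ICMPv6 Neighbor Solicitation, to show why
# "nxt 58" plus ICMPv6 type 135 identifies an NS. Addresses are
# simplified placeholders, not the ones from the log above.
src = bytes.fromhex("fe80") + bytes(13) + bytes([0x01])   # fe80::1
dst = bytes.fromhex("ff02") + bytes(13) + bytes([0x01])   # ff02::1 (simplified)

# ICMPv6: type 135 (NS), code 0, checksum 0 (not computed here),
# 4 reserved bytes, 16-byte target address (zeroed placeholder).
icmp6 = struct.pack("!BBH", 135, 0, 0) + bytes(4) + bytes(16)

# IPv6: version 6, payload length, next header 58 (ICMPv6), hop limit 255.
ip6 = struct.pack("!IHBB", 0x60000000, len(icmp6), 58, 255) + src + dst

packet = ip6 + icmp6
print(packet[6], packet[40])   # Next Header and ICMPv6 type: 58 135
```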
I see that the firewall rules allow ICMPv6 types 135 and 136 from and to all interfaces by default, and I can't do anything about that.
This may not be a driver issue. Could you take a look at it for me? Thanks very much.
Latest log:
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x10
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80ea3b9c
stack pointer = 0x28:0xfffffe008fa183f0
frame pointer = 0x28:0xfffffe008fa18510
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (irq30: virtio_pci2)
trap number = 12
panic: page fault
cpuid = 1
time = 1694263139
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008fa181b0
vpanic() at vpanic+0x151/frame 0xfffffe008fa18200
panic() at panic+0x43/frame 0xfffffe008fa18260
trap_fatal() at trap_fatal+0x387/frame 0xfffffe008fa182c0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe008fa18320
calltrap() at calltrap+0x8/frame 0xfffffe008fa18320
--- trap 0xc, rip = 0xffffffff80ea3b9c, rsp = 0xfffffe008fa183f0, rbp = 0xfffffe008fa18510 ---
ip6_forward() at ip6_forward+0x60c/frame 0xfffffe008fa18510
pf_refragment6() at pf_refragment6+0x14f/frame 0xfffffe008fa18560
pf_test6() at pf_test6+0xfdf/frame 0xfffffe008fa186d0
pf_check6_out() at pf_check6_out+0x40/frame 0xfffffe008fa18700
pfil_run_hooks() at pfil_run_hooks+0x97/frame 0xfffffe008fa18740
ip6_tryforward() at ip6_tryforward+0x2ce/frame 0xfffffe008fa187c0
ip6_input() at ip6_input+0x5e4/frame 0xfffffe008fa188a0
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe008fa188f0
ether_demux() at ether_demux+0x159/frame 0xfffffe008fa18920
ng_ether_rcv_upper() at ng_ether_rcv_upper+0x8c/frame 0xfffffe008fa18940
ng_apply_item() at ng_apply_item+0x2bf/frame 0xfffffe008fa189d0
ng_snd_item() at ng_snd_item+0x28e/frame 0xfffffe008fa18a10
ng_apply_item() at ng_apply_item+0x2bf/frame 0xfffffe008fa18aa0
ng_snd_item() at ng_snd_item+0x28e/frame 0xfffffe008fa18ae0
ng_ether_input() at ng_ether_input+0x4c/frame 0xfffffe008fa18b10
ether_nh_input() at ether_nh_input+0x1f2/frame 0xfffffe008fa18b70
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe008fa18bc0
ether_input() at ether_input+0x69/frame 0xfffffe008fa18c20
ether_demux() at ether_demux+0xa0/frame 0xfffffe008fa18c50
ether_nh_input() at ether_nh_input+0x36b/frame 0xfffffe008fa18cb0
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe008fa18d00
ether_input() at ether_input+0x69/frame 0xfffffe008fa18d60
vtnet_rxq_eof() at vtnet_rxq_eof+0x80/frame 0xfffffe008fa18e20
vtnet_rx_vq_process() at vtnet_rx_vq_process+0xb7/frame 0xfffffe008fa18e60
ithread_loop() at ithread_loop+0x25a/frame 0xfffffe008fa18ef0
fork_exit() at fork_exit+0x7e/frame 0xfffffe008fa18f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008fa18f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
Uptime: 4h6m29s
---<<BOOT>>---
Version:
OPNsense 23.7.3-amd64
FreeBSD 13.2-RELEASE-p2
OpenSSL 1.1.1v 1 Aug 2023
Maybe it’s the same problem as #184
Different bug. I have the suspicion this is something inherently broken in FreeBSD 13 and nobody bothers to fix it there. We tried to apply a bandaid but it may not be working for everyone, see https://github.com/opnsense/src/commit/8bf1ae0b1e5987fc07743928cf3aa0d501439d37
I meant to post https://github.com/opnsense/src/commit/fe901c3661ea71f6aa688098184e07fd3a0d85bd, but looking at it: in your case it's ip6_forward() that fails, which is exactly what that commit addressed, since the new path goes through ip6_output() instead. Strange.
No response after posting a probable fix...
In the latest version (OPNsense 24.1.4-amd64, FreeBSD 13.2-RELEASE-p10, OpenSSL 3.0.13), this still occurs from time to time. I have observed that the panics seem to follow a regular interval of about 4 hours, unaffected by deliberate restarts; they always appear around these times: 0:10, 4:10, 8:10, 12:10, 16:10, 20:10.
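Just to make that pattern explicit (my own check, based only on the times reported above): the panic times all sit on a fixed 4-hour grid anchored at 00:10, which points at a time-of-day trigger on the network rather than anything tied to uptime:

```python
# The reported panic times all satisfy minutes-since-midnight % 240 == 10,
# i.e. a fixed 4-hour schedule starting at 00:10.
times = ["0:10", "4:10", "8:10", "12:10", "16:10", "20:10"]
minutes = [int(h) * 60 + int(m) for h, m in (t.split(":") for t in times)]
print(all(m % 240 == 10 for m in minutes))  # True
```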
I'm not sure why this is, but as a temporary workaround I turned off IPv6 completely (set the IPv6 configuration type on all interfaces to None), and the panic no longer occurs.
Having a very similar problem to you, how specifically did you turn ipv6 off completely?
I'm relatively sure the problem is gone from 24.7, if not 24.1.x too.
I just updated my OPNsense installation last night and it crashed on me an hour ago, so for me the problem is definitely not fixed. If you want to look through my crash reports, I've been sending them in immediately after each one; my email is Fgtfv567@gmail.com. Any insight into my problem would be welcome.
how specifically did you turn ipv6 off completely?
Set the IPv6 configuration type of all interfaces to None.
Actually, for me the regular OPNsense crashes were caused by dnscrypt-proxy running latency tests against IPv6 upstream servers. I realized this by chance: since I disabled the IPv6 DNS servers in dnscrypt-proxy, the kernel panic has never happened again, so I believe the root cause lies in heavy, frequent IPv6 ICMP traffic.
This problem exists in 24.1 indeed, but I have not tried it in 24.7.
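For anyone wanting to try the same workaround: in dnscrypt-proxy's `dnscrypt-proxy.toml` these are, as far as I know, the relevant options (check the example config shipped with your version):

```toml
# dnscrypt-proxy.toml -- keep resolver selection IPv4-only
ipv6_servers = false   # don't use (or latency-test) resolvers reachable over IPv6
block_ipv6 = true      # optional: answer AAAA queries locally instead of forwarding
```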
Describe the bug
I get kernel panics from time to time. I'm not sure what's causing this; it has been happening since I upgraded OPNsense from 23.1.7_3 to 23.1.9, 2-3 times a day, after which the system automatically restarts. I didn't modify any configuration.
To Reproduce
In my case it started after the upgrade; I'm not sure how to reproduce it.
Expected behavior
no kernel panic
Relevant log files
Environment
OPNsense 23.1.9-amd64, OpenSSL, on Proxmox VE (i440fx, CPU: host), Intel(R) Xeon(R) CPU E3-1265L v3 @ 2.50GHz (4 cores, 4 threads), VirtIO network