Closed cbrueffer closed 3 years ago
@cbrueffer thanks a lot! The original commit went to releng/12.1, but the fix did not. It's only on releng/12.2. See b9315bd38115
Not sure if this makes 21.1.3 next week, but if you want I can provide a test kernel today.
Cheers, Franco
That would be great @fichtner, thanks!
Here you go:
# opnsense-update -zbkr 21.7.a_40
If it checks out please close the issue and we will ponder about the backport urgency internally.
Thanks, Franco
Thanks! The issues I'm seeing are a bit unpredictable, so it may take a few days before I can reliably say whether it helped.
No problem :)
Stupid question, but:
root@gw:~ # opnsense-update -bkr 21.7.a_40
Fetching base-21.7.a_40-amd64.txz: .. failed, no signature found
root@gw:~ # opnsense-update -bikr 21.7.a_40
Fetching base-21.7.a_40-amd64.txz: .. failed, no update found
What's the best way to make this work?
remove the b :) [-kr]
Same thing:
root@gw:~ # opnsense-update -kr 21.7.a_40
Fetching kernel-21.7.a_40-amd64.txz: .. failed, no signature found
root@gw:~ # opnsense-update -ikr 21.7.a_40
Fetching kernel-21.7.a_40-amd64.txz: .. failed, no update found
OK, it looks like they're published to snapshots (both base and kernel). Can you try:
opnsense-update -bkzr 21.7.a_40
That appears to be working; thanks!
Sorry, I do not heed my own safeguard additions. Ad is correct, -z is used to select snapshots which this is... :)
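For reference, the flag combination that ended up working can be sketched as a small dry-run helper. This is only an illustration: `snapshot_update_cmd` is a hypothetical function, not part of OPNsense, and it prints the command instead of running it. The flag meanings in the comments are taken from the fetch output and the explanation in this thread; the kernel-only variant with `-z` was not tried here and is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of OPNsense): prints the opnsense-update
# command for a snapshot test build instead of executing it.
snapshot_update_cmd() {
    release="$1"     # e.g. 21.7.a_40, the test build from this thread
    sets="${2:-bk}"  # b = base set, k = kernel set (defaults to both)
    # -z selects snapshots; -r names the release and must come last
    # because it takes the release string as its argument.
    printf 'opnsense-update -%szr %s\n' "$sets" "$release"
}

snapshot_update_cmd 21.7.a_40     # base and kernel, as used above
snapshot_update_cmd 21.7.a_40 k   # kernel only (untested assumption)
```

Without `-z` the same release string is not looked up in snapshots, which matches the "no signature found" failures shown earlier.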
The patch unfortunately hasn't solved my specific problem, but it also hasn't been detrimental.
Considering the severe symptoms some people described in the pfSense forum thread, it may be good to include it in 21.1.3 nonetheless.
TBH, we haven't seen the issues described there, and 2.4.5 is not even on 12.1, so it may have been another issue with the backport to 11? It doesn't look like smooth sailing into 21.1.3 if it adds no value.
What symptoms are you experiencing? Since 20.7 I guess? Or 21.1? It's not clear from the report...
Cheers, Franco
The problem is described in https://forum.opnsense.org/index.php?topic=21145.0; basically I'm seeing recurrent 30-50 second network outages on one APU2D4 igb(4) interface carrying three VLANs. While I'm not 100% sure our OPNsense router is at fault, it does increasingly look like it (no other colo customers experience this problem).
I've had the first reports when I was using 20.7.4, but it may have occurred before that. Some of the symptoms described in the pfSense thread sounded similar to what I'm seeing, which brought me to this patch.
Like I wrote on the forum, I'm a bit suspicious of the iflib'ified igb(4). There have been several bugfixes in that area not currently in OPNsense, so my weekend project is testing those and seeing how it looks. For easier testing, is there anything in OPNsense that likely wouldn't work with a stock FreeBSD 12-STABLE kernel?
Edit: I should note that I used 19.7.X and 20.1.X on the same hardware and settings without problems.
This can be closed, OPNsense was not at fault for the mentioned VLAN issue.
Thanks for the follow up! 😊
FreeBSD r345177 moved pf stats to Counter variables. This introduced CPU load and system stability issues in SMP environments with large pf tables. Until the fix landed in a follow-up commit (HEAD r360903, r361451 in 12-STABLE), the only workaround was to disable SMP.
OPNsense contains the original pf Counter commit, but not the subsequent fix.
pfSense forum thread where the symptoms around this issue were originally described: https://forum.netgate.com/topic/149595/2-4-5-a-20200110-1421-and-earlier-high-cpu-usage-from-pfctl/71?lang=en-US