opnsense / core

OPNsense GUI, API and systems backend
https://opnsense.org/
BSD 2-Clause "Simplified" License

CPU Load - OPNsense/BSD doesn't show it. #4785

Closed Ramalama2 closed 3 years ago

Ramalama2 commented 3 years ago

OPNsense: newest, up to date

My Network:

Workload:

Issue:

Pictures:

Ramalama2 commented 3 years ago

Okay, I have dug into this further:

So in short, I now use virtio NICs instead of a more professional solution, because of FreeBSD bugs again. The alternative would be switching to a Linux solution, but there is nothing comparable to OPNsense. IPFire is a pain and OpenWrt probably is too.

The only real alternative would be a vanilla Linux VM... without any nice GUI, configuring everything yourself.

Hopefully we get a Linux fork one day, similar to OPNsense/pfSense; that would get insanely popular. But I guess the BSD license is more attractive than the GPL 🤦‍♂️

Cheers

Ramalama2 commented 3 years ago

After some more digging, I think this is related to https://github.com/opnsense/src/issues/104.

SR-IOV virtual functions are passed through as PCI instead of PCIe 🤦‍♂️, so the ixgbevf module causes some overhead.

But aside from the passthrough issue, BSD/OPNsense should still show the actual CPU load, and it doesn't. See the pictures above.

However, this probably won't get fixed here anyway, even with the patch provided in the other thread. Cheers
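
For context, here is a minimal sketch of what passing a VF through as PCIe (instead of plain PCI) looks like on a Proxmox/KVM host with the Q35 machine type. The VM ID and PCI address below are placeholders, not taken from my setup:

```sh
# placeholder VM ID 100; use the VF's actual PCI address on the host
qm set 100 --machine q35
qm set 100 --hostpci0 0000:03:10.0,pcie=1   # pcie=1 only has an effect with the q35 machine type
```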

Ramalama2 commented 3 years ago

An update.

I found out something very weird while upgrading my switch and the overall wiring (that's how I noticed this at all).

Test 1 (iperf3 between pc1 & pc2): tests with an SR-IOV virtual function, iperf3 between pc1 on VLAN 25 and pc2 on VLAN 27 (pc1 & pc2 only have gigabit NICs; OPNsense, pc1 and pc2 are connected through a 10 Gb/s switch). If the physical link speed is:

Test 2 (iperf3 between pc1 & OPNsense): this gets even weirder; same conditions as in test 1, but an iperf3 test from pc1 directly to OPNsense:
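
For anyone who wants to reproduce this, a rough sketch of the two tests (the IP addresses are placeholders; for test 2, iperf3 has to be installed on OPNsense via pkg):

```sh
# Test 1: traffic routed through OPNsense (server on pc2, VLAN 27)
pc2$ iperf3 -s
pc1$ iperf3 -c <pc2-vlan27-ip> -t 30

# Test 2: traffic terminated on the firewall itself (server on OPNsense)
opnsense$ iperf3 -s
pc1$ iperf3 -c <opnsense-vlan25-ip> -t 30
```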

This looks to me like definitely a bug; anything else makes no sense at all. IPS/IDS/Suricata and all the fancy packet inspection stuff is turned off. Firewall rules are almost non-existent, only basic rules to allow everything. Offloading (on or off, in any combination) makes no difference. So this is almost a clean routing-only test.

So this is probably something that people don't even notice on bare-metal OPNsense installs.


I have read the contributing guidelines at https://github.com/opnsense/core/blob/master/CONTRIBUTING.md. I have searched the existing issues and I am convinced that mine is new.

OPNsense version: latest, 21.1.3

Hopefully this is now "complete" and can be tagged as a "bug".

Cheers

fichtner commented 3 years ago

If packets cross from kernel to userspace because you run the speed test on the OPNsense box itself, you see a lot more load due to context switches: grabbing packets, moving them to userspace and back to the kernel... In the routing case they never leave the kernel network stack. This is not strange. :)

I think we always recommend to not run speed tests on the box itself. What you are testing is not a firewall, it's a VM with a service like web or mail...

Cheers, Franco

Ramalama2 commented 3 years ago

Aah, I understand, and it makes sense now 🤦‍♂️

Well, in my logic, since the traffic only crosses one interface (pc1 to OPNsense) instead of two interfaces (pc1 - OPNsense - pc2), I thought it should use less CPU 😂😂😂😂😂😂😂

However, after your reply I came to another idea: maybe there is something wrong with KVM itself, or with KVM and FreeBSD... Other KVM VMs don't show any impact; I already wrote above about an Ubuntu instance, but I tested VyOS too and VyOS doesn't produce any load either. I could test with pfSense or plain FreeBSD too, but that would all be routing tests again...

I would like to run a small load test on the OPNsense instance itself that has nothing to do with routing. Things like "stress" only produce maximum CPU load anyway. Is there any way to generate something like 10-15% CPU load on the OPNsense box itself? Then I could compare it with what the real CPU usage is...

I hope it's understandable πŸ™ˆ

fichtner commented 3 years ago

Sure, you are looking for https://www.freshports.org/benchmarks/stress-ng/

# pkg install stress-ng
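
For example, to hold the box at roughly 15% CPU for a minute (a sketch; adjust the worker count and duration as needed):

```sh
# one CPU worker throttled to about 15% utilisation for 60 seconds
stress-ng --cpu 1 --cpu-load 15 --timeout 60s
```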

Cheers, Franco

Ramalama2 commented 3 years ago

Wow, that worked really well, the load matches almost exactly 🙈

So it definitely has something to do with the routing/drivers/etc. 🤦‍♂️

If I can run some other tests or provide anything else, let me know.

Thanks!

fichtner commented 3 years ago

I think as soon as the driver specifics kick in for networking there can be a lot of redundant load or I/O wait, depending on the implementation. There isn't much we can do other than move to the next FreeBSD releases and see if the situation improves. Some driver options work better than others. It may be worth trying what works best for you.

In terms of support this is all I can offer.

Cheers, Franco

Ramalama2 commented 3 years ago

Well, I can try FreeBSD 13 myself; it will get released shortly anyway. And thank god there is a Q35 fix.

However, I will try it out and report back.

I didn't expect anything anyway and don't expect you to provide support. I mean, I can move my butt and try to find out exactly what the problem is and fix or work around it. So yeah, it's all good. Thanks very much for the replies.

If anyone has the ability to try out virtual functions (ixgbe/ivx) on a hypervisor (KVM if possible), please test this too 🙈 Any result is good; whether you have the same issue or it works perfectly, both cases help. Thanks ✌️
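
If it helps, here is a rough sketch of how VFs are typically created on a Linux/KVM host before passing them to the VM (the interface name is a placeholder; this assumes an SR-IOV capable NIC and driver):

```sh
# create two virtual functions on the physical port (placeholder name enp3s0f0)
echo 2 > /sys/class/net/enp3s0f0/device/sriov_numvfs

# list the new VFs and their PCI addresses
lspci | grep -i "virtual function"
```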

Ramalama2 commented 3 years ago

Aah, and about the ivx driver itself: I tried out the default OPNsense one, which has a weird version number (it's higher than Intel's), and I also compiled Intel's latest v1.5.25 ivx module.

Both behave exactly the same.

The only difference, which I don't understand at all, is that the one included in OPNsense is 263 kB, if I remember right, while the Intel one I compiled (1.5.25) is something like 15 kB.

Both work fine, or not fine (depending on this issue here). It's just that the size difference is so dramatic that I don't understand why 🤦‍♂️😂

Edit: I mean ixv, not ivx 🤦‍♂️
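
In case someone wants to compare the two modules on their own box, a quick sketch (this assumes the driver is loaded as if_ixv.ko and not compiled into the kernel; paths may differ):

```sh
# show loaded kernel modules and their sizes
kldstat | grep -i ixv

# compare file sizes of the stock module and a locally built one
ls -lh /boot/kernel/if_ixv.ko /boot/modules/if_ixv.ko
```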

OPNsense-bot commented 3 years ago

This issue has been automatically timed-out (after 180 days of inactivity).

For more information about the policies for this repository, please read https://github.com/opnsense/core/blob/master/CONTRIBUTING.md.

If someone wants to step up and work on this issue, just let us know, so we can reopen the issue and assign an owner to it.

Ramalama2 commented 1 year ago

Hey @fichtner

Sorry for the late reply, but I just stumbled over this again by accident.

However, everything is fixed! In the meantime I even switched from an SR-IOV virtual function (ixv) to the physical function (ix0). That means I'm no longer passing through the virtual function of the X550 card; I'm now passing through the entire physical adapter/port.

That said, VFs already worked great when passed through as PCIe with Q35; I think that has been the case since OPNsense was based on FreeBSD 13.1. PFs work great now too.

There is no big overhead anymore. If I run the tests again like I did two years ago, there is a small overhead of around 2%.

To be clear, the whole situation has changed: if I run a speed test now with iperf at 10 Gb/s, total CPU consumption is 5% at most (both inside OPNsense and in what Proxmox reports). And that's with all the hardware acceleration stuff enabled inside OPNsense.

About the hardware acceleration stuff: I can confirm that without IDS/Suricata/etc., everything works perfectly. I'm using 5 VLANs (on the same adapter) with a lot of firewall rules and everything works absolutely perfectly.

The only strange issue I came across is that enabling all hardware offloading affects latency. Network speed itself is fine and everything; it's just that when I run a speed test, latency increases from 4 ms to 6 ms, and sometimes my speed test shows a ping of 4 ms (so the same latency). Anyway, I just wanted to say that hardware offloading works perfectly fine with the ix0 driver for the X550-T2. Hope that helps someone.

I just have to play around myself and check what exactly causes the latency increase, whether it's CRC, LRO, TCP segmentation offloading or VLAN hardware filtering (a rough way to test this from the shell is sketched below). Otherwise I don't need IDS/Suricata etc. here anyway, so there's no reason not to use offloading.
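
For reference, a rough sketch of toggling the individual offload features from a shell to narrow it down (ix0 assumed as the interface; these manual changes are only for testing, are lost on reboot, and the persistent settings stay in the GUI):

```sh
# check which offload capabilities are currently enabled
ifconfig ix0 | grep options

# disable one feature at a time, then re-run the latency test
ifconfig ix0 -lro             # large receive offload off
ifconfig ix0 -tso             # TCP segmentation offload off
ifconfig ix0 -rxcsum -txcsum  # checksum offload off
ifconfig ix0 -vlanhwfilter    # VLAN hardware filtering off

# re-enable everything afterwards
ifconfig ix0 lro tso rxcsum txcsum vlanhwfilter
```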

Thanks @fichtner, by the way, for the replies, and for spending your time here a year ago reading my crap xD

Cheers!