Okay, I have dug into this further:
Workload: transferring an 80 GB file over SMB at ~1 Gb/s (gigabit link):
- OPNsense VM, SR-IOV (virtual functions): CPU load 150-200% (BSD bug?); disabling/enabling any offloading doesn't help
- OPNsense VM, virtio NICs (all offloading disabled): CPU load 20-30%
- Ubuntu 20.04 VM, SR-IOV (virtual functions): CPU load 1-3%
- Ubuntu 20.04 VM, virtio NICs (LRO disabled, CRC+TSO enabled): CPU load 4-6%
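For reference, a rough sketch of watching per-core CPU load inside the guests while such a transfer runs (standard tools, shown only as an illustration of how the numbers above can be observed):

```sh
# FreeBSD/OPNsense guest: per-CPU usage
top -P

# Ubuntu guest: per-CPU usage at one-second intervals (needs the sysstat package)
mpstat -P ALL 1
```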
So in short, I now use virtio NICs instead of the more professional solution, because of FreeBSD bugs again. The alternative would be switching to a Linux solution, but there is nothing comparable to OPNsense: IPFire is a pain and OpenWrt probably is too.
The only real option left is a vanilla Linux VM, without any nice GUI, configuring everything yourself.
Hopefully one day we get a Linux-based project similar to OPNsense/pfSense; it would get insanely popular. But I guess the BSD license is more attractive than the GPL 🤦‍♂️
Cheers
After some more digging, I think this is related to https://github.com/opnsense/src/issues/104
The SR-IOV virtual functions are passed through as PCI instead of PCIe 🤦‍♂️, so the ixgbevf module causes some overhead.
But aside from the passthrough issue, BSD/OPNsense should still show the actual CPU load, and it doesn't. See the pictures above.
However, this won't get fixed here anyway, even with the patch provided in the other thread. Cheers
An update.
I found out something very weird while upgrading my switch and the overall wiring (that's how I noticed this at all).
OPNsense is connected over only one interface (with multiple VLANs).
I'm using the X550 SR-IOV virtual function in OPNsense again.
OPNsense: KVM on Proxmox (CPU type set to host or EPYC for OPNsense, real CPU: Ryzen 5800X, ballooning off, RAM for OPNsense: 4 GB, 32 GB NVMe storage)
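For context, a minimal sketch of what that VM definition could look like on the Proxmox side (the VM ID 100 and the PCI address are made-up examples; the option names are standard `qm` ones):

```sh
# hypothetical VM ID and VF PCI address, shown only to illustrate the settings above
qm set 100 --cpu host                   # or: --cpu EPYC
qm set 100 --memory 4096 --balloon 0    # 4 GB RAM, ballooning off
qm set 100 --hostpci0 0000:03:10.0      # one X550 virtual function passed through
```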
This explains why (see previous posts) the KVM virtio NIC used less CPU power inside OPNsense (virtio reports a 10 Gb/s link speed).
Test 1 (iperf3 between pc1 & pc2): tests with the SR-IOV virtual function, iperf3 between pc1 on VLAN 25 and pc2 on VLAN 27 (pc1 & pc2 only have gigabit NICs; OPNsense, pc1 and pc2 are all connected through a 10 Gb/s switch). If the physical link speed is connected with:
Test 2 (iperf3 between pc1 & OPNsense): this gets even weirder; same conditions as in test 1, but an iperf3 test from pc1 directly to OPNsense:
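For reference, the two tests are plain iperf3 runs, roughly like this (the IP addresses are made-up examples, not my actual ones):

```sh
# Test 1: pc2 (VLAN 27) runs the server, pc1 (VLAN 25) the client; traffic is routed by OPNsense
iperf3 -s                    # on pc2
iperf3 -c 192.168.27.10      # on pc1 (pc2's address, example only)

# Test 2: the server runs on the OPNsense box itself, pc1 is still the client
iperf3 -s                    # on OPNsense (pkg install iperf3)
iperf3 -c 192.168.25.1       # on pc1 (OPNsense's VLAN 25 address, example only)
```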
This definitely looks like a bug to me; anything else makes no sense at all. IPS/IDS/Suricata and all the fancy packet inspection stuff is turned off. Firewall rules are almost non-existent, only basic rules to allow everything. Offloading (any combination on/off) makes no difference. So this is almost a pure routing-only test.
So this is probably something that people don't even notice on bare-metal OPNsense installs.
I have read the contributing guidelines at https://github.com/opnsense/core/blob/master/CONTRIBUTING.md. I have searched the existing issues and I am convinced that mine is new.
OPNsense version: latest, 21.1.3
Hopefully this is now "complete" and can be tagged as a "bug".
Cheers
If packets cross from the kernel to userspace because you run the speed test on the OPNsense box itself, you see a lot more load due to context switches: grabbing packets, moving them to userspace and back to the kernel... In the routing case they never leave the kernel network stack. This is not strange. :)
I think we always recommend not running speed tests on the box itself. What you are testing then is not a firewall, it's a VM with a service like web or mail...
Cheers, Franco
Aah, I understand, it makes sense now 🤦‍♂️
Well, in my logic, since the traffic crosses only one interface (pc1 to OPNsense) instead of two interfaces (pc1 - OPNsense - pc2), I thought it should use less CPU load.
However, after your reply I came to another idea: maybe there is something wrong with KVM itself, or KVM with FreeBSD... Other KVM VMs don't show this impact; I already wrote above about an Ubuntu instance, but I tested VyOS too and it doesn't produce any load either. I could test with pfSense or plain FreeBSD too, but those would all be routing tests again...
I would like to run a small load test on the OPNsense instance itself that has nothing to do with routing. Tools like "stress" only produce maximum CPU load anyway. Is there any way to generate something like 10-15% CPU load on OPNsense itself? Then I would compare it with what the real CPU usage is...
I hope it's understandable 🙂
Sure, you are looking for https://www.freshports.org/benchmarks/stress-ng/
# pkg install stress-ng
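For a partial load like that, something along these lines should work (just a sketch; the percentage option is described in the stress-ng man page):

```sh
# generate roughly 15% load on one CPU for a minute
stress-ng --cpu 1 --cpu-load 15 --timeout 60s
```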
Cheers, Franco
Wow, that worked really well, the load matches almost exactly 🙂
So it definitely has something to do with the routing/drivers/etc. 🤦‍♂️
If I can run any other tests or provide anything else, let me know.
Thanks!
I think as soon as the driver specifics kick in for networking there can be a lot of redundant load or I/O wait, depending on the implementation. There isn't much we can do other than move to the next FreeBSD releases and see if the situation improves. Some driver options work better than others. It may be worth trying what works best for you.
In terms of support this is all I can offer.
Cheers, Franco
Well, I can try FreeBSD 13 myself, it will be released shortly anyway. And thank god there is a Q35 fix.
I will try it out and report back.
I didn't expect anything anyway and don't expect you to provide support. I mean, I can move my butt and try to find out exactly what the problem is and fix or work around it. So yeah, it's all good. Thanks very much for the replies.
If anyone has the ability to try out virtual functions (ixgbe/ivx based) on a hypervisor (KVM if possible), please test this too 🙂 Any result is good; whether you have the same issue or it works perfectly, both cases help. Thanks ✌️
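For anyone willing to test: a rough sketch of creating the virtual functions on a Linux/KVM host (the interface name enp3s0f0 is just an assumed example; the sysfs knob is the standard SR-IOV one):

```sh
# on the hypervisor: create 2 virtual functions on the X550 port
echo 2 > /sys/class/net/enp3s0f0/device/sriov_numvfs
# list the resulting VF PCI devices, to find the address to pass through
lspci | grep -i "virtual function"
```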
Aah, and about the ivx driver itself: I tried out the default OPNsense one, which has a weird version number (because it's higher than Intel's), and I also compiled Intel's latest v1.5.25 ivx module.
Both behave exactly the same.
The only difference, which I don't understand at all, is that the one included in OPNsense is 263 KB, if I remember right, while the Intel one I compiled (1.5.25) is about 15 KB.
Both work fine, or equally not fine (depending on this issue here). It's just that the size difference is so dramatic that I don't understand it 🤦‍♂️
Edit: I mean ixv, not ivx 🤦‍♂️
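If anyone wants to reproduce the size comparison, a rough sketch (the module file name if_ixv.ko and the paths are assumptions; adjust them to wherever your build installs):

```sh
# which ixv module is currently loaded, and its reported size
kldstat | grep ixv
# compare on-disk sizes of the stock module and a self-compiled one
ls -lh /boot/kernel/if_ixv.ko /boot/modules/if_ixv.ko
```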
This issue has been automatically timed-out (after 180 days of inactivity).
For more information about the policies for this repository, please read https://github.com/opnsense/core/blob/master/CONTRIBUTING.md for further details.
If someone wants to step up and work on this issue, just let us know, so we can reopen the issue and assign an owner to it.
Hey @fichtner
Sorry for the late reply, but I just stumbled over this again by accident.
However, everything is fixed! In the meantime I even switched from SR-IOV virtual functions (ixv) to the physical function (ix0). That means I'm no longer passing through a virtual function from the X550 card; I'm now passing through the entire physical adapter/port.
That said, VFs already worked great once I passed them through as PCIe with Q35; I think that has been the case since OPNsense was based on FreeBSD 13.1. PFs work great now too.
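For reference, passing the whole physical port through on Proxmox looks roughly like this (VM ID and PCI address are again made-up examples; pcie=1 requires the q35 machine type):

```sh
# pass the entire X550 port (physical function) through as a PCIe device
qm set 100 --machine q35
qm set 100 --hostpci0 0000:03:00.0,pcie=1
```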
There is no big overhead anymore; if I repeat the tests I did two years ago, there is only a small overhead of around 2%.
But to be fair, the whole situation has changed: if I run a speed test now with iperf at 10 Gb/s, total CPU consumption is 5% at most (both inside OPNsense and in what Proxmox reports), and that's with all the hardware acceleration features enabled inside OPNsense.
About the hardware acceleration: I can confirm that without IDS/Suricata/etc. everything works perfectly. I'm using 5 VLANs (on the same adapter) with a lot of firewall rules and everything works absolutely perfectly.
The only strange issue I came across is that enabling all hardware offloading increases the latency. Network speed itself is fine; it's just that when I run a speed test, the latency goes from 4 ms to 6 ms, although sometimes the speed test still shows a ping of 4 ms (so the same latency). Anyway, I just wanted to say that hardware offloading works perfectly fine with the ix0 driver for the X550-T2. Hope that helps someone.
I just have to play around myself and check what exactly causes the latency increase, whether it's CRC/LRO/TSO or VLAN hardware filtering. Otherwise I don't need IDS/Suricata etc. here anyway, so there is no reason not to use offloading.
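For that testing, the individual offloads can be toggled from a shell with the standard FreeBSD ifconfig flags (only a sketch for quick comparisons; in OPNsense the persistent setting belongs in the GUI):

```sh
# turn individual offloads off on ix0, one at a time, and re-run the latency test
ifconfig ix0 -rxcsum -txcsum    # checksum (CRC) offload off
ifconfig ix0 -tso               # TCP segmentation offload off
ifconfig ix0 -lro               # large receive offload off
ifconfig ix0 -vlanhwfilter      # VLAN hardware filtering off
# re-enable by dropping the leading "-", e.g. ifconfig ix0 tso lro
```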
Thanks @fichtner, by the way, for the replies, and for spending your time here a year ago reading my crap xD
Cheers!
OPNsense: newest, up to date
My Network:
Bare metal: Proxmox 6.3 (5800X / 64 GB RAM)
- OPNsense VM without virtio NICs, only 2 passed-through virtual functions from the X550
- Offloading: LRO disabled, CRC+TSO enabled
- NIC1 (2.5 Gb/s link)
  - UTag 25: SRV Net (192.168.25.0/24)
- NIC2 (2.5 Gb/s link)
  - Tag 24: WAN_Transfer (PPPoE)
  - Tag 27: USER Net (192.168.27.0/24)
  - Tag 28: WLAN Net (192.168.28.0/24)

Computer: W10 (5600X / 64 GB RAM)
- NIC1 (1 Gb/s link)
  - UTag 27: USER Net
Workload:
Issue:
Pictures: