bernhardschmidt opened 1 year ago
@bernhardschmidt thanks for sharing. Are clients connecting via TCP?
There are UDP and TCP processes, but the test client connected over UDP.
Oh. This is interesting. Technically there was no change on the UDP fast path between dco and dco-v2 (while there were changes for TCP). I wonder if there is something outside of DCO playing a role here, hmm.
I will run some extra tests on my side. Maybe after your vacation you can try to reproduce it with the test rig, to have comparable results.
Thanks so far!
Hi @bernhardschmidt! I was wondering if you had any chance to run some extra tests to shed some light on your previously reported results. Thanks!
I'm observing something similar.

Test setup: two network namespaces connected by veth interfaces (set up manually, not via netns-test.sh from the repo).
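For reference, a manual rig along these lines matches that description; the namespace names and underlay addresses here are assumptions (the 192.168.18.x addresses in the iperf output below are the tunnel addresses, not the veth ones):

```sh
# Sketch of a manual two-namespace veth setup (names/addresses assumed).
ip netns add srv
ip netns add cli
ip link add veth-srv type veth peer name veth-cli
ip link set veth-srv netns srv
ip link set veth-cli netns cli
ip -n srv addr add 10.0.0.1/24 dev veth-srv
ip -n cli addr add 10.0.0.2/24 dev veth-cli
ip -n srv link set veth-srv up
ip -n cli link set veth-cli up
```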
OpenVPN configuration:

Without DCO (UDP): 2 Gbit/s.
-----------------------------------------------------------
Server listening on 5201 (test #3)
-----------------------------------------------------------
Accepted connection from 192.168.18.2, port 49638
[ 5] local 192.168.18.1 port 5201 connected to 192.168.18.2 port 49642
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 229 MBytes 1.92 Gbits/sec
[ 5] 1.00-2.00 sec 237 MBytes 1.99 Gbits/sec
[ 5] 2.00-3.00 sec 236 MBytes 1.98 Gbits/sec
[ 5] 3.00-4.00 sec 238 MBytes 2.00 Gbits/sec
[ 5] 4.00-5.00 sec 238 MBytes 2.00 Gbits/sec
[ 5] 5.00-6.00 sec 238 MBytes 2.00 Gbits/sec
[ 5] 6.00-7.00 sec 233 MBytes 1.96 Gbits/sec
[ 5] 7.00-8.00 sec 234 MBytes 1.96 Gbits/sec
[ 5] 8.00-9.00 sec 236 MBytes 1.98 Gbits/sec
[ 5] 9.00-10.00 sec 233 MBytes 1.95 Gbits/sec
[ 5] 10.00-10.00 sec 128 KBytes 1.57 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 2.30 GBytes 1.97 Gbits/sec receiver
100% CPU load of both server and client processes.
With DCO (UDP): only 1 Gbit/s.
-----------------------------------------------------------
Accepted connection from 192.168.18.2, port 46208
[ 5] local 192.168.18.1 port 5201 connected to 192.168.18.2 port 46222
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 133 MBytes 1.12 Gbits/sec
[ 5] 1.00-2.00 sec 148 MBytes 1.24 Gbits/sec
[ 5] 2.00-3.00 sec 130 MBytes 1.09 Gbits/sec
[ 5] 3.00-4.00 sec 137 MBytes 1.15 Gbits/sec
[ 5] 4.00-5.00 sec 144 MBytes 1.21 Gbits/sec
[ 5] 5.00-6.00 sec 132 MBytes 1.10 Gbits/sec
[ 5] 6.00-7.00 sec 141 MBytes 1.18 Gbits/sec
[ 5] 7.00-8.00 sec 140 MBytes 1.18 Gbits/sec
[ 5] 8.00-9.00 sec 128 MBytes 1.07 Gbits/sec
[ 5] 9.00-10.00 sec 116 MBytes 969 Mbits/sec
[ 5] 10.00-10.01 sec 896 KBytes 1.42 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.01 sec 1.32 GBytes 1.13 Gbits/sec receiver
No significant CPU load during the test. There's a lock somewhere.
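If a lock is the bottleneck, a quick look at where cycles go during the transfer can help narrow it down; a generic sketch (perf is not mentioned in the thread):

```sh
# Watch on-CPU kernel/user hot spots live during the iperf run;
# heavy time in spinlock/queueing functions would support the lock theory.
perf top -g
```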
A TCP connection without DCO shows 1.5 Gbit/s; with DCO, 742 Mbit/s.
The DCO module doesn't support TCP MSS modification (mssfix)? The packets are sent with an MSS of 1460.
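One way to confirm the MSS actually on the wire is to capture SYNs on the tunnel interface (tun1 as in the route command below):

```sh
# Print SYN packets verbosely; -v makes tcpdump show TCP options (incl. MSS).
tcpdump -vni tun1 'tcp[tcpflags] & tcp-syn != 0'
```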
Executing `ip route add 192.168.18.2 dev tun1 advmss 1400` in both namespaces gives a slight speed boost, but not much.
[ 5] local 192.168.18.2 port 40108 connected to 192.168.18.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 169 MBytes 1.42 Gbits/sec 0 298 KBytes
[ 5] 1.00-2.00 sec 142 MBytes 1.19 Gbits/sec 0 716 KBytes
[ 5] 2.00-3.00 sec 133 MBytes 1.12 Gbits/sec 0 653 KBytes
[ 5] 3.00-4.00 sec 188 MBytes 1.57 Gbits/sec 0 347 KBytes
[ 5] 4.00-5.00 sec 182 MBytes 1.53 Gbits/sec 0 569 KBytes
[ 5] 5.00-6.00 sec 147 MBytes 1.23 Gbits/sec 0 317 KBytes
[ 5] 6.00-7.00 sec 147 MBytes 1.23 Gbits/sec 0 298 KBytes
[ 5] 7.00-8.00 sec 153 MBytes 1.28 Gbits/sec 0 399 KBytes
[ 5] 8.00-9.00 sec 152 MBytes 1.28 Gbits/sec 0 783 KBytes
[ 5] 9.00-10.00 sec 152 MBytes 1.28 Gbits/sec 0 5.42 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.53 GBytes 1.31 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 1.53 GBytes 1.31 Gbits/sec receiver
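Since mssfix is not applied on the DCO path, clamping the MSS with netfilter is another possible stopgap; a sketch, with the interface name and MSS value as assumptions:

```sh
# Clamp outgoing TCP MSS on the tunnel; OUTPUT covers locally generated
# traffic (as in these namespace tests), FORWARD covers routed clients.
iptables -t mangle -A OUTPUT  -o tun1 -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1360
iptables -t mangle -A FORWARD -o tun1 -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1360
```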
> The DCO module doesn't support TCP MSS modification (mssfix)? The packets are sent with an MSS of 1460.
Correct, mssfix is currently not supported.

Have you tried the same test after directly lowering the MTU of the DCO interface? By the way, please note that this ticket is about DCO v1 vs DCO v2; mssfix was never supported, so this is not a regression.
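For reference, lowering the DCO interface MTU directly would be something like this (using the tun1 name from earlier, run inside each namespace):

```sh
# Lower the MTU of the DCO interface directly.
ip link set dev tun1 mtu 1400
```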
Hi, I had the same issue.
It seems that the problem is that the default MTU in the server config does not play nice when a connection is made using DCO. The problem does not exist with a client that does not use DCO. I am guessing a tun-mtu that is too high leads to fragmentation or other issues which significantly reduce the bandwidth.
I have run a few tests. These were done with Debian bookworm using openvpn-dco-dkms 0.0+git20231103-1 and openvpn 2.6.3-1+deb12u2. I never had DCO V1, so I can't comment on that.
[ 1] local 10.8.0.1 port 5001 connected with 10.8.0.2 port 42354 (icwnd/mss/irtt=13/1348/15058)
[ ID] Interval Transfer Bandwidth
[ 1] 0.0000-20.1121 sec 583 MBytes 243 Mbits/sec # tun-mtu 1400 in server config
[ 2] local 10.8.0.1 port 5001 connected with 10.8.0.2 port 37804 (icwnd/mss/irtt=13/1358/17719)
[ ID] Interval Transfer Bandwidth
[ 2] 0.0000-20.0808 sec 583 MBytes 244 Mbits/sec # tun-mtu 1410
[ 3] local 10.8.0.1 port 5001 connected with 10.8.0.2 port 36144 (icwnd/mss/irtt=13/1368/15711)
[ ID] Interval Transfer Bandwidth
[ 3] 0.0000-20.0941 sec 584 MBytes 244 Mbits/sec # tun-mtu 1420
[ 4] local 10.8.0.1 port 5001 connected with 10.8.0.2 port 45780 (icwnd/mss/irtt=13/1378/18134)
[ ID] Interval Transfer Bandwidth
[ 4] 0.0000-20.2265 sec 157 MBytes 65.1 Mbits/sec # tun-mtu 1430
[ 5] local 10.8.0.1 port 5001 connected with 10.8.0.2 port 39998 (icwnd/mss/irtt=14/1448/17808)
[ ID] Interval Transfer Bandwidth
[ 5] 0.0000-20.3171 sec 158 MBytes 65.3 Mbits/sec # tun-mtu commented out in the server config
[ 6] local 10.8.0.1 port 5001 connected with 10.8.0.2 port 45944 (icwnd/mss/irtt=13/1348/17844)
[ ID] Interval Transfer Bandwidth
[ 6] 0.0000-20.1149 sec 585 MBytes 244 Mbits/sec # tun-mtu 1400 on the server again
Basically the connection is at about 100% bandwidth when DCO is used with a tun-mtu of 1420 or smaller, and pretty much exactly 65 Mbit/s with the default or anything above 1420. Once it is set correctly, the bandwidth does improve over the non-DCO tun, at significantly lower CPU usage.
Maybe put a note in the documentation that you have to set a lower tun-mtu to avoid this problem, since the reference manual clearly says to leave that parameter alone.
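A minimal illustration of that workaround in the server config, using a value from the tests above:

```
# Keep tun-mtu at 1420 or below; with the default (1500) the tests above
# drop to ~65 Mbit/s under DCO.
tun-mtu 1400
```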
You're right about the MTU, but this has always been the case since the beginning of DCO, hence it is not strictly related to this issue.

The issue lies in the mssfix directive not being supported by DCO. It will most likely be implemented in the near future.
> It will most likely be implemented in the near future.
good news :-)
Upgrading our eduVPN node to OpenVPN 2.6.2 with ovpn_dco_v2 leads to significantly lower throughput than running without DCO or with DCOv1. Unfortunately I cannot get the test rig used for the first test running today, and I'm going on vacation, but it is reproducible.
Speeds are downloads from the VPN client, so sending into the tunnel.
All with the same VM:
- 2.6.1 + DCOv1: 1.23 Gbit/s
- 2.6.1 without DCO: 374 Mbit/s

From another client (!), thus not comparable with the numbers above:
- 2.6.2 + DCOv2: 84 Mbit/s
- 2.6.2 without DCO: 450 Mbit/s