In theory this should be doable over kernel loopback, but there is an unexplained effect that leads to increased packet loss once the packet size goes past 6000 or so. Tangentially related: #69
Did some more experiments. I found that setting `max_udp_payload_size` on the `EndpointConfig` as well makes a huge difference on macOS, while having almost no effect on Linux. I assume that is because of the GSO / GRO features on Linux.
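For reference, here is a rough sketch of the two knobs involved. It assumes a quinn 0.10-era API with default features; the exact setter names and signatures have shifted between releases, so treat it as an illustration rather than the definitive incantation:

```rust
use quinn::{EndpointConfig, TransportConfig};

// Sketch only: raise both the endpoint-level and transport-level packet size
// limits so a large-MTU link (jumbo frames, loopback) can actually be used.
fn large_packet_config() -> (EndpointConfig, TransportConfig) {
    let mut endpoint_config = EndpointConfig::default();
    // Accept UDP payloads of up to 9200 bytes instead of the conservative
    // default. In some quinn versions this setter validates the value and
    // returns a Result, so the return value is deliberately ignored here.
    let _ = endpoint_config.max_udp_payload_size(9200);

    let mut transport = TransportConfig::default();
    // Start the path MTU at 9200 as well. Recent releases call this
    // `initial_mtu`; older ones used `initial_max_udp_payload_size`.
    let _ = transport.initial_mtu(9200);

    (endpoint_config, transport)
}
```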
In any case, here are some benchmarks on an M1 Mac with both values adjusted:
9200, best value on OSX:

```
Stream download metrics:
      │ Throughput    │ Duration
──────┼───────────────┼──────────
 AVG  │ 1258.50 MiB/s │ 813.00ms
 P0   │ 1258.00 MiB/s │ 813.00ms
 P10  │ 1259.00 MiB/s │ 813.00ms
 P50  │ 1259.00 MiB/s │ 813.00ms
 P90  │ 1259.00 MiB/s │ 813.00ms
 P100 │ 1259.00 MiB/s │ 813.00ms
```
1200, default:

```
      │ Throughput    │ Duration
──────┼───────────────┼──────────
 AVG  │ 389.12 MiB/s  │ 2.63s
 P0   │ 389.00 MiB/s  │ 2.63s
 P10  │ 389.25 MiB/s  │ 2.63s
 P50  │ 389.25 MiB/s  │ 2.63s
 P90  │ 389.25 MiB/s  │ 2.63s
 P100 │ 389.25 MiB/s  │ 2.63s
```
This is getting to the point where the encryption actually makes a noticeable difference. Here is the throughput with encryption disabled (using quinn-noise and commenting out the encryption and decryption for the packet keys):
```
Stream download metrics:
      │ Throughput    │ Duration
──────┼───────────────┼──────────
 AVG  │ 2287.00 MiB/s │ 447.00ms
 P0   │ 2286.00 MiB/s │ 447.00ms
 P10  │ 2288.00 MiB/s │ 447.00ms
 P50  │ 2288.00 MiB/s │ 447.00ms
 P90  │ 2288.00 MiB/s │ 447.00ms
 P100 │ 2288.00 MiB/s │ 447.00ms
```
Since it seems clear that Quinn itself handles large UDP datagrams well, I'm going to close this as not actionable. Note that overriding the default `max_udp_payload_size` is required if your network supports payloads larger than the typical Ethernet MTU.
I did some benchmarking with a quinn-based RPC framework, https://github.com/n0-computer/quic-rpc.
I found that, unsurprisingly, the packet size has a huge influence on throughput. Here are benchmarks on a Linux box with different values for `initial_max_udp_payload_size`.

The default:

The largest value that worked for me:
I found that with a sufficiently large packet size, the quinn bulk benchmark can outperform TCP at its default settings, but with the default quinn settings it is slower than TCP (roughly 2/3 of the TCP throughput).
So it would be nice to ensure that quinn works well in an environment that allows large frames, e.g. a LAN with jumbo frames enabled, or a loopback device with a large MTU.
I think a good way to do this would be to implement a dummy in-memory AbstractUdpSocket transport that has basically zero overhead, and then make sure quinn works well with large values of the initial MTU, up to 65536, or at least up to the 9000 bytes of jumbo frames.
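As a starting point, here is a minimal sketch of such an in-memory datagram transport: a pair of endpoints exchanging whole datagrams through shared queues, with no syscalls and no MTU limit below the 65535-byte UDP maximum. All names and the overall structure are hypothetical; wiring it into quinn would additionally require implementing quinn's UDP socket abstraction (the `AsyncUdpSocket` trait in recent releases) on top of it, which is not shown here.

```rust
use std::collections::VecDeque;
use std::net::SocketAddr;
use std::sync::{Arc, Mutex};

/// One direction of the link: a queue of complete datagrams.
#[derive(Debug, Default)]
struct Queue {
    datagrams: VecDeque<Vec<u8>>,
}

/// One end of an in-memory "UDP" link. Sending pushes a whole datagram into
/// the peer's queue; receiving pops from our own. There is no fragmentation
/// and no loss, so only quinn's own packetization would be exercised.
#[derive(Debug)]
struct MemSocket {
    addr: SocketAddr,
    peer_addr: SocketAddr,
    incoming: Arc<Mutex<Queue>>,
    outgoing: Arc<Mutex<Queue>>,
}

impl MemSocket {
    /// Create a connected pair of in-memory sockets.
    fn pair(a: SocketAddr, b: SocketAddr) -> (MemSocket, MemSocket) {
        let ab = Arc::new(Mutex::new(Queue::default()));
        let ba = Arc::new(Mutex::new(Queue::default()));
        (
            MemSocket { addr: a, peer_addr: b, incoming: ba.clone(), outgoing: ab.clone() },
            MemSocket { addr: b, peer_addr: a, incoming: ab, outgoing: ba },
        )
    }

    /// "Send" a datagram of up to 65535 bytes to the peer.
    fn send(&self, datagram: &[u8]) {
        assert!(datagram.len() <= 65_535, "exceeds maximum UDP payload");
        self.outgoing.lock().unwrap().datagrams.push_back(datagram.to_vec());
    }

    /// Pop the next datagram, if any, together with the sender's address.
    fn try_recv(&self) -> Option<(SocketAddr, Vec<u8>)> {
        self.incoming
            .lock()
            .unwrap()
            .datagrams
            .pop_front()
            .map(|d| (self.peer_addr, d))
    }
}

fn main() {
    let (client, server) = MemSocket::pair(
        "127.0.0.1:1111".parse().unwrap(),
        "127.0.0.1:2222".parse().unwrap(),
    );
    // A 9000-byte, jumbo-frame-sized datagram passes through unchanged.
    client.send(&vec![0xAB; 9000]);
    let (from, datagram) = server.try_recv().unwrap();
    assert_eq!(from, client.addr);
    assert_eq!(datagram.len(), 9000);
    println!("received {} bytes from {}", datagram.len(), from);
}
```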