mozilla / neqo

Neqo, the Mozilla Firefox implementation of QUIC in Rust
https://firefox-source-docs.mozilla.org/networking/http/http3.html
Apache License 2.0
1.83k stars 123 forks source link

feat: Helpers to get the local interface MTU to a target IP address #2073

Closed larseggert closed 4 weeks ago

larseggert commented 1 month ago

WIP

github-actions[bot] commented 1 month ago

Failed Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

github-actions[bot] commented 1 month ago

Firefox builds for this PR

The following builds are available for testing. Crossed-out builds did not succeed.

larseggert commented 1 month ago

Do we really want to run this sort of function once per connection? Is there a design you can come up with that ensures that this only runs once?

I think we do need to run this once per new path, since the OS routing table might change independently of us, causing the MTU to a destination to change dynamically. (Think of somebody pulling up a VPN with a default route into it and a lower MTU.)

martinthomson commented 1 month ago

I know that poking at the OS a few times per connection/path is not likely to cost that much, but it seems wasteful when in most cases the local interface MTU has no bearing on the end state.

larseggert commented 1 month ago

I mean, we could cache this value and only probe once in a while, but that would mean that if the cached value is too large, we PTO a bunch of times when opening the connection until we determine we should start PMTU from the lowest value. It's a tradeoff.

in most cases the local interface MTU has no bearing on the end state.

Not sure I follow - the local MTU is likely the path MTU in most cases, so starting from it should usually give a performance boost for the handshake?

martinthomson commented 1 month ago

Why would we start with the assumption that the local MTU is the path MTU? I have a 1500 byte MTU on my local link according to the OS, but I've never managed to get that much through. Wouldn't we want to start low and then grow. The local link MTU is likely a stopping point, not a starting point.

larseggert commented 1 month ago

Really, you don't get 1500 end to end (IP packet size)? What uplink do you have?

larseggert commented 1 month ago

TCP starts with the local MTU.

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 95.36%. Comparing base (6f8823b) to head (0a6e0ad). Report is 2 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #2073 +/- ## ======================================= Coverage 95.36% 95.36% ======================================= Files 112 112 Lines 36530 36531 +1 ======================================= + Hits 34836 34837 +1 Misses 1694 1694 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

larseggert commented 1 month ago

So this seems to finally work also on Windows.

I wonder if this should be in it's own crate, or maybe in neqo-udp if that might be intended to be a more generic crate?

After this has landed, I plan to do another PR to use this function to optimistically start with the interface MTU at the beginning of the handshake, and fall back to minimum-sized packets during the handshake followed bu regular PMTUD if our optimism was misguided.

github-actions[bot] commented 1 month ago

Benchmark results

Performance differences relative to 7139f5792e20270cc36ced4af37c11c5d3733bf9.

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [99.016 ns 99.286 ns 99.559 ns]
       change: [-0.1123% +0.3550% +0.8133%] (p = 0.15 > 0.05)

Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe
coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [116.90 ns 117.12 ns 117.38 ns]
       change: [-0.4631% -0.1325% +0.1867%] (p = 0.46 > 0.05)

Found 11 outliers among 100 measurements (11.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
  7 (7.00%) high severe
coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [116.56 ns 117.03 ns 117.67 ns]
       change: [-1.3882% -0.6445% -0.0049%] (p = 0.07 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) low severe
  3 (3.00%) low mild
  1 (1.00%) high mild
  4 (4.00%) high severe
coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [96.894 ns 97.065 ns 97.252 ns]
       change: [-12.785% -4.9390% -0.2058%] (p = 0.22 > 0.05)

Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe
RxStreamOrderer::inbound_frame(): No change in performance detected.
       time:   [111.97 ms 112.09 ms 112.24 ms]
       change: [-0.0599% +0.0967% +0.2526%] (p = 0.26 > 0.05)

Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe
transfer/pacing-false/varying-seeds: No change in performance detected.
       time:   [26.688 ms 27.669 ms 28.672 ms]
       change: [-4.2788% +0.7155% +6.3729%] (p = 0.79 > 0.05)

Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
transfer/pacing-true/varying-seeds: No change in performance detected.
       time:   [34.137 ms 35.804 ms 37.509 ms]
       change: [-9.9526% -4.0715% +2.1525%] (p = 0.23 > 0.05)
transfer/pacing-false/same-seed: No change in performance detected.
       time:   [32.020 ms 32.669 ms 33.307 ms]
       change: [-1.8527% +1.2131% +4.4322%] (p = 0.46 > 0.05)
transfer/pacing-true/same-seed: No change in performance detected.
       time:   [41.338 ms 44.339 ms 47.291 ms]
       change: [-7.5907% +1.6639% +12.527%] (p = 0.76 > 0.05)
1-conn/1-100mb-resp (aka. Download)/client: No change in performance detected.
       time:   [113.16 ms 113.60 ms 114.02 ms]
       thrpt:  [877.06 MiB/s 880.29 MiB/s 883.72 MiB/s]
change:
       time:   [-0.9630% -0.4083% +0.1828%] (p = 0.16 > 0.05)
       thrpt:  [-0.1824% +0.4100% +0.9723%]

Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
1-conn/10_000-parallel-1b-resp (aka. RPS)/client: Change within noise threshold.
       time:   [312.16 ms 315.91 ms 319.64 ms]
       thrpt:  [31.285 Kelem/s 31.655 Kelem/s 32.034 Kelem/s]
change:
       time:   [-3.7931% -2.2433% -0.6918%] (p = 0.01 < 0.05)
       thrpt:  [+0.6966% +2.2948% +3.9427%]
1-conn/1-1b-resp (aka. HPS)/client: :broken_heart: Performance has regressed.
       time:   [34.145 ms 34.366 ms 34.605 ms]
       thrpt:  [28.898  elem/s 29.099  elem/s 29.287  elem/s]
change:
       time:   [+4.7882% +5.7024% +6.5822%] (p = 0.00 < 0.05)
       thrpt:  [-6.1757% -5.3947% -4.5694%]

Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  7 (7.00%) high severe

Client/server transfer results

Transfer of 33554432 bytes over loopback. Client Server CC Pacing Mean [ms] Min [ms] Max [ms] Relative
msquic msquic 101.0 ± 27.6 82.9 194.4 1.00
neqo msquic reno on 223.4 ± 19.8 205.1 274.5 1.00
neqo msquic reno 218.9 ± 25.4 204.6 304.2 1.00
neqo msquic cubic on 230.6 ± 25.7 207.9 296.9 1.00
neqo msquic cubic 220.9 ± 15.3 203.8 250.2 1.00
msquic neqo reno on 90.2 ± 24.2 73.8 175.1 1.00
msquic neqo reno 81.6 ± 9.9 74.5 106.1 1.00
msquic neqo cubic on 90.5 ± 26.4 74.2 183.6 1.00
msquic neqo cubic 84.5 ± 17.2 72.9 154.3 1.00
neqo neqo reno on 184.2 ± 98.9 130.3 422.5 1.00
neqo neqo reno 174.8 ± 66.9 133.0 408.4 1.00
neqo neqo cubic on 199.3 ± 84.6 128.2 398.1 1.00
neqo neqo cubic 156.2 ± 21.9 127.3 196.8 1.00

:arrow_down: Download logs

larseggert commented 1 month ago

I am in favor of making this a standalone crate published to crates.io. Given that it can be useful on its own, I don't think we should merge it with neqo-udp.

Anyone got any ideas for a good crate name?

mxinden commented 1 month ago

I am in favor of making this a standalone crate published to crates.io. Given that it can be useful on its own, I don't think we should merge it with neqo-udp.

Anyone got any ideas for a good crate name?

if-udp, udp-mtu or mtu?

https://crates.io/crates/if-mtu https://crates.io/crates/mtu https://crates.io/crates/udp-mtu

larseggert commented 1 month ago

mtu is shortest :-)

larseggert commented 4 weeks ago

Closing, since this is now in https://github.com/mozilla/mtu