cloudflare / quiche

🥧 Savoury implementation of the QUIC transport protocol and HTTP/3
https://docs.quic.tech/quiche/
BSD 2-Clause "Simplified" License
9.4k stars 709 forks source link

Error handling key updates? #1273

Closed adewes closed 1 year ago

adewes commented 2 years ago

When sending data through a QUIC stream the connection always fails at the 99.999th packet received, giving me "dropped invalid packet" errors. It seems this is due to a key update initiated by the server (based on quic-go). I couldn't figure out how to fix this, is Quiche maybe not handling key updates correctly in this case? I use the following client configuration:

config.set_max_idle_timeout(300000);
config.set_max_recv_udp_payload_size(1350);
config.set_max_send_udp_payload_size(1350);
config.set_initial_max_data(10_000_000);
config.set_initial_max_stream_data_bidi_local(10_000_000);
config.set_initial_max_stream_data_bidi_remote(10_000_000);
config.set_initial_max_streams_bidi(1000);
config.set_initial_max_streams_uni(1000);
config.set_disable_active_migration(false);

Here's the log output leading to the error. After receiving the invalid packets the connection eventually times out. Log entries with name "xxxxxxxxxxxxx" are from my client code.

[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] sending 49 bytes out...
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] written 49
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] received command...
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] sending 16 bytes on stream 20
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 tx pkt Short dcid=22bd8f18 key_phase=false len=24 pn=100005
[2022-07-04T13:03:49Z INFO  xxxxxxxxxxxxx::tests] Echoing 16 TCP bytes...
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 tx frm STREAM id=20 off=147148 len=16 fin=false
[2022-07-04T13:03:49Z INFO  xxxxxxxxxxxxx::tests] Reading data for echo...
[2022-07-04T13:03:49Z TRACE quiche::recovery] 0000000000000000000000000000000000000000 timer=29.058875ms latest_rtt=285µs srtt=Some(1.309864ms) min_rtt=235.416µs rttvar=446.987µs loss_time=[None, None, None] loss_probes=[0, 0, 0] cwnd=13500 ssthresh=18446744073709551615 bytes_in_flight=498 app_limited=true congestion_recovery_start_time=None Rate { delivered: 4420795, delivered_time: Instant { t: 25958121874594 }, first_sent_time: Instant { t: 25958121874594 }, end_of_app_limited: 100004, last_sent_packet: 100005, largest_acked: 99995, rate_sample: RateSample { delivery_rate: 200000, is_app_limited: true, interval: 285µs, delivered: 57, prior_delivered: 4420738, prior_time: Some(Instant { t: 25958121861455 }), send_elapsed: 0ns, ack_elapsed: 285µs, rtt: 285µs } } pacing_rate=12883016 last_packet_scheduled_time=Instant { t: 25958121918456 } hystart=window_end=Some(0) last_round_min_rtt=18446744073709551615.999999999s current_round_min_rtt=18446744073709551615.999999999s css_baseline_min_rtt=18446744073709551615.999999999s rtt_sample_count=0 css_start_time=None css_round_count=0 cubic={ k=0 w_max=0 } 
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] sending 49 bytes out...
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] written 49
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 rx pkt Short dcid=0000000000000000000000000000000000000000 key_phase=false len=48 pn=92715
[2022-07-04T13:03:49Z INFO  xxxxxxxxxxxxx::tests] Echoing 16 TCP bytes...
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 rx frm ACK delay=6 blocks=[99996..99996] ecn_counts=None
[2022-07-04T13:03:49Z INFO  xxxxxxxxxxxxx::tests] Reading data for echo...
[2022-07-04T13:03:49Z TRACE quiche::recovery] 0000000000000000000000000000000000000000 packet newly acked 99996
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 rx frm STREAM id=0 off=147170 len=16 fin=false
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] processed 69 bytes
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] stream 0 is readable...
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] reading from stream 0...
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] received 16 bytes (fin: false)
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] got 69 bytes
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 rx pkt Short dcid=0000000000000000000000000000000000000000 key_phase=false len=48 pn=92716
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] 16 bytes sent back through connection
[2022-07-04T13:03:49Z INFO  xxxxxxxxxxxxx::tests] read 16 bytes...
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] reading from connection....
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 rx frm ACK delay=6 blocks=[99996..99997] ecn_counts=None
[2022-07-04T13:03:49Z TRACE quiche::recovery] 0000000000000000000000000000000000000000 packet newly acked 99997
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 rx frm STREAM id=24 off=147170 len=16 fin=false
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] processed 69 bytes
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] stream 24 is readable...
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] reading from stream 24...
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] received 16 bytes (fin: false)
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] got 69 bytes
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 rx pkt Short dcid=0000000000000000000000000000000000000000 key_phase=false len=48 pn=92717
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] 16 bytes sent back through connection
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] reading from connection....
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 rx frm ACK delay=5 blocks=[99996..99998] ecn_counts=None
[2022-07-04T13:03:49Z TRACE quiche::recovery] 0000000000000000000000000000000000000000 packet newly acked 99998
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 rx frm STREAM id=28 off=147170 len=16 fin=false
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] processed 69 bytes
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] stream 28 is readable...
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] reading from stream 28...
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] received 16 bytes (fin: false)
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] got 69 bytes
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 rx pkt Short dcid=0000000000000000000000000000000000000000 key_phase=true len=48 pn=92718
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] 16 bytes sent back through connection
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] reading from connection....
[2022-07-04T13:03:49Z TRACE quiche] 0000000000000000000000000000000000000000 dropped invalid packet
[2022-07-04T13:03:49Z TRACE xxxxxxxxxxxxx] processed 69 bytes
Lekensteyn commented 2 years ago

Key Update might not be supported, see #1115

adewes commented 2 years ago

Yes I figured that out now @Lekensteyn, seems there's a PR for it. It's a bit of a blocker for me as it causes irrecoverable connection loss once we hit 100.000 packets, so long-lived connections (large downloads) simply cannot work when running against QUIC servers that enforce key updates (e.g. quic-go: https://github.com/lucas-clemente/quic-go/issues/3469).

My solution for now is to patch quic-go to effectively disable key updates, but that's not great of course.

BiagioFesta commented 1 year ago

I had the same issue.

I had another QUIC implementation that needed to communicate with quiche and, because of this bug, I could not have a client-server long-lived connection. So inconvenient.

I've investigated the bug and opened a patch on quiche that implements it, one year ago by now. (Also Note: this is not optional: this should be by a standard implementation).

Patching the "other" implementation to disable key-update is even worse. Key update is there for a (security) reason.

While waiting consideration on my PR, I workarounded my issue by using another cypher-suit for the QUIC my client-server connection (it is in the TLS negotation). TLS 1.3 supports a couple of cypher-suits, and not all of them need AEAD