quinn-rs / quinn

Async-friendly QUIC implementation in Rust
Apache License 2.0
3.79k stars 388 forks source link

ACK Receive Timestamps #1988

Open sterlingdeng opened 1 month ago

sterlingdeng commented 1 month ago

Hi quinn maintainers, this issue contains a proposal and details to implement packet receive timestamps^1.

Motivation

We would like to implement the GCC^2 congestion control algorithm, commonly used for WebRTC, as a controller implementation in quinn. The GCC algorithm is both loss and delay based and require measurements in order to provide an estimated bitrate for the network link. ACK's currently provide enough information to the loss based controller because a loss can be measured by any skips in packet numbers, but the current design of ACK's lacks adequete information for the loss based controller. In the WebRTC domain, GCC is commonly used with TWCC^3 an RTP/RTCP extension that provides feedback on the arrival time and sequence number of RTP packets. The Receive Timestamp draft^1 is a QUIC extension that's analogous to TWCC.

The draft also contains a motivation section. Meta also seems to have their own implementation thats based off of this draft and presented^4 their work to the IETF 118 meeting in Nov of 2023.

Implementation

This section contains the components and their implementation details needed to implement this feature. It would be useful to read through the proposal first to familiarize oneself with how it works.

Update TransportParmeters

TransportParameter Description Code
max_receive_timestamps_per_ack Limits the number of timestamps per ACK frame tbd
receive_timestamp_exponent Trades off precision of time for space. Exponent of 0 indicates microsecond precision. tbd

Question

The transport parameter values are not determined. How should we handle this? An option is to use a sufficiently large value that we believe there shouldn’t be a collision in a reasonable time if the official values are incremented sequentially.

Update AckFrame

Update the AckFrame to include the timestamp information.

pub struct Ack {
    pub largest: u64,
    pub delay: u64,
    pub additional: Bytes,
    pub ecn: Option<EcnCounts>,
   +pub timestamps: Option<Bytes>,
}

Timestamp type chosen to be Bytes because it's handled similiar to the additional field, where an iterator is used to decode those bytes. In the additional case, the iterator returns an acknowledged packet number, in the timestamp case, the iterator returns the packet number and receiver receive time for that packet.

Another option is to create a separate struct dedicated for Ack with timestamps so we wouldn't have to handle the Option on timestamps.

Encode and decode to/from wire format

The structure of the ACK_RECEIVE_TIMESTAMP is below

ACK_RECEIVE_TIMESTAMPS Frame {
  Type (i) = TBD // See comment
  // Fields of the existing ACK (type=0x02) frame:
  Largest Acknowledged (i),
  ACK Delay (i),
  ACK Range Count (i),
  First ACK Range (i),
  ACK Range (..) ...,
  // Additional fields for ACK_RECEIVE_TIMESTAMPS:
  Timestamp Range Count (i),
  Timestamp Ranges (..) ...,
}

Timestamp Range {
  Gap (i),
  Timestamp Delta Count (i),
  Timestamp Delta (i) ...,
}

For encoding, I plan to implement an Ack::encode_timestamps helper function to encapsulate all the timestamp encoding logic in one place. For the decoding design, I plan to implement an AckTimestampDecoder struct that analogous to the AckIter iterator. The AckTimestampDecoder is a struct that implements the iterator trait that allows the caller to get the packet number and timestamp, in decreasing order. The main motivation of this design is to keep this iterator consistent with AckIter.

Below is pseudocode for the proposed behavior.

timestamps: Vec<(packet_number, Instant)> = [(1, 10), (2, 20), (3, 30), (8, 80)]
let buf = bytes::BufMut::new()
Ack::encode_timestamps(&timestamps, &buf)

let decoder = AckTimestampDecoder::new(&buf)

print(decoder.collect());
// High to low because thats how it's encoded in wire format.
// [(8, 80), (3, 30), (2, 20), (1, 10)]

Question Similar to the TransportParameter codes, the ACK_RECEIVE_TIMESTAMP Type is not specified. We would need a placeholder of some value in the interim.

Implement ReceivedTimestamps data structure

ReceivedTimestamps is a Vector of monotonically increasing packet numbers that's used to store the packet number and time of a received packet.

pub struct PacketTimestamp {
    packet_number: u64,
    timestamp: Instant,
}

pub struct ReceivedTimestamps(Vector<PacketTimestamp>);

The draft proposal includes a section that states that out of order packets can be ignored.

Note that reporting receive timestamps for packets received out of order is not supported. Specifically: for any packet number P for which a receive timestamp T is reported, all reports for packet numbers less than P must have timestamps less than or equal to T; and all reports for packet numbers greater than P must have timestamps greater than or equal to T.

If an incoming packet has a lower packet number than the packet number of the last value on the vector, it means that the incoming packet is out of order and the time it was received can be ignored.

Converting between timestamps to/from wire format to std::time objects

The absolute timestamp is not directly encoded into the ACK frame, but rather the delta is encoded and compared to a basis. This design helps optimize for space, reducing the need to use 64-bits to encode the NTP time, or the middle 32-bits of NTP, which is what other protocols do.

For the first timestamp delta of the first timestamp range, the received timestamp can be determined by adding the delta to the timestamp basis, which was negotiated via the transport parameters. Subsequent received timestamps can be computed by taking the difference between the last calculated received timestamp and the delta value.

For the GCC system that leverages the timestamps data, that system is only interested in the delta between the timestamps, rather than the absolute value of the timestamp. This use case seems like the exact problem time::Instant was designed to solve. Therefore, whenever time type is needed for the implementation, the time::Instant type is used. A downside of using time::Instant is that it does not print into a human readable time, but that should be a reasonable trade-off anyways because the timestamp delta is based off of an arbritrary basis anyways.

When a connection is created, a new time::Instant will be created and used effectively as the Instant for the receive_timestamp_basis value. Both will never change for the lifetime of the connection, and all calculations done on them will be based on deltas. For calculations, we will use the time::Instant::duration_since method or time::Instant + time::Duration.

The pseudocode below describes how the conversion from Instant to the time delta wire format.

// Contains the timestamp configuration used on a connection.
// This is expected to be a field on the Connection struct
struct AckTimestampConfig {
    // from transport parameters
    exponent: u64,
    max_timestamps_per_ack: u64,
    // set when the connection is created
    basis: std::time::Instant
}

// Encoding
basis = cfg.Instant // the fixed Instant created when the connection was created.
for pn, recv_time in timestamps.reverse():
    if first:
        delta = recv_time.duration_since(basis).as_micros()
    else:
        basis.duration_since(recv_time).as_micros();
    buf.write_var(delta * (1 << exponent))
    basis = recv_time
    first = false

// Decoding
self.basis = basis_from_transport_params
self.instant_basis = Instant
self.first = true
if self.first:
    self.timestamp = delta * (1 << exponent)
    self.first = false
else:
    self.timestamp -= delta * (1 << exponent)

yield self.instant_basis + Duration::from_micros(self.timestamp)

Include time::Instant when adding received packets to PendingAcks

Within Connection::on_packet_authenticated and space.pending_acks.insert_one we can add the packet to the ReceivedTimestamp object. This would mean that the ReceivedTimestamp would live on the PacketSpace struct.

pub(super) struct PacketSpace {
  +pub(super) received_timestamps: ReceivedTimestamps
}

It makes sense to put the ReceivedTimestamps object on the PacketSpace because ACK’s are grouped by the packet space.

Receiving and handling Ack Frames with timestamps.

When the peer sends an ACK frame with timestamps, we do the following:

Some pseudocode:

pub(super) struct SentPacket {
    /// The time the packet was sent.
    pub(super) time_sent: Instant,

    /// The time the packet was received by the receiver. The time Instant on this field is
    /// relative to a basis negotiated by the two connections. Time arithmetic done using the
    /// time_received field is only useful when compared to other time_received.
   +pub(super) time_received: Option<Instant>,

    // snipped
for packet in newly_acked.elts() {
    if let Some(mut info) = self.spaces[space].take(packet) {
        // snipped
        if let Some(ref mut timestamp) = timestamp_iter {
            while let Some(peeked) = timestamp.peek() {
                match peeked.packet_number.cmp(&packet) {
                    cmp::Ordering::Less => {
                        let _ = timestamp.next();
                    }
                    cmp::Ordering::Equal => {
                        // Unwrap safety is guaranteed because a value was validated
                        // to exist using peek
                        let ts = timestamp.next().unwrap();
                        info.time_received = Some(ts.timestamp);
                    }
                    cmp::Ordering::Greater => {
                        break;
                    }
                }
            }
        }
        self.on_packet_acked(now, packet, info);
    }
}

Investigate current behavior of quinn_proto::congestion::Controller

I believe that the current implementation of the congestion::Controller may contain a bug that would lead to some incorrect calculations, but is likely ignored by the controller implementations. The bug exists on any method on the controller that surfaces any packet number information. Packet numbers can be reused within separate packet number spaces and based on the existing implementation, it's possible for the controller to see the same packet number more than once. A simple example is when packet number 0 of space Handshake is sent, and then packet number 0 of space Data. The on_sent method will be called twice with packet number 0.

What we want is for the congestion control measurements to be unified across the spaces and not segmented by space.

Congestion control and round-trip time (RTT) measurement are unified across packet number spaces.^6

A fix would be to implement a solution that provides the congestion controller a packet number counter that spans all packet spaces.

I validated my expectation by println'ing the packet number and spaces here.

Extend quinn_proto::congestion::Controller to surface timestamp info

The main consideration for adding a new method instead of modifying existing ones is to prevent a breaking a change on the public API.

on_acknowledgement: called for each acknowledgement packet received, similiar to the API for on_ack but includes the received field.

pub trait Controller: Send + Sync {
  fn on_acknowledgement(
    &mut self,
    // A packet number thats unified across all space ids.
    packet_number: u64,
    // The time the packet was sent.
    sent: Instant,
    // The time the receiver received the packet. Instant is based off of a negotiated
    // time basis.
    received: Option<Instant>,
    // The size of the packet.
    bytes: u64,
    app_limited: bool,
    rtt: &RttEstimator,
  )
}

[Feature] or not

The implementation above could be executed without the need of a feature flag. What are your thoughts on that? Should we use the rust [features] to prevent this code from getting compiled if its not enabled?

Interactions with Ack with ECN (0x03)

Based on some local testing I did between 2 quinn endpoints, it appears that all of the ACK frames that are sent between the two are type 0x03, the ACK frame with ECN data. Because the draft proposes the receiver timestamp as a completely separate type that does not encode any ECN data, communicating ECN data and receiver timestamps are mutually exclusive, you can choose one, but not both. In this implementation, if the receiver timestamps feature is enabled, sending an ACK packet with timestamp data will take precedence over sending the ACK frame with ECN data.

Open Questions

Ralith commented 1 week ago

Haven't looked at the PR yet, but initial thoughts from this overview:

Transport parameter and frame IDs should be reserved through the IETF before merging; use whatever you like for private testing. If it's too early to reserve values it's probably too easy to include in a public, stable QUIC implementation, but I understand the bar for reserving a value to be very low. Maybe Meta already has values picked out?

the received timestamp can be determined by adding the delta to the timestamp basis, which was negotiated via the transport parameters

According to the draft you cite, the basis is not a transport parameter, or communicated at all. The extension seems to only communicate inter-packet delays, not absolute time.

A fix would be to implement a solution that provides the congestion controller a packet number counter

This is tricky because packets can be interleaved between spaces, complicating conversion from actual packet numbers (as in an ACK frame) to this counter. We could use a shared packet number space to begin with (never reusing packet numbers between spaces), or just pass the space down to the controller to disambiguate.

Extend quinn_proto::congestion::Controller to surface timestamp info

Because the basis is not communicated, send/receive times are not comparable. This should be represented in the type system with a distinct type for received timestamps.

Should we use the rust [features] to prevent this code from getting compiled if its not enabled?

Cargo features make it harder to ensure test coverage. Unless including a capability is very costly (e.g. in compile time or code size) it should always be compiled.

communicating ECN data and receiver timestamps are mutually exclusive

This seems like a major flaw in the draft. Maybe work with the author/propose a revision to address this gap?

Add a new method on the Controller interface to prevent a breaking change?

A provided method that delegates to the existing one by default would work well. We may want to make a breaking release by the time this is all settled anyway, but we can combine them back down then if desired.