For example, if an application is on a path with a 1500-byte MTU and is sending large numbers of 1200-byte datagrams (e.g. tunneling traffic from a non-MTU-aware protocol), each 1200-byte QUIC packet is padded to 1500 bytes so that it can be sent in a GSO batch with a predictable segment size. That adds up to 25% overhead: 300 bytes of padding per 1200 bytes of packet, or 20% of the bytes on the wire. This is an unreasonably high overhead for a realistic scenario, so we should find ways to reduce the amount of padding used.
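The arithmetic behind that figure can be checked with a small sketch. The function names here are hypothetical and not part of Quinn; they only express the two ways of stating the cost of padding a 1200-byte packet to a 1500-byte segment: relative to the useful bytes, and relative to the bytes actually sent.

```rust
/// Padding relative to the useful (unpadded) packet bytes.
/// Hypothetical helper, for illustration only.
fn padding_overhead(packet_len: usize, segment_size: usize) -> f64 {
    assert!(packet_len > 0 && packet_len <= segment_size);
    (segment_size - packet_len) as f64 / packet_len as f64
}

/// Padding as a share of the bytes that actually go on the wire.
/// Hypothetical helper, for illustration only.
fn padding_share_of_wire(packet_len: usize, segment_size: usize) -> f64 {
    assert!(packet_len > 0 && packet_len <= segment_size);
    (segment_size - packet_len) as f64 / segment_size as f64
}
```

For 1200-byte packets in 1500-byte segments, the first function gives 0.25 and the second gives 0.2, so a quarter more bytes are sent than strictly needed.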
One direction to explore: end the GSO batch whenever continuing it would require an excessive amount of padding.
On its own, this would prevent GSO from operating at all in the above scenario, which isn't ideal. It's not obvious what "excessive" should mean, but it probably doesn't need to be precise.
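A minimal sketch of such a rule follows. It assumes, as with Linux UDP GSO, that only the final segment of a batch may be shorter than the segment size, so a short packet can either be padded (and the batch continued) or left as-is (and the batch ended). The `MAX_PADDING` threshold, the `NextStep` type, and `after_packet` are all hypothetical names chosen for illustration, not Quinn's API, and the value 64 is arbitrary.

```rust
/// Hypothetical policy knob: the most padding, in bytes, tolerated per packet.
/// The value is arbitrary; the text notes it likely needn't be precise.
const MAX_PADDING: usize = 64;

/// What to do after assembling a packet that may be shorter than the segment size.
#[derive(Debug, PartialEq)]
enum NextStep {
    /// Pad the packet up to the segment size and keep filling the batch.
    PadAndContinue { padding: usize },
    /// Leave the packet unpadded as the final segment and end the batch.
    EndBatch,
}

/// Decide whether padding `packet_len` up to `segment_size` is worth it.
fn after_packet(segment_size: usize, packet_len: usize) -> NextStep {
    // Packets larger than the segment size cannot join the batch at all;
    // this sketch assumes the caller has already ruled that out.
    debug_assert!(packet_len <= segment_size);
    let padding = segment_size.saturating_sub(packet_len);
    if padding > MAX_PADDING {
        NextStep::EndBatch
    } else {
        NextStep::PadAndContinue { padding }
    }
}
```

With an MTU-sized segment, `after_packet(1500, 1200)` ends the batch on every packet, which is the degenerate case described above: no padding is wasted, but GSO never gets to batch anything.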
A second, complementary direction: set the GSO segment size to the length of the first packet in the batch, rather than the MTU.
Combined with the above, this allows GSO to optimize flights of packets of any uniform size. Growth in packet size within a batch can be avoided by using the GSO segment size, rather than the MTU, as the limit for individual packet size during packet assembly.
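To see how the two ideas interact, here is a rough simulation that groups a sequence of packet lengths into batches. It is a sketch under assumptions, not Quinn's packet assembly: `Batch`, `plan_batches`, and the `max_padding` parameter are hypothetical, packets are treated as indivisible (as datagrams would be), and a packet that would need too much padding is appended unpadded as the final segment of its batch.

```rust
/// A planned GSO batch: the chosen segment size and the unpadded packet lengths.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
struct Batch {
    segment_size: usize,
    packet_lens: Vec<usize>,
}

/// Group packet lengths into batches whose segment size is the first packet's length.
fn plan_batches(packet_lens: &[usize], max_padding: usize) -> Vec<Batch> {
    let mut batches = Vec::new();
    let mut current: Option<Batch> = None;
    for &len in packet_lens {
        match current.take() {
            // The first packet of a batch fixes the segment size.
            None => current = Some(Batch { segment_size: len, packet_lens: vec![len] }),
            Some(mut batch) => {
                if len > batch.segment_size {
                    // Too large for this batch: close it and start a new one.
                    batches.push(batch);
                    current = Some(Batch { segment_size: len, packet_lens: vec![len] });
                } else if batch.segment_size - len > max_padding {
                    // Too short to pad cheaply: send unpadded as the last segment.
                    batch.packet_lens.push(len);
                    batches.push(batch);
                } else {
                    // Close enough to the segment size: pad and keep batching.
                    batch.packet_lens.push(len);
                    current = Some(batch);
                }
            }
        }
    }
    // Flush whatever batch is still open.
    batches.extend(current);
    batches
}
```

Under this model a flight of uniform 1200-byte packets forms a single batch with a 1200-byte segment size and no padding, while a small packet followed by a larger indivisible one simply yields two batches.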
An unusually small first packet in a flight might lead to wastefully small packets in the remainder of the flight when traffic is heterogeneous. It's not obvious whether this is a meaningful risk in practice. If a small datagram is followed by a datagram too large to fit into the same packet, then the large datagram wouldn't fit within the small segment size either, so the GSO batch would naturally terminate. If a small datagram is followed by stream data, the stream data can be fragmented to fill out the first packet, so the segment size needn't be small to begin with.
https://github.com/quinn-rs/quinn/blob/5f63104b6a1a843d7627a13e491b24228eb0312f/quinn-proto/src/connection/mod.rs#L664-L666