microsoft / net-offloads

Specs for new networking hardware offloads.
MIT License

[QEO] Uniquely identifying connections #60

Open mtfriesen opened 1 year ago

mtfriesen commented 1 year ago
  1. Which fields in NDIS_QUIC_CONNECTION uniquely identify a connection? UdpPort, ConnectionIdLength, and ConnectionId? What about Address?
  2. What assumptions can various layers of the stack make about uniqueness? Can NIC drivers assume there will be no duplicate connections?
mtfriesen commented 1 year ago

For question (2), since these connections are being plumbed via direct OIDs, intermediate components like XDP (or any other LWF) simply cannot reliably synchronize with the state in the rest of the stack.

rawsocket commented 1 year ago

The ConnectionID could be empty, in which case identification relies on the ephemeral client port. If the server socket is not connected, the source and destination L3/L4 information is needed to match the connection on transmit. It is also expected that the physical NIC will not match flows from the L3/L4 information or the connection ID in the packet header, and will rely instead on the crypto key index provided with the data (skbuff in Linux, or NDIS packet in Windows). The method to uniquely identify a connection would therefore differ between transmit and receive. On receive, the NIC will need to match the flow.

nibanks commented 1 year ago

Based on discussion in https://github.com/microsoft/quic-offloads/issues/57, I agree that we can generally eliminate any identification logic for the TX path, because we'd pass the opaque handle down with the packet.
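
To illustrate why the TX path needs no identification logic, here is a minimal sketch in which the key is found purely from an opaque handle carried as out-of-band packet metadata. All names (`TxPacket`, `offload_handle`, `NicKeyTable`, `plumb`) are hypothetical and are not part of the QEO spec or of NDIS.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TxPacket:
    payload: bytes
    # Hypothetical out-of-band metadata set by the stack; None means "no offload".
    offload_handle: Optional[int] = None


class NicKeyTable:
    """Toy model of NIC key storage indexed only by an opaque handle."""

    def __init__(self) -> None:
        self._keys: Dict[int, bytes] = {}
        self._next_handle = 1

    def plumb(self, key: bytes) -> int:
        """Store a key and return the handle the stack attaches to packets."""
        handle = self._next_handle
        self._next_handle += 1
        self._keys[handle] = key
        return handle

    def key_for(self, packet: TxPacket) -> Optional[bytes]:
        """Direct index by handle; no L3/L4 or connection ID parsing is needed."""
        if packet.offload_handle is None:
            return None
        return self._keys.get(packet.offload_handle)


nic = NicKeyTable()
handle = nic.plumb(b"key-a")
offloaded = nic.key_for(TxPacket(b"data", handle))
plain = nic.key_for(TxPacket(b"data"))
```

Nothing in the lookup touches the packet headers, which is the property that lets the TX side sidestep the uniqueness question.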

So the real question is on the RX path, how does the NIC identify a connection? I added some pseudocode in the past here, but there have been other discussions, such as in https://github.com/microsoft/quic-offloads/issues/27, to try to simplify things to reduce the complexity (i.e. requires a two-pass lookup right now).

mtfriesen commented 1 year ago

There are several questions here.

What assumptions can various layers of the stack make about uniqueness? Can NIC drivers assume there will be no duplicate connections?

The rest of the stack needs a clear specification if it is obligated to avoid duplicate entries, else it should be clear that only the NIC is responsible for arbitration.

nibanks commented 1 year ago

Can NIC drivers assume there will be no duplicate connections?

Yes, it should assume no duplicates. Doing a "set" on an existing connection is meant to update it, say to rev the KeyPhase.
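
A minimal sketch of that "set is an update" behavior follows. It assumes, purely for illustration, that the identity of a connection is the pair (UDP port, connection ID); which fields really form the identity is the open question of this issue. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class OffloadedConnection:
    udp_port: int
    connection_id: bytes
    key_phase: int
    key: bytes


class ConnectionTable:
    """Toy driver-side table where 'set' is an upsert, never a duplicate."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[int, bytes], OffloadedConnection] = {}

    def set(self, conn: OffloadedConnection) -> bool:
        """Insert or overwrite; return True if an existing entry was updated."""
        ident = (conn.udp_port, conn.connection_id)  # assumed identity fields
        existed = ident in self._entries
        self._entries[ident] = conn
        return existed

    def get(self, udp_port: int, connection_id: bytes) -> Optional[OffloadedConnection]:
        return self._entries.get((udp_port, connection_id))

    def __len__(self) -> int:
        return len(self._entries)


table = ConnectionTable()
first = table.set(OffloadedConnection(443, b"\x01\x02", key_phase=0, key=b"k0"))
# A second "set" for the same identity revs the KeyPhase instead of adding a row.
second = table.set(OffloadedConnection(443, b"\x01\x02", key_phase=1, key=b"k1"))
```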

The rest of the stack needs a clear specification if it is obligated to avoid duplicate entries, else it should be clear that only the NIC is responsible for arbitration.

Really, the arbitration problem exists at the tuple layer, and really that must already be handled independently of QEO, right? Different apps must use different tuples (or something like CIBIR to differentiate).

mtfriesen commented 1 year ago

That's where things get tricky. We're specifying that we're using direct OIDs, and also that the layers above the NIC will not send duplicate OIDs down the stack. The nature of the NDIS design today precludes synchronization of LWFs with the rest of the stack on the direct OID path. Therefore, one of three things must hold: the LWFs communicate with an OS-wide arbitration component, or LWFs cannot issue this OID, or the NICs must not assume a coherent upper layer.

nibanks commented 1 year ago

will not send duplicate OIDs down the stack

Just to clarify, by "duplicate OIDs" you mean OIDs for the same connection from different parties? If that's the case, then given that we already have designs to leverage the OS port pool to arbitrate tuple usage, I think we should be fine. Is that not the case?
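
As a rough sketch of the idea, tuple arbitration through a shared pool could look like the following. This is a hypothetical model, not the actual Windows port pool API: `PortPoolArbiter`, `reserve`, and `may_plumb` are invented names, and the sketch does not address how an LWF would reach such a component from the direct OID path.

```python
from typing import Dict, Tuple

LocalTuple = Tuple[str, int]  # (local address, local UDP port)


class PortPoolArbiter:
    """Toy OS-wide registry: one owner per local tuple."""

    def __init__(self) -> None:
        self._owners: Dict[LocalTuple, str] = {}

    def reserve(self, local: LocalTuple, owner: str) -> bool:
        """Claim a tuple; succeeds only if it is free or already held by owner."""
        current = self._owners.setdefault(local, owner)
        return current == owner

    def may_plumb(self, local: LocalTuple, owner: str) -> bool:
        """Only the tuple's owner may send offload OIDs for it."""
        return self._owners.get(local) == owner


arbiter = PortPoolArbiter()
tuple_443 = ("192.0.2.1", 443)
stack_ok = arbiter.reserve(tuple_443, "quic-stack")
other_ok = arbiter.reserve(tuple_443, "xdp-app")
```

If every party that plumbs connections checks `may_plumb` first, two parties cannot offload connections on the same tuple, so the NIC never sees duplicates from different sources.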

rawsocket commented 1 year ago

It is common in modern NICs to generate a hash based on L3/L4 headers. What makes QUIC unusual is that the connection ID length is not sent with the packet (short-header packets carry the destination connection ID without a length field). This complicates packet processing: it requires running an extra lookup.

When the connection ID is empty, there can be only one connection with an empty connection ID for a given quadruple of <srcip, srcport, dstip, dstport> to look up. In the other cases, the connection ID between a pair of ports could be anything, and of varying length.
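
The extra lookup can be sketched as a two-pass match. The sketch assumes, for illustration only, that each local (address, port) endpoint uses a single connection ID length, so the first pass finds the length and the second pass matches the ID itself; an empty connection ID falls back to the full quadruple. `RxMatcher` and its methods are hypothetical names. The only protocol fact relied on is the RFC 9000 short-header layout: one flags byte followed directly by the destination connection ID.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip, udp port)


def short_header_dcid(packet: bytes, cid_len: int) -> bytes:
    """Extract the DCID from a short-header packet, given its length."""
    if packet[0] & 0x80:
        raise ValueError("long-header packet")
    if len(packet) < 1 + cid_len:
        raise ValueError("packet too short")
    return packet[1:1 + cid_len]


class RxMatcher:
    def __init__(self) -> None:
        self._cid_len: Dict[Endpoint, int] = {}  # pass 1 table
        self._by_cid: Dict[tuple, int] = {}      # pass 2 table
        self._by_tuple: Dict[tuple, int] = {}    # empty-CID connections

    def add(self, local: Endpoint, remote: Endpoint, cid: bytes, handle: int) -> None:
        known = self._cid_len.setdefault(local, len(cid))
        if known != len(cid):
            raise ValueError("sketch assumes one CID length per local endpoint")
        if cid:
            self._by_cid[local + (cid,)] = handle
        else:
            self._by_tuple[remote + local] = handle

    def match(self, local: Endpoint, remote: Endpoint, packet: bytes) -> Optional[int]:
        cid_len = self._cid_len.get(local)  # pass 1: learn the CID length
        if cid_len is None:
            return None
        if cid_len == 0:
            # Empty CID: at most one connection per quadruple.
            return self._by_tuple.get(remote + local)
        cid = short_header_dcid(packet, cid_len)
        return self._by_cid.get(local + (cid,))  # pass 2: match the CID


matcher = RxMatcher()
server = ("192.0.2.1", 443)
client = ("198.51.100.7", 50000)
matcher.add(server, client, bytes.fromhex("a1b2c3d4"), handle=7)
packet = bytes([0x40]) + bytes.fromhex("a1b2c3d4") + b"\x00" * 20
matched = matcher.match(server, client, packet)
```

If connection IDs of several lengths share one endpoint, the first table would have to return a set of candidate lengths and the second pass would repeat per length, which is where the extra table memory and lookup cost come from.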

There is a limitation by vendors to make the connection ID either empty or fixed in value. While this makes things simpler on the Rx pipeline and eliminates the extra lookup and the memory for extra tables, it limits what can be done on the service side.

Currently, only Broadcom is known to support QUIC in future generation NICs. The interface between kernel and user space has been a subject of collaboration for some time and is generally aligned with no-lookup Tx and flow-match Rx. The connection ID, however, remains subject to further consultation between the industry and the vendor.

nibanks commented 1 year ago

Another option is to burn a few bits in the connection ID to dedicate to CID length encoding. I'm not a big fan of that approach, but we should at least consider all options.
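
For concreteness, here is one way such an encoding could look. The layout is an assumption made only for this sketch: the low 5 bits of the first connection ID byte hold the total connection ID length (enough for the 20-byte maximum), and the remaining 3 bits stay free. `encode_cid` and `cid_from_short_header` are hypothetical helpers.

```python
MAX_CID_LEN = 20
LEN_BITS = 5
LEN_MASK = (1 << LEN_BITS) - 1  # 0x1f


def encode_cid(body: bytes, high_bits: int = 0) -> bytes:
    """Build a CID whose first byte self-describes the total CID length."""
    total = 1 + len(body)  # the length byte counts as part of the CID
    if total > MAX_CID_LEN:
        raise ValueError("CID too long")
    if not 0 <= high_bits < (1 << (8 - LEN_BITS)):
        raise ValueError("high_bits must fit in 3 bits")
    return bytes([(high_bits << LEN_BITS) | total]) + body


def cid_from_short_header(packet: bytes) -> bytes:
    """Single-pass extraction: read the length from the CID's own first byte."""
    if len(packet) < 2 or packet[0] & 0x80:
        raise ValueError("not a short-header packet with a CID")
    length = packet[1] & LEN_MASK
    # Length 0 cannot occur here: an empty CID has no first byte to encode into.
    if length == 0 or length > MAX_CID_LEN or len(packet) < 1 + length:
        raise ValueError("bad encoded CID length")
    return packet[1:1 + length]


cid = encode_cid(b"\xaa\xbb\xcc", high_bits=5)
packet = bytes([0x40]) + cid + b"payload"
recovered = cid_from_short_header(packet)
```

The receiver no longer needs a table to learn the length, at the cost of five bits that can no longer carry routing information or entropy.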

rawsocket commented 1 year ago

Large datacenter-based use cases of QUIC use the full 20-byte CID to store routing information and other what-nots, which is entirely allowed by the standard, as these bytes are neither monotonic nor assigned to anything special. If the approach taken is to map to a fixed length, it should be the maximum. Even so, limiting it is something we should really try to avoid, if possible.