Expose remote_cid() and local_cid() on quinn::Connection

billylindeman commented 7 months ago

This adds some functions to expose the local/remote CID on a connection handle. This allows a server to extract CIDs that were generated by a quinn client with a custom ConnectionIdGenerator

Ralith commented 7 months ago

I'm hesitant to expose these, because they aren't generally useful to well-behaved applications. CIDs are transient and routinely change mid-connection, and especially a remote CID (i.e. the one used for outgoing packets) is not intended to be intelligible to the local endpoint, much less the local application.

djc commented 7 months ago

These are actually the handshake CIDs, so at least they don't change, right?

Ralith commented 7 months ago

In a literal sense, yeah, but they also don't have a predictable relationship to the CIDs of post-handshake packets, which makes their usefulness limited.

djc commented 7 months ago

@billylindeman so can you reiterate your use case/rationale for this? What problem does this end up solving that can't (easily) be solved another way, as suggested by @Ralith on Discord?

you should be storing data in/interrogating the local connection ID, because that is the only one that your load balancing infra logically controls, and the only one attached to every incoming packet. The idea is that your load balancer decides where to route the connection, then encodes the CID and passes it to the peer to use on future packets.

We don't expose CIDs from the connection because they're not expected to be useful at that layer. If you want application layer state, you can just store that in memory without having to snuggle it through the CID. CID data is for when you need to communicate with your own stateless packet-layer infra. in short, if you have a Quinn Connection object, you're far past the point where CIDs solve a problem

billylindeman commented 7 months ago

In an application that chooses to encode some information into the connection id (such as using a quinn_proto::ConnectionIdGenerator), there is no corresponding way to retrieve that information in a server using quinn. I think it's odd to allow custom id generation in the client, but not allow exposing the ID in the server. My use case involves encoding correlation data for multiple quic connections in the client, so that they end up at the same server, and then in my server using that data to correlate the inbound connections to the same session.

Ralith commented 7 months ago

there is no corresponding way to retrieve that information in a server using quinn

Normally, CIDs are opaque except to the party that generated them, so accessing CIDs from the application layer doesn't make sense: either you're the party that generated a CID, so you already have that information, or you aren't, and you can't get any information from it. The purpose of a CID in QUIC is to communicate with stateless packet-layer infrastructure under the control of the same party that generated the CID: server CIDs might be read by a server's load balancers, and client CIDs might be read by a client's NAT.

My use case involves encoding correlation data for multiple quic connections in the client, so that they end up at the same server, and then in my server using that data to correlate the inbound connections to the same session.

This gives the client direct control over the server's load balancing infrastructure, and fails to include routing information in packets on established connections. This is unusual due to a number of drawbacks, e.g.:

A malicious or buggy client population could easily bypass load balancing and overwhelm a targeted server
A man-in-the-middle might be able to spoof your correlation data, interfering with another user's session
It's much harder to dynamically adjust load balancing to account for actual load
Load balancing must be stateful, which costs more resources and complexity and is more error-prone

Are those compromises suitable for your application?

You can avoid these problems by encoding your correlation data in the server-generated connection ID instead. As we discussed offline, there's a solution that avoids these hazards: establish a single connection at first, then issue address validation tokens from the server for use in the connections that should be correlated. These tokens can encrypt the correlation data, which can then be decrypted when the connection attempt is received, and re-encrypted into the resulting connection ID. This would require some small extensions to Quinn to expose address validation tokens, and introduce a small amount of latency on establishment of secondary connections the first time a client connects.

If the above drawbacks are more palatable than a slight increase to connection establishment latency, then your original concept could be most gracefully implemented by passing information from the client to the server in the initial destination CID. You could still allow for stateless load-balancing by re-encoding this information into the server CIDs that are issued in response. I'm open to supporting this case (we'll want to build on #1752 to do it right, but as a quick hack we could pass the IDCID into the generator explicitly) if you're sure those are the tradeoffs you want.

Ralith commented 5 months ago

Closing for lack of activity, and because more graceful alternatives have been presented here and elsewhere. Feel free to open a new issue or PR to discuss if the best path forward still seems unclear.

quinn-rs / quinn

Expose remote_cid() and local_cid() on quinn::Connection #1755