quicwg / base-drafts

Internet-Drafts that make up the base QUIC specification
https://quicwg.org

Spin bit should be applied per each 5-tuple rather than per connection #1828

Closed kazuho closed 5 years ago

kazuho commented 6 years ago

Section 2.2 of the spin-bit draft states that "each endpoint, client and server, maintains a spin value, 0 or 1, for each QUIC connection."

However, this does not work when both endpoints try to coalesce multiple QUIC connections onto a single UDP port.

In such case, multiple connections will be sharing the same 5-tuple.

Since an observer cannot track each connection [1], the spin bit needs to be applied per 5-tuple so that it is a useful signal to the observer (who does not have the ability to track each connection across CID changes).

[1] Although the initial pair of CIDs is observable, it becomes impossible to track the client's CID and the server's CID that maps to the same connection once the CID changes for the connection.
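For reference, the per-connection rule quoted from Section 2.2 can be sketched roughly as follows. This is a minimal illustration, not text from the draft: the `SpinState` class and its field names are hypothetical, and it assumes the usual behaviour where the server reflects the received spin value, the client inverts it, and only packets that advance the largest received packet number update the state.

```python
from dataclasses import dataclass


@dataclass
class SpinState:
    """Per-connection spin state for one endpoint (hypothetical helper)."""
    is_client: bool
    spin: int = 0          # value to put in the next outgoing short-header packet
    largest_pn: int = -1   # largest packet number received so far

    def on_receive(self, pn: int, spin_bit: int) -> None:
        # Only packets that advance the largest packet number update the
        # state, so reordered packets cannot create spurious edges.
        if pn <= self.largest_pn:
            return
        self.largest_pn = pn
        # The server reflects what it saw; the client inverts it.
        self.spin = (spin_bit ^ 1) if self.is_client else spin_bit


client = SpinState(is_client=True)
server = SpinState(is_client=False)

# One round trip: client sends 0, server reflects 0, client flips to 1.
server.on_receive(pn=0, spin_bit=client.spin)
client.on_receive(pn=0, spin_bit=server.spin)
print(client.spin, server.spin)  # prints "1 0"
```

With one such object per connection, two connections sharing a 5-tuple flip independently, which is the situation this issue is about.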

ianswett commented 6 years ago

Agreed, this should definitely be per-path.

larseggert commented 6 years ago

"Per-path" would be based on the IP address pair, which is not the same though

mikkelfj commented 6 years ago

Different processes can connect on the same IP pair.

kazuho commented 6 years ago

> Different processes can connect on the same IP pair.

Or different machines when a NAT or an L4 load balancer is present.

Anyways, I assume that we agree that the identifier needs to be the 5-tuple (setting aside what “path” means).

mikkelfj commented 6 years ago

Actually, two bridges could route traffic between two cloud private networks over a fixed 5-tuple. I'm not sure it is safe to mix any information across connections.

kazuho commented 6 years ago

> Actually two bridges could route traffic between two cloud private networks over a fixed 5-tuple. I'm not sure it is safe to mix any information across connections.

It is impractical (if not impossible) for an uncoordinated middlebox to coalesce multiple QUIC connections onto a single 5-tuple, even when both endpoints use non-zero-length CIDs.

This is because, as I have stated, it is impossible for such a middlebox to track a connection across CID changes. Consider the case where a NAT is coalescing multiple QUIC connections coming from more than one client machine. When a CID change is initiated for one of the connections by a server, the NAT cannot determine to which client it should forward the packet that contains the new CID.

Therefore, my view is that it is safe for an observer running on the Internet to consider multiple QUIC connections using one 5-tuple to be a communication between two machines, and that it is safe to expect the Spin bit state to be shared among such connections.

FWIW, my expectation is that it would not be uncommon to see multiple QUIC connections coalesced onto one 5-tuple (at least that will happen when H2O/quicly is run as an edge server), and that spin bit design should take such deployments into consideration.

mikkelfj commented 6 years ago

You are making assumptions about the bridges not understanding the information encoded in the CIDs.

mikkelfj commented 6 years ago

Also, in the bridge scenario, it is not likely that the CID will change.

kazuho commented 6 years ago

If the bridge can decrypt the information encoded in the CID, it would mean that the bridge is not an "uncoordinated" middlebox.

What I am pointing out is that it is impossible for an uncoordinated middlebox to coalesce QUIC connections. Such middleboxes (e.g., NATs) will be far more common than coordinated ones.

It is also my understanding that most endpoints running on the Internet will support CID change.

Considering these aspects, my argument is to design spin bit for the cases where such uncoordinated middleboxes exist and CIDs change mid-connection, rather than for the cases where you would have coordinated ones.

mikkelfj commented 6 years ago

As long as it is clear that those assumptions are present.

martinthomson commented 6 years ago

Parking based on the resolution of #631

britram commented 6 years ago

unparking together with #631

britram commented 6 years ago

Agree with @kazuho, this should be per 5-tuple for the multiplexing case. However, it's not clear what the best way is to reconcile that with our desire to try to reduce CID linkability on migration by making the spin bit reset state (and indeed, even reset spin-participation state) on CID change.

kazuho commented 6 years ago

Based on the offline discussion at Bangkok, my understanding is that we have two possibilities; either suggest that:

The latter gives the observers better chance of seeing the signal, at the cost of being required to determine the CID that actually spins.

Assuming that we would have enough connections that are spinning, it might also be safe to assume that many connections will not be coalesced onto a single 5-tuple. If that is the case, the former would be sufficient.

(FWIW, I am not sure what the linkability properties would be)

martinthomson commented 6 years ago

You could spin them all together. Detecting an edge across all connections based on an increase in largest received packet number in each should work.

Keep in mind that a single connection can use multiple connection IDs on the same path, switching between them arbitrarily. That could appear to the path to be multiple connections sharing a path. Spinning multiple connections together would appear to be no different.
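One possible reading of "spin them all together" is sketched below. It is only an illustration under stated assumptions: `SharedSpin` and the connection keys are hypothetical, one such object would exist per 5-tuple, and the thread does not define how packets from different connections are ordered against each other. Here the most recent packet that advances its own connection's largest packet number simply wins.

```python
from typing import Dict, Hashable


class SharedSpin:
    """Spin state shared by every connection on one 5-tuple (hypothetical)."""

    def __init__(self, is_client: bool) -> None:
        self.is_client = is_client
        self.spin = 0
        # Largest packet number received so far, tracked per connection,
        # because each connection has its own packet number space.
        self.largest_pn: Dict[Hashable, int] = {}

    def on_receive(self, conn: Hashable, pn: int, spin_bit: int) -> None:
        # Ignore packets that do not advance this connection's largest pn.
        if pn <= self.largest_pn.get(conn, -1):
            return
        self.largest_pn[conn] = pn
        # Server reflects, client inverts, exactly as in the per-connection case.
        self.spin = (spin_bit ^ 1) if self.is_client else spin_bit


server = SharedSpin(is_client=False)
server.on_receive("conn-a", pn=5, spin_bit=1)  # advances conn-a, latch 1
server.on_receive("conn-b", pn=0, spin_bit=0)  # advances conn-b, latch 0
server.on_receive("conn-a", pn=3, spin_bit=1)  # stale for conn-a, ignored
print(server.spin)  # prints "0"
```

Every connection on the 5-tuple would then send `server.spin` in its outgoing packets, which requires the connections to share this object.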

erickinnear commented 6 years ago

> That could appear to the path to be multiple connections sharing a path. Spinning multiple connections together would appear to be no different.

This seems to make sense to me -- do we spin per CID or per 5-tuple? That seems to be the main question in my mind.

ianswett commented 6 years ago

I'm not sure per 5-tuple is actually implementable in most environments. Servers in particular would have a hard time coordinating the right value.

And the algorithm of latching the value to reflect to the largest packet number's value is undefined if there are multiple packet number spaces.

So I think we need to do per CID?

martinthomson commented 6 years ago

I agree that per-path is not feasibly implemented. We'd have to acknowledge the possibility that multiple connections that share a path are not required to coordinate.

mikkelfj commented 6 years ago

Yes definitely - two connections should NOT be required to coordinate.

EDIT: yet again I missed a "not"

britram commented 6 years ago

I tend to agree with @ianswett here -- while different CIDs may often represent connections between the same processes (and therefore with similar e2e delays at a given instant in time), this is not guaranteed to be the case. Is per-CID spin a useful passive measurement signal in the general case, though?

Stepping back, there seem to be two broad usage patterns with partially conflicting requirements here:

Per-5-tuple spin is clearly more useful to the path -- otherwise, on-path devices have to use heuristics to guess the CID (note, this is not very hard if the device has a few packets, but it needs to keep a header buffer up to max CID length until it locks on, and it can use heuristics about well-known servers/CDNs/configurations to make an initial guess that will be correct almost all of the time -- in any case, any on-path measurement device offering RTT information will also probably offer flow information). But it seems impossible to do for those cases where we care about reducing spin state contribution to rebinding linkability.

mikkelfj commented 6 years ago

please note my above edit - connections should not need to coordinate.

@britram:

on your first point: if the client is potentially mobile and actively moving to a new IP, it MUST have a non-zero CID length. A stationary client can of course still rebind via NAT but then zero-length is valid.

on your second point - per CID spin-bit multiplexing: it may be acceptable for a single connection, but not over multiple connections multiplexing because the entity that produces packets may be running independently in different logical processes or even on isolated hardware in extreme cases.

I agree on your last point that linkability is unrealistic to avoid in practice. The consensus thinking is, however, that it should be attempted to the extent possible. While I disagree, this appears to be the working assumption.

I think spin bit per CID is as easy as 5-tuple because you need a hashed entry in either case to track state. A spin bit per connection should be impossible if anti-linkage is working. Of course with respect to reporting any operational issues, a 5-tuple is more useful than a CID, but the hashed entry can include path information.

The main problem with spin bit visibility is a specific end-user standing out as the only one not contributing with spin bits.

erickinnear commented 6 years ago

I think I agree with @mikkelfj that we shouldn’t require connections to coordinate. Given that CID is the identifier that we use/is most granular, it does make spinning per CID seem attractive. Although I am sensitive to @britram’s question about the signal that this generates...

britram commented 6 years ago

@erickinnear I spent a little time thinking about how an observer generating RTT samples from a per-CID spin would work, and while it’s slightly messier than using 5-tuples as a flow key, it’s not at all hard. So I don’t think there are any concerns about signal quality.

I’ll put together a PR for language about this in the management draft when I get a moment.

kazuho commented 6 years ago

I agree that from endpoint's perspective spinning per connection is the easiest thing to do.

From the observer's perspective, it becomes hard to track which client CID is associated with which server CID when multiple CIDs rotate simultaneously, but all you lose in that case is the capability to observe the distance between the observation point and the endpoints. The RTT of the connection would still be observable.
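To make the observer side concrete, here is a rough sketch of a one-direction observer that keys its state on (5-tuple, destination CID) and reports the time between consecutive spin edges as an RTT sample. `SpinObserver`, the tuple layout, and the millisecond timestamps are all assumptions made for illustration. It also assumes the observer already knows the CID length, and it does nothing about reordering or loss.

```python
from typing import Dict, List, Optional, Tuple

FlowKey = Tuple[tuple, bytes]  # (directional 5-tuple, destination CID)


class SpinObserver:
    """One-direction spin-bit observer keyed per CID (hypothetical sketch)."""

    def __init__(self) -> None:
        # flow key -> (last spin value seen, time of last edge in ms or None)
        self._last: Dict[FlowKey, Tuple[int, Optional[int]]] = {}
        self.samples: Dict[FlowKey, List[int]] = {}

    def on_packet(self, five_tuple: tuple, dcid: bytes,
                  spin_bit: int, now_ms: int) -> Optional[int]:
        key = (five_tuple, dcid)
        prev = self._last.get(key)
        if prev is None:
            # First packet of this flow: remember the value, no edge yet.
            self._last[key] = (spin_bit, None)
            return None
        last_spin, last_edge = prev
        if spin_bit == last_spin:
            return None  # no transition, nothing to measure
        self._last[key] = (spin_bit, now_ms)
        if last_edge is None:
            return None  # first edge only starts the clock
        rtt = now_ms - last_edge
        self.samples.setdefault(key, []).append(rtt)
        return rtt


obs = SpinObserver()
flow = ("192.0.2.1", 50000, "198.51.100.2", 443, "udp")
obs.on_packet(flow, b"\x01", 0, 0)
obs.on_packet(flow, b"\x01", 1, 50)         # first edge
obs.on_packet(flow, b"\x02", 0, 60)         # other CID, tracked separately
rtt = obs.on_packet(flow, b"\x01", 0, 100)  # second edge
print(rtt)  # prints "50"
```

Because the state is per CID, a CID change restarts measurement for that flow key, and the observer loses nothing beyond the samples in flight at the time of the change.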

martinthomson commented 6 years ago

I don't think that we need fancy here. Simply state that - ideally - spinning is scoped to a 5-tuple. You might then observe that if connections share the same path, then they might not be able to coordinate spinning.

Aside from it being logistically difficult to coordinate across connections, coordination isn't possible in all cases. With diverse endpoints, there could be NATs/load-balancers/middleboxes that know how to multiplex and demultiplex based on connection ID at both ends. That makes @britram's thinking useful (...minimum publishable unit?). Of course, if spinning could be coordinated, that would make the signal cleaner.

Note that multiple connections on the same path are hard to distinguish from the same connection with alternating connection IDs, unless the end-to-end latency differs. (Note to self: look into gaming this.)

mirjak commented 5 years ago

Is the assumption of the current discussion that the connection ID is present/non-zero? I guess if no connection ID were present, a per-CID spin signal would look quite random from the network, no?

martinthomson commented 5 years ago

Closed by #1982.