Potential Attacks Against PCID

martinduke commented 6 years ago

Consider this issue a forum to discuss potential attacks against the PCID algorithm.

The routing bits are secret, which is what protects the server mapping. To crack this, the attacker's first task is to discover the routing bits.

The two possible ways to obtain attack information are to (1) passively observe all (or a portion) of the CIDs coming out of a load balancer, or (2) open a connection and get many CIDs from one server, ostensibly for migration purposes.

There are basically two defenses:

Because the true server ID ("modulus") is combined with an integer multiple of the divisor, there is no readily obvious signature for a single server (e.g., bits 4, 5, and 7 always being '1' for a given server).
Given the minimum CID length of 8 bytes, there are roughly 10^18 combinations of routing bits. This is a challenge to attackers trying to create statistics about each possible set of routing bits.

martinduke commented 6 years ago

Kazuho raises the point that the total set of server ID codepoints is unlikely to produce a precisely uniform distribution of bit settings.

For example, if there are three routing bits with server IDs 000 and 010, and a divisor of 3, the codepoints are: 000, 010, 110, 011, 101. If the non-routing bits are completely random, after collecting a large number of connection IDs we'll see that the leftmost routing bit is 1 40% of the time, the second bit 60%, and the third 40%, while all the other bits are at about 50%. In this toy example, the routing bits will be clear after relatively few observed CIDs.
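To make the toy example easy to check, here is a minimal sketch that enumerates the codepoints and the share of 1s in each routing bit. It assumes every codepoint is used equally often, which the thread does not state, and the function names are made up for illustration.

```python
from fractions import Fraction

def codepoints(server_ids, divisor, n_bits):
    """All values modulus + k*divisor that fit in n_bits, over all server IDs."""
    return sorted(
        value
        for modulus in server_ids
        for value in range(modulus, 1 << n_bits, divisor)
    )

def bit_frequencies(values, n_bits):
    """Share of values with each bit set, leftmost (most significant) bit first."""
    return [
        Fraction(sum((v >> shift) & 1 for v in values), len(values))
        for shift in range(n_bits - 1, -1, -1)
    ]

# Toy example from the comment above: server IDs 000 and 010, divisor 3.
points = codepoints(server_ids=[0b000, 0b010], divisor=3, n_bits=3)
freqs = bit_frequencies(points, n_bits=3)

print([format(p, "03b") for p in points])
print([str(f) for f in freqs])
```

Under that equal-use assumption the three routing bits come out at 2/5, 3/5, and 2/5, matching the 40% / 60% / 40% figures.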

Some mitigating circumstances:

If all the codepoints XOR together to 111, the statistics are exactly 50%.
With lots of codepoints, and following the recommendation to spread the modulus across the entire number space, the statistics should be close to 50% though not exact. It would take quite a few CIDs to infer the routing mask with any certainty, though we should run the numbers on that.
While the non-routing bits SHOULD "appear to be random", if they are being used to subtly encode something else, their probabilities may not be precisely equal. In this case, attacker analysis of this type is hopeless.

All that said, at the very least there should be some additional text about these problems. Better yet, there should be some new language providing recommendations. At worst, we conclude that this is unworkable and abandon PCID.

martinduke commented 6 years ago

Also, even a single server entering or leaving the pool will cause the statistics to change.
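As a rough illustration using the same toy parameters (divisor 3, three routing bits), the sketch below compares the per-bit share of 1s before and after a hypothetical server with ID 001 joins the pool. It assumes all codepoints are used equally often; `bit_ones_share` is an illustrative helper, not anything from the draft.

```python
def bit_ones_share(server_ids, divisor=3, n_bits=3):
    """Share of codepoints with each bit set, leftmost bit first."""
    values = [v for m in server_ids for v in range(m, 1 << n_bits, divisor)]
    return [
        sum((v >> shift) & 1 for v in values) / len(values)
        for shift in range(n_bits - 1, -1, -1)
    ]

before = bit_ones_share([0b000, 0b010])         # original toy pool
after = bit_ones_share([0b000, 0b001, 0b010])   # hypothetical server 001 joins

print("before:", before)
print("after: ", after)
```

In this particular case the third server fills in every remaining codepoint, so the pool-wide skew disappears entirely. Other pool changes would shift the statistics in other directions.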

martinduke commented 6 years ago

Upon further discussion with Kazuho, making the routing bits XOR out to all 1s won't even work. Even if you could do it, which you probably can't, evening out the probabilities across all CIDs passing through the load balancer almost certainly does not balance them for each individual server. So the attacker would just open a connection, request a bunch of new connection IDs, and look at the statistics.
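The per-server view can be sketched with the earlier toy pool (server IDs 000 and 010, divisor 3, three routing bits). This assumes each server picks uniformly among its own codepoints, which is an assumption for illustration only.

```python
from fractions import Fraction

def per_server_bit_stats(server_ids, divisor, n_bits):
    """For each server ID, the share of its codepoints with each bit set (leftmost first)."""
    stats = {}
    for modulus in server_ids:
        values = list(range(modulus, 1 << n_bits, divisor))
        stats[modulus] = [
            Fraction(sum((v >> shift) & 1 for v in values), len(values))
            for shift in range(n_bits - 1, -1, -1)
        ]
    return stats

stats = per_server_bit_stats([0b000, 0b010], divisor=3, n_bits=3)
for modulus, freqs in stats.items():
    print(format(modulus, "03b"), [str(f) for f in freqs])
```

Server 000 owns 000, 011, and 110, so a client connected to it would see the routing bits at 1/3, 2/3, and 1/3. Server 010 owns only 010 and 101, which happen to XOR to 111, so its bits sit at exactly 1/2. Balancing the pool as a whole does not give every server that property.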

A much more profitable approach is to introduce a little weighting in the non-routing bits, which totally frustrates this kind of attack.
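Here is a small simulation of what such weighting might look like. Everything in it is hypothetical: the 16-bit CID length, the secret routing positions, the choice of 40% or 60% weights, and the seed. It only looks at per-bit frequencies, so it says nothing about attacks that examine correlations between bits.

```python
import random

N_BITS = 16
ROUTING_POSITIONS = [2, 7, 11]                     # hypothetical secret positions, 0 = leftmost
CODEPOINTS = [0b000, 0b010, 0b011, 0b101, 0b110]   # toy pool from earlier in the thread

def make_cid(rng, weights):
    """Draw weighted filler bits, then overwrite the routing positions with a codepoint."""
    bits = [1 if rng.random() < w else 0 for w in weights]
    codepoint = rng.choice(CODEPOINTS)
    for i, pos in enumerate(ROUTING_POSITIONS):
        bits[pos] = (codepoint >> (2 - i)) & 1
    return bits

rng = random.Random(1)
# Each bit gets a slight bias; the weights at routing positions end up unused.
weights = [rng.choice([0.4, 0.6]) for _ in range(N_BITS)]
samples = [make_cid(rng, weights) for _ in range(20000)]

# Observed share of 1s per bit position, and the positions that look "off" from 50%.
observed = [sum(column) / len(samples) for column in zip(*samples)]
suspicious = [i for i, f in enumerate(observed) if abs(f - 0.5) > 0.05]
print(suspicious)
```

With every bit skewed by about the same amount, a simple "which bits are not at 50%" test flags all positions, so it no longer singles out the routing bits.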

dtikhonov commented 6 years ago

We have discussed the observation angle previously on the mailing list.

In the absence of a demonstrable weakness, we should run the numbers and have our findings documented and available online. The question is: how to run the numbers.
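One possible starting point, offered as a back-of-the-envelope sketch rather than a real analysis: use the normal approximation to a binomial to estimate how many CIDs an observer needs before a single bit with P(1) = p sits z standard errors away from 50%. The function name and the choice of z = 3 are assumptions, and the estimate ignores that the attacker is testing many bit positions at once.

```python
import math

def samples_needed(p, z=3.0):
    """Approximate number of CIDs for a bit with P(1)=p to differ from 0.5 by z standard errors."""
    delta = abs(p - 0.5)
    if delta == 0:
        raise ValueError("a perfectly balanced bit is never distinguishable")
    # Standard error of the observed share is sqrt(p*(1-p)/n); solve z*SE = delta for n.
    return math.ceil(z * z * p * (1 - p) / (delta * delta))

for p in (0.4, 0.45, 0.49, 0.499):
    print(p, samples_needed(p))
```

The required sample count grows with the inverse square of the bias, so a pool whose routing bits sit within a fraction of a percent of 50% would take far more observations than the toy example.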

martinduke commented 6 years ago

While the consensus in Bangkok was that no one was super enthusiastic about PCID (though @dtikhonov said he would implement it), everyone was comfortable with sending it to the WG and letting them ultimately decide.

martinduke / draft-duke-quic-load-balancers

Potential Attacks Against PCID #20