Increasing K creates a privacy risk when timing correlations are present

bemasc commented 12 months ago

K-Check is designed to improve in ~privacy~ resistance to active attack as K (or the threshold t) increases. However, if a timing correlation is present (e.g., the client fetches an OHTTP config immediately before using it), increasing K ~actually reduces privacy~ increases vulnerability to a passive attack on the user's privacy, because any single colluding mirror can reveal the client IP to the gateway. Adding more mirrors increases the likelihood that one of them is malicious.

If timing attacks are present in the use case, I think the optimal value of K is likely to be 1.

In some use cases, timing correlations are present for some requests (e.g., the first request issued when the configuration is not locally in cache) but not others. In these cases, it might make sense to perform a single check (K=1) initially, and then perform more checks asynchronously (according to some randomized schedule) to catch if the initial mirror was colluding and served a targeted resource.
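A minimal sketch of what such a staged schedule could look like, assuming a client that already knows its list of mirrors. The function names, the exponential spacing, and the one-hour mean delay are hypothetical choices for illustration; none of them come from the draft.

```python
import random


def followup_check_delays(num_followups, mean_delay_s, rng):
    """Return non-decreasing offsets (seconds after the initial check).

    Gaps are drawn from an exponential distribution (an assumption made
    here) so follow-up checks are not tied to the first request's timing.
    """
    offsets = []
    t = 0.0
    for _ in range(num_followups):
        t += rng.expovariate(1.0 / mean_delay_s)
        offsets.append(t)
    return offsets


def plan_checks(mirrors, mean_delay_s=3600.0, rng=None):
    """Pick one mirror for the immediate K=1 check and defer the rest.

    Returns (initial_mirror, [(mirror, delay_s), ...]).
    """
    rng = rng or random.Random()
    order = list(mirrors)
    rng.shuffle(order)  # avoid always asking the same mirror first
    initial, rest = order[0], order[1:]
    delays = followup_check_delays(len(rest), mean_delay_s, rng)
    return initial, list(zip(rest, delays))


if __name__ == "__main__":
    first, later = plan_checks(["mirror-a", "mirror-b", "mirror-c"])
    print("check now:", first)
    for mirror, delay in later:
        print(f"check {mirror} in {delay:.0f}s")
```

Only the first mirror sees a request that lines up with the client's first use of the config; the deferred checks exist to detect a targeted config after the fact.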

These issues can be avoided by tunneling K-Check through a trusted proxy, but if a trusted proxy exists then it can run the Mirror Protocol itself and K > 1 is unnecessary (see #16).

chris-wood commented 11 months ago

> K-Check is designed to improve in privacy as K (or the threshold t) increases

I don't think this is a goal of K-Check. Increasing K is meant to increase confidence in consistency of the answer -- it has nothing to do with privacy. Does the draft say otherwise? If so, we should probably fix that.

bemasc commented 11 months ago

I've adjusted the issue description to be more precise, and avoid assuming that privacy is the only reason to pursue consistency.

chris-wood commented 11 months ago

I'm not sure I understand this newly phrased issue. The "attack" is still described as one on privacy ("increasing K ~actually reduces privacy~ increases vulnerability to a passive attack on the user's privacy"). I appreciate thinking about the draft, but it would be more helpful if this was something concrete.

bemasc commented 11 months ago

OK, I'll try to make it concrete. A hypothetical example:

I am the moderator of a subreddit. I'm trying to maintain my privacy, so my login handle is not identifying. Once every day or two, I publish a comment in the subreddit using OHTTP through a trustworthy Relay.

A hostile actor would like to know my IP address. Reddit doesn't know, and the Relay won't share. However, Reddit's gateway config expires after 60 minutes, so almost every time I post an update, my client app needs to refresh its consistency check first. That requires pinging K different mirrors, asking for Reddit's gateway config. Let's say K=10.

It turns out that one of those 10 mirrors is compromised by the hostile actor. By correlating requests for the gateway config with my post timestamps, it can identify the IP address that always asked for this resource immediately before my posts appeared. That's my IP address.
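A rough sketch of that correlation step, assuming the compromised mirror has a log of `(ip, timestamp)` pairs for config requests and the attacker has the public post timestamps. The function name, the log format, and the two-minute window are hypothetical.

```python
from collections import Counter


def rank_ips_by_correlation(config_requests, post_times, window_s=120.0):
    """Count, per IP, how many posts were preceded by a config request.

    config_requests: list of (ip, timestamp_s) seen by the mirror.
    post_times: list of timestamp_s at which posts appeared.
    An IP scores a hit for a post if it asked for the config at most
    window_s seconds before that post.
    """
    hits = Counter()
    for post_t in post_times:
        matched = set()
        for ip, req_t in config_requests:
            if 0.0 <= post_t - req_t <= window_s:
                matched.add(ip)
        hits.update(matched)  # at most one hit per IP per post
    return hits.most_common()


if __name__ == "__main__":
    # Documentation-range addresses, made-up timestamps.
    requests = [
        ("198.51.100.7", 95.0),
        ("203.0.113.9", 10.0),
        ("198.51.100.7", 990.0),
        ("198.51.100.7", 1980.0),
    ]
    posts = [100.0, 1000.0, 2000.0]
    print(rank_ips_by_correlation(requests, posts, window_s=60.0))
```

The IP that matches nearly every post stands out quickly, which is why a config lifetime shorter than the gap between posts makes the correlation so clean.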

The risk that at least one mirror is compromised grows with K, roughly in proportion to K when the per-mirror risk is small. Similar risks apply to correlation attacks using passive network monitoring, which become more likely as your requests traverse more paths.
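To put rough numbers on that scaling, here is a small sketch under a simplifying assumption that is not in the draft: each mirror is compromised independently with the same probability `p`. The chance that at least one of the K mirrors is compromised is then `1 - (1 - p)**K`, which is close to `K * p` while `K * p` is small.

```python
def prob_any_compromised(k, p):
    """Probability that at least one of k mirrors is compromised.

    Assumes (hypothetically) independent mirrors, each compromised
    with the same probability p.
    """
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    p = 0.01  # illustrative per-mirror compromise probability
    for k in (1, 2, 5, 10):
        exact = prob_any_compromised(k, p)
        print(f"K={k:2d}  exact={exact:.4f}  linear approx={k * p:.4f}")
```

With `p = 0.01`, K=10 gives about 0.096, just under the linear estimate of 0.10.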

chris-wood commented 11 months ago

OK, so this is an attack on privacy 👍 We can certainly note that the probability of such a thing increases as K increases.

bemasc commented 8 months ago

This is resolved by #19.

ietf-wg-privacypass / draft-ietf-privacypass-consistency-mirror

Increasing K creates a privacy risk when timing correlations are present #14