ShivanKaul / star-spec

IETF Internet Draft for STAR
https://shivankaul.com/star-spec/draft-dss-star.html
Other
3 stars 1 forks source link

Consider batching all reports at the proxy #43

Open bemasc opened 1 year ago

bemasc commented 1 year ago

Reusing OHTTP for STAR is convenient, but it creates two unnecessary weaknesses:

  1. A malicious aggregator who can monitor the relay's network traffic can likely identify the source IP of individual reports by timing.
  2. A malicious aggregator who wants to violate k-anonymity N times can stop their dictionary attack once N has been reached.

If the encrypted reports for each epoch were batched (and shuffled) at the proxy, then network traffic analysis would not reveal anything, and dictionary attacks would have to proceed without any access to the reports, presumably making them less efficient and therefore easier to detect.

chris-wood commented 1 year ago

I think this has been discussed already, but I can't remember where. In any case, there are a couple of questions that come to mind:

  1. How does the proxy actually batch and release reports? If it batches by looking at report tags, then we've just moved the dictionary attack problem from the aggregation server to the proxy. If it batches by looking at encrypted blobs (which would be the case with unreliable OHTTP or whatever), then we're compromising the ability to verify STAR reports in a streaming fashion, making STAR closer to batch-based protocols such as Poplar.
  2. The fundamental assumption about OHTTP is that the gateway (the aggregation server here) cannot observe the relay's ingress link, so I'm struggling to see why this assumption would not carry over to its use for STAR.
  3. Since the dictionary attack only requires one query per target, it's not clear to me that batching helps improve detection, but maybe I'm misunderstanding?
bemasc commented 1 year ago

I think this has been discussed already, but I can't remember where. In any case, there are a couple of questions that come to mind:

  1. How does the proxy actually batch and release reports? If it batches by looking at report tags, then we've just moved the dictionary attack problem from the aggregation server to the proxy. If it batches by looking at encrypted blobs (which would be the case with unreliable OHTTP or whatever), then we're compromising the ability to verify STAR reports in a streaming fashion, making STAR closer to batch-based protocols such as Poplar.

I mean the latter: each OPRF epoch is one big batch.

  1. The fundamental assumption about OHTTP is that the gateway (the aggregation server here) cannot observe the relay's ingress link, so I'm struggling to see why this assumption would not carry over to its use for STAR.

The OHTTP threat model is much weaker than, say, the DAP non-collusion assumption. This issue is about strengthening STAR against certain attackers.

Another variation on this problem is related to aggregators who also have timing information about events that cause clients to generate reports.

  1. Since the dictionary attack only requires one query per target, it's not clear to me that batching helps improve detection, but maybe I'm misunderstanding?

Currently, the attacker can attempt to run a dictionary attack against a specific report. The attacker can stop their attack once it succeeds. (This attack is especially relevant if the dictionary attacker can also overcome OHTTP to link the report to a specific user.)

If the reports are batched by epoch, the attacker can't target individual reports, and doesn't know whether the attack worked at all until the end of the epoch. Attackers might compensate by trying more values, but this is more suspicious to the Randomness Server and more likely to be detected.

chris-wood commented 1 year ago

The OHTTP threat model is much weaker than, say, the DAP non-collusion assumption. This issue is about strengthening STAR against certain attackers.

I don't agree with this. Both OHTTP and DAP require non-collusion for privacy.

Another variation on this problem is related to aggregators who also have timing information about events that cause clients to generate reports.

That's true without batching, but I'm not seeing why this timing information is helpful. Can you elaborate?

Currently, the attacker can attempt to run a dictionary attack against a specific report. The attacker can stop their attack once it succeeds. (This attack is especially relevant if the dictionary attacker can also overcome OHTTP to link the report to a specific user.)

If the reports are batched by epoch, the attacker can't target individual reports, and doesn't know whether the attack worked at all until the end of the epoch. Attackers might compensate by trying more values, but this is more suspicious to the Randomness Server and more likely to be detected.

I'm really not following this. The way a dictionary attack works is as follows:

  1. Attacker queries randomness server using some input and computes the corresponding tag.
  2. Attacker checks tag against reports it sees.

Whether or not the attacker does (2) as reports or streamed in or at the end of an epoch, say, doesn't really seem like a useful distinction. Can you say why you think this distinction matters?

bemasc commented 1 year ago

The OHTTP threat model is much weaker than, say, the DAP non-collusion assumption. This issue is about strengthening STAR against certain attackers.

I don't agree with this. Both OHTTP and DAP require non-collusion for privacy.

OHTTP additionally requires that the target/gateway can't learn the network traffic near the relay. Otherwise it can likely deanonymize requests by timing correlation.

("Unreliable OHTTP" would enable a mitigation of this problem.)

Another variation on this problem is related to aggregators who also have timing information about events that cause clients to generate reports.

That's true without batching, but I'm not seeing why this timing information is helpful. Can you elaborate?

If the aggregator knows when client A sends its reports (e.g. it can trigger the client to send a report), then it can tell which reports are (likely) from client A by timing. This is the same problem as the OHTTP timing attack, but using some application-level event instead of a network tap near the OHTTP ingress.

I'm really not following this. The way a dictionary attack works is as follows:

  1. Attacker queries randomness server using some input and computes the corresponding tag.
  2. Attacker checks tag against reports it sees.

Whether or not the attacker does (2) as reports or streamed in or at the end of an epoch, say, doesn't really seem like a useful distinction. Can you say why you think this distinction matters?

This is basically an argument about attack economics related to risk of discovery.

Consider the following attack:

  1. The collected value is an element drawn uniformly at random from a well-known set of size N.
  2. The attacker uses network interference to prevent all users except one (the victim) from reaching the OHTTP Relay.
  3. The attacker runs an online dictionary attack against the victim's tag until it finds the colliding value, requiring N/2 attempts on average.

In a batch model, this dictionary attack requires N attempts every time to guarantee success. The increased number of attempts makes the attack easier to detect.

chris-wood commented 1 year ago

Thanks for clarifying. Taking a step back, I think it might be helpful if we write down the threat model you have in mind here, as well as what you view the ideal security properties to be with respect to that threat model. Is that something you could take a stab at?

bemasc commented 1 year ago

In short, I am saying that STAR currently does not achieve its privacy goals in the presence of a malicious aggregator who is a Dolev-Yao attacker, or even a Pervasive Passive Attacker. I believe it could tolerate this (i.e. adding these attacks to the threat model) with the change I described.

EDIT: I'm not saying this change is necessarily worthwhile: it increases the cost of operating the proxy, adds more bespoke components, and delays access to the output. I'm just saying it should be considered.

chris-wood commented 1 year ago

In short, I am saying that STAR currently does not achieve its privacy goals in the presence of a malicious aggregator who is a Dolev-Yao attacker

This is making a lot of assumptions about traffic analysis and the like, right?

or even a Pervasive Passive Attacker.

What is a pervasive passive attacker?

bemasc commented 1 year ago

In short, I am saying that STAR currently does not achieve its privacy goals in the presence of a malicious aggregator who is a Dolev-Yao attacker

This is making a lot of assumptions about traffic analysis and the like, right?

Not really. A Dolev-Yao attacker (active network attacker) has enormous flexibility to do things like drop or delay non-target traffic to ensure that the target's traffic is easy to spot by timing behavior.

or even a Pervasive Passive Attacker.

What is a pervasive passive attacker?

I mean what RFC 7258 calls "Pervasive Monitoring": an attacker who can observe (but not modify) network traffic at many locations.

chris-wood commented 1 year ago

Not really. A Dolev-Yao attacker (active network attacker) has enormous flexibility to do things like drop or delay non-target traffic to ensure that the target's traffic is easy to spot by timing behavior.

I see your point now. As an example, you're saying that because the DY attacker can modify with timing, then it can let the aggregation server link encrypted reports to distinct clients and, when decryption succeeds, the aggregation server learns which client sent which value. Thanks for clarifying 👍 I would agree with that assessment.