Insecure timestamps allow replay attacks across epochs

csharrison commented 2 years ago

In the current doc, timestamps for each event are provided by the report collector with no client-side attestation. This allows a report collector to replay old events by modifying the timestamp accordingly. This attack doesn't necessarily break the desired differential privacy protection, since we'd still bound the impact of any one event per epoch, but it allows some "worst case" style attacks where one sensitive event could get queried again and again every epoch.

It might be the case that we accept this kind of leakage, but I think it's worth discussing whether we'd want to restrict, in some way, events from the client from participating in an unbounded number of epochs. Note that doing this might break legitimate use-cases ("lookback windows" for attribution that are longer than an epoch), so we should proceed carefully.

benjaminsavage commented 2 years ago

Thanks for filing this issue @csharrison!

We have been discussing this issue, and our "IPA end-to-end" document is trailing a bit behind our current thinking.

A few thoughts:

- I agree that continuing to allow a report collector to re-query the same source <=> trigger pair over and over again each epoch is not desirable, and IPA should not allow this.
- The current document does not involve any kind of attestation for either the timestamp or the epoch. This seems like something we might need to change. At a minimum, the user-agent could attest to the epoch.
- In terms of legitimate use-cases related to "lookback windows", here's how I'm currently thinking about it:
  1. For "Trigger Fanout Queries", that is where an advertiser has some conversion events (trigger events) and they want to understand what ads drove them (which source events), I think we can potentially support long lookback windows without enabling this "worst case" style situation. For starters, let's imagine that the "report collector" specifies an "epoch" argument when making an IPA query. This argument would let the helpers know which epoch's privacy budget this query is meant to consume. Furthermore, assuming we have attestation for the "epoch", we could simply ask the helpers to validate that all of the trigger events come from the epoch that was specified in the query. It would be OK to allow "source events" from preceding epochs, within some range that would be limited by how often the helpers rotate their encryption keys (TBD). I know advertisers in the "auto" vertical like 90 day lookback windows, so that would be the aim. Let's see if we can support that.
  2. For "Source Fanout Queries", that is where an app or website that displays ads wants to compute some kind of "calibration" or "experimentation" use-case, counting attributed conversions across all advertisers, I have less clarity. Perhaps you have a suggestion! Perhaps we could just do the inverse, that is, validate that all of the "source events" come from the epoch specified in the query. This would require the Report Collector to "reserve" some privacy budget for a few weeks after the end of an epoch if they wanted to do a long lookback window query, then run a query to use up that reserved budget after they've received all of the potentially matching trigger events.
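The two validation rules described above could be sketched roughly as follows. This is only an illustration, not anything from the IPA end-to-end doc: the `Event` type, the `validate_query` function, the mode names, and the default of 13 epochs (which assumes weekly epochs and a roughly 90 day window) are all hypothetical, and the sketch assumes each event carries an epoch already attested by the user agent.

```python
from dataclasses import dataclass
from typing import Iterable, Literal


@dataclass(frozen=True)
class Event:
    kind: Literal["source", "trigger"]
    epoch: int  # assumed to be attested by the user agent (hypothetical field)


def validate_query(
    events: Iterable[Event],
    query_epoch: int,
    mode: Literal["trigger_fanout", "source_fanout"],
    max_lookback_epochs: int = 13,  # assumption: weekly epochs, ~90 days
) -> bool:
    """Return True if every event is admissible for a query against query_epoch."""
    primary = "trigger" if mode == "trigger_fanout" else "source"
    for event in events:
        if event.kind == primary:
            # Primary events must come from exactly the epoch being charged.
            if event.epoch != query_epoch:
                return False
        elif mode == "trigger_fanout":
            # Source events may come from the query epoch or a bounded
            # number of preceding epochs.
            if not (query_epoch - max_lookback_epochs <= event.epoch <= query_epoch):
                return False
        else:
            # Source fanout is the inverse: trigger events may arrive in the
            # query epoch or a bounded number of following epochs.
            if not (query_epoch <= event.epoch <= query_epoch + max_lookback_epochs):
                return False
    return True
```

In this sketch a trigger fanout query for epoch 10 accepts a source event from epoch 3 but rejects any trigger event from epoch 9, so a replayed trigger cannot draw on a fresh epoch's budget.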

csharrison commented 2 years ago

> In terms of legitimate use-cases related to "lookback windows" - here's how I'm currently thinking about it:
>
> For "Trigger Fanout Queries", that is where an advertiser has some conversion events (trigger events) and they want to understand what ads drove them (which source events) - I think we can potentially support long-lookback windows without enabling this "worst case" style situation. For starters, let's imagine that the "report collector" specifies an "epoch" argument when making an IPA query. This argument would let the helpers know which epoch's privacy budget this query is meant to consume. Furthermore, assuming we have attestation for the "epoch", we could simply ask the helpers to validate that all of the trigger events come from the epoch that was specified in the query. It would be OK to allow "source events" from preceding epochs (within some range - that would be limited by how often the helpers rotate their encryption keys, TBD. I know in the "auto" vertical advertisers like 90 day lookback windows - so that would be the aim - let's see if we can support that)

This sounds pretty good to me. This could allow limited replay of source events (associated with different triggers across different epochs), but if we bound it reasonably it might be OK (90 days might be pushing it, but I think we can hash that out later). I am a bit curious what the query model will actually look like for campaigns with long lookback windows though, since the system will never tell you explicitly when a source event has been "counted" and should not be considered for triggers in future epochs. If we're not careful, this design might actually encourage replay and double counting of sources, even from legit parties. While I would like to recommend that they wait 90 days to issue a massive query with 90 days of data, that is probably unrealistic :)

> For "Source Fanout Queries", that is where an app or website that displays ads wants to compute some kind of "calibration" or "experimentation" use-case, counting attributed conversions across all advertisers, I have less clarity. Perhaps you have a suggestion! Perhaps we could just do the inverse - that is, validate that all of the "source events" come from the epoch specified in the query. This would require the Report Collector to "reserve" some privacy budget for a few weeks post the end of an epoch if they wanted to try to do a long lookback window query, then run a query to use that reserve budget up after they've received all of the potentially matching trigger events.

This seems reasonable. I think this use-case typically doesn't require super long lookback windows, although I'm not an expert. Part of me would like to see the mechanism unified across the different query types for simplicity though.

Speaking of lookback windows, I am actually not sure how lookback windows work in IPA besides for the selection of input events :) I'll file another issue for this.

benjaminsavage commented 2 years ago

> Speaking of lookback windows, I am actually not sure how lookback windows work in IPA besides for the selection of input events :) I'll file another issue for this.

I've just filed another issue on this topic: https://github.com/patcg-individual-drafts/ipa/issues/16

martinthomson commented 2 years ago

My thinking all along was that we would bind the encrypted match key to the epoch in which it was generated so that it can't be used outside of that. And then follow the process @benjaminsavage described. These each have their own limitations, but those seem manageable.

eriktaubeneck commented 2 years ago

We should absolutely bind the epoch to the match key when provided by the user agent. I'll open a PR to update the end-to-end doc to that effect.

Exactly how that epoch is used in the query semantics seems to still be an open question. I generally agree with @csharrison that it's ideal if the API is unified, and there aren't different semantics for the different query types.

In the current proposal, source and trigger fanout queries are already tied to the source and trigger site/apps (respectively) for the purpose of budgeting. For any given query, let's call the match keys used for budgeting the "primary match keys". If I'm understanding the above correctly, it seems reasonable that we only need to limit the epoch of the "primary match keys" (while still keeping some global expiry for all match keys).
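As a rough illustration of what "binding" could mean here, the sketch below attaches an authentication tag that covers both the encrypted match key and the epoch, so changing the claimed epoch invalidates the tag. This is an assumption-laden stand-in, not the mechanism in the end-to-end doc: a real design would more plausibly carry the epoch as authenticated associated data in the encryption towards the helpers, and the shared HMAC key, `bind_epoch`, and `verify_epoch` are hypothetical names used only so the example runs with the Python standard library.

```python
import hashlib
import hmac


def bind_epoch(key: bytes, encrypted_match_key: bytes, epoch: int) -> bytes:
    """Compute a tag over (epoch, encrypted match key). Hypothetical helper."""
    # A fixed-width epoch prefix keeps the encoding unambiguous.
    message = epoch.to_bytes(8, "big") + encrypted_match_key
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_epoch(
    key: bytes, encrypted_match_key: bytes, claimed_epoch: int, tag: bytes
) -> bool:
    """Check that the match key was bound to claimed_epoch."""
    expected = bind_epoch(key, encrypted_match_key, claimed_epoch)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, tag)
```

With a tag produced for epoch 7, verification succeeds only when the report collector presents epoch 7; presenting the same match key under epoch 8 fails, which is the property that blocks replay across epochs.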

This opens a few questions (which seem to be policy questions, not technical questions):

1. Can you run a query with "primary match keys" bound to an epoch in the past? We almost certainly need support for at least the immediately previous epoch, but should we support more than that, through the process of reserving some budget?
   a. If so, what should the maximum be? (90 days / ~13 weeks is suggested above.)
2. How many epochs before the non-primary match keys expire?
3. Should we support "primary match keys" bound to multiple epochs? (Which would require deducting budget across each of those included epochs.)

csharrison commented 2 years ago

If the epoch is bound to the match key, I think we should err on the side of more flexibility for the primary match key, since we know we can do the correct budget enforcement.

1. Yes, this seems like a totally legit use-case, especially combined with (3). One example here would be reserving budget for backtesting of new features / slices that are invented after an epoch has ended.
2. We should probably circle back around on this one when we are ironing out privacy parameters, since they will impact a decision here.
3. For this one, you mean should we support a query where there exist "primary match keys" across multiple epochs? I think this is an important use-case for small players that might only get statistically significant data if they gather batches across epochs.

martinthomson commented 2 years ago

I agree with Charlie. The only necessary constraint here is the number of epochs the helpers agree to track budget for. So maybe you don't get to go back 3 years, but a month or 3 or 6 might be fine.
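To make the budgeting constraint concrete, here is a minimal sketch of a helper-side ledger that tracks budget for a bounded number of epochs and charges every epoch touched by a query's primary match keys. The class name, the all-or-nothing behaviour, and the choice to charge each epoch the full query cost are assumptions made for illustration; none of them come from the IPA proposal.

```python
from typing import Dict, Iterable, Tuple


class EpochBudgetLedger:
    """Hypothetical per-site, per-epoch privacy budget tracker."""

    def __init__(self, per_epoch_budget: float, tracked_epochs: int) -> None:
        self.per_epoch_budget = per_epoch_budget
        self.tracked_epochs = tracked_epochs  # how far back helpers keep state
        self.spent: Dict[Tuple[str, int], float] = {}

    def charge(
        self, site: str, epochs: Iterable[int], cost: float, current_epoch: int
    ) -> bool:
        """Deduct cost from every listed epoch, or from none if any check fails."""
        wanted = set(epochs)
        for epoch in wanted:
            # Reject future epochs and epochs older than the tracking window.
            if epoch > current_epoch or current_epoch - epoch >= self.tracked_epochs:
                return False
            if self.spent.get((site, epoch), 0.0) + cost > self.per_epoch_budget:
                return False
        for epoch in wanted:
            self.spent[(site, epoch)] = self.spent.get((site, epoch), 0.0) + cost
        return True
```

Under these assumptions, a site that leaves part of an old epoch's budget unspent can still run a backtesting query against it later, as long as that epoch is within the window the helpers agreed to track.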

Insecure timestamps allow replay attacks across epochs #11