Per-impression privacy budgeting vs. explicit conversion limits

csharrison commented 1 year ago

The current design for PAM requires the impression and conversion to “agree” on how many conversions an impression can contribute to. This works OK in the steady state, but during “strategy” migrations it can be fragile. For example, imagine I am running a campaign where I set the conversion limit to 3, but I realize halfway through that the noise is just too high so I want to reduce this to 2. So, I change the impression-side registrations to use 2 instead of 3, but at this point we still have a bunch of pending impressions with 3 that are not compatible with “2” conversions, so it isn’t clear how to complete this migration on the conversion side.

The ARA approach solves this by giving each impression a fixed “budget” that it can “spend” somewhat arbitrarily at conversion time; if an impression doesn’t have enough budget left, it makes no contribution. This degrades gracefully during migrations. In the example above, we would just begin consuming 1/2 of the budget per conversion from each impression, rather than 1/3.
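A minimal sketch of the budget idea above, assuming a normalized budget of 1 per impression. The `Impression` class and `try_spend` method are hypothetical names for illustration only, not ARA's actual API surface.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Impression:
    # Assumption: each impression starts with a normalized budget of 1.
    remaining: Fraction = Fraction(1)

    def try_spend(self, cost: Fraction) -> bool:
        """Deduct `cost` if the impression can afford it; otherwise contribute nothing."""
        if cost > self.remaining:
            return False
        self.remaining -= cost
        return True


# An impression that already contributed once under the old "1/3 per conversion" strategy.
imp = Impression()
imp.try_spend(Fraction(1, 3))

# After the migration, each conversion charges 1/2 of the budget instead.
first = imp.try_spend(Fraction(1, 2))   # 2/3 remaining, so this succeeds
second = imp.try_spend(Fraction(1, 2))  # only 1/6 remaining, so no contribution
```

In this sketch the pending impression is not stranded by the strategy change: it contributes while it can afford the new cost and drops out once it cannot, with no impression-side change needed.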

This approach of maintaining an abstract "contribution budget" can allow for other flexibility, so stay tuned for a follow-up on this.

winstrom commented 1 year ago

I'd love to see this written out. I think our system is trying to accomplish something similar, but perhaps it's been articulated in a convoluted way. We did want to enable variable budgets per impression rather than a fixed privacy budget, but perhaps that's not necessary in practice.

csharrison commented 1 year ago

OK I can try to write out something. Is my understanding correct though that at impression time you have to say "use this ad in X conversions" and at conversion time you have to say "consider only ads that can be used Y times" and then when you do attribution you only consider matches when X = Y? This was my reading of the proposal.

winstrom commented 1 year ago

I see - the idea was that at impression time you would say "use this ad in X conversions" and at conversion time you would say "this conversion adds enough noise in the aggregate for ads that specify they can be used Y times". Then it's considering the matches with X <= Y. This is trying to let the ad impression specify the amount it can contribute to a particular conversion and letting the advertiser decide on where to send these pieces.

However, imagine that you've determined that you really need more information and you pushed out a bunch of ads that can be used in 2 conversions and you realized your noise was too low. Nothing would stop you from running two conversion reports on each device which would put the entire signal into that aggregation -- and the noise would be unchanged so you would double the signal to noise ratio.
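To spell out the arithmetic under simple assumptions (not taken from the proposal text): suppose each conversion report contributes signal $s$ to the aggregate, and the noise added to that aggregate has a standard deviation $\sigma$ that does not depend on the number of reports $k$ run per device. Then:

$$
\mathrm{SNR}_k = \frac{k \cdot s}{\sigma} = k \cdot \mathrm{SNR}_1, \qquad \text{so} \qquad \mathrm{SNR}_2 = 2 \cdot \mathrm{SNR}_1 .
$$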

winstrom commented 1 year ago

But reading your initial comment, I think it may be a clearer definition than what we've written down.

csharrison commented 1 year ago

Thanks Luke yeah I was missing the inequality. In this case:

- The migration to decide on higher # of conversions per impression is handled, since older impressions which specify X <= Y will still go through.
- The migration to decide on a lower # of conversions per impression is handled by the following sequence:
  - Update the impression logic to migrate X to X - 1
  - Wait until all old impressions are expired. During this intermediate period the new impressions get more effective noise.
  - Update conversion logic to migrate Y to Y'
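The downward migration can be sketched as follows, assuming the matching rule from this thread (an impression with limit X matches a conversion with limit Y when X <= Y). The function names and the sample limits are hypothetical, chosen only to illustrate the 3-to-2 example.

```python
def eligible(x: int, y: int) -> bool:
    """Matching rule as described in this thread: impression limit X must not exceed conversion limit Y."""
    return x <= y


def matched(impression_limits: list[int], y: int) -> list[int]:
    """Return the impression limits that a conversion registered with limit `y` would consider."""
    return [x for x in impression_limits if eligible(x, y)]


# Pending impressions partway through the migration:
# two old ones registered with X = 3, two new ones with X = 2.
pending = [3, 3, 2, 2]

# Conversion side still at Y = 3: every pending impression matches,
# and the new X = 2 impressions get more noise than they asked for.
during_wait = matched(pending, y=3)

# Lowering Y to 2 before the old impressions expire would drop them.
too_early = matched(pending, y=2)
```

This is why the conversion-side change has to wait for the old impressions to expire: switching Y early silently excludes every pending X = 3 impression.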

This is not too bad, but IMO it still makes migrations a bit harder to reason about than in ARA, where the only place "budget" decisions need to occur is at conversion time.

> However, imagine that you've determined that you really need more information and you pushed out a bunch of ads that can be used in 2 conversions and you realized your noise was too low. Nothing would stop you from running two conversion reports on each device which would put the entire signal into that aggregation -- and the noise would be unchanged so you would double the signal to noise ratio.

Ack - this is fine, but depending on the API surface used for registration, it may not be the best way to do it. E.g. in ARA we integrated at the HTTP layer, so "running two conversion reports" naturally ends up looking like adding another round trip.

patcg-individual-drafts / private-ad-measurement

Per-impression privacy budgeting vs. explicit conversion limits #3