patcg-individual-drafts / private-aggregation-api

Explainer for proposed web platform API
https://patcg-individual-drafts.github.io/private-aggregation-api/

`sendHistogramReport` is not an accurate name #44

Closed yoavweiss closed 1 year ago

yoavweiss commented 1 year ago

IIUC, sendHistogramReport doesn't in fact send a report, but queues up a (single) PAHistogramContribution.

As such, we may want to consider renaming it, e.g. to aggregateHistogramContribution or appendHistogramContribution.

Aside: would it make sense to enable it to get an array of PAHistogramContributions as input?

alexmturner commented 1 year ago

Changing it definitely makes sense. Wdyt about contributeToHistogram()? This would avoid having to make the name plural if we also allow an array as input.

alexmturner commented 1 year ago

(Leaving open for the array suggestion)

xottabut commented 1 year ago

Hi Alex and Yoav, thanks for choosing a better name for the API function.

I was about to create a new issue proposing a method that accepts multiple keys/values, but then I noticed that this issue was left open for the array suggestion.

The proposal is that the browser sends multiple reports to the ad tech in a single request when they are passed in a single API method call.

Our use case:

Benefits:

To preserve privacy:

Alex, what would be your thoughts about this proposal?

xottabut commented 1 year ago

Just found the following information about batching: https://github.com/patcg-individual-drafts/private-aggregation-api#reducing-volume-by-batching

This is something that would work for us. My only concern is the truncation to 20 reports per batch; the other proposed option, splitting bigger batches into several requests with at most 20 reports each, looks better.

Is there any information on the status of this proposal? Will it be implemented?

alexmturner commented 1 year ago

Hi @xottabut! Sorry for the delay in responding.

Yes, as you saw, our current design involves consolidating/batching any contributions in a "batching scope" together into one report. We're still considering allowing an array to be passed to contributeToHistogram(), but this would just be syntactic sugar -- the effects would be identical to multiple calls each with one contribution.
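
To make the "syntactic sugar" point concrete, here is a minimal sketch. It uses a hypothetical in-memory stand-in for a batching scope: `MockBatchingScope` and its `pending` queue are illustrative assumptions, not the browser implementation. Only the `contributeToHistogram` name and the `{bucket, value}` contribution shape come from the proposal. The sketch shows that passing an array would queue the same contributions as repeated single calls, and that nothing is sent at call time.

```javascript
// Hypothetical stand-in for a batching scope; NOT the browser implementation.
class MockBatchingScope {
  constructor() {
    this.pending = []; // contributions queued for the eventual report
  }

  // Accepts one contribution or (as the proposed sugar) an array of them.
  contributeToHistogram(input) {
    const contributions = Array.isArray(input) ? input : [input];
    for (const { bucket, value } of contributions) {
      this.pending.push({ bucket, value }); // queued only; nothing is sent here
    }
  }
}

const contributions = [
  { bucket: 1n, value: 10 },
  { bucket: 2n, value: 5 },
];

// Option 1: one call per contribution.
const viaSingleCalls = new MockBatchingScope();
for (const c of contributions) viaSingleCalls.contributeToHistogram(c);

// Option 2: a single call with the whole array.
const viaArray = new MockBatchingScope();
viaArray.contributeToHistogram(contributions);

// Compare the queues (BigInt buckets are stringified for the comparison).
const serialize = (scope) =>
  scope.pending.map((c) => `${c.bucket}:${c.value}`).join(",");
const sameQueue = serialize(viaSingleCalls) === serialize(viaArray);
console.log(sameQueue);
```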

This batching plan is currently implemented in Chrome and available under an Origin Trial. You can find some developer documentation here (although I don't believe it discusses this batching): https://developer.chrome.com/docs/privacy-sandbox/private-aggregation/. You can also find some more discussion of the design here: https://github.com/patcg-individual-drafts/private-aggregation-api/issues/32. But basically, the limit of 20 was chosen to mitigate the size increase that padding will introduce.

However, we're definitely looking for feedback on the design, for example about:

It sounds like if we moved from truncation to splitting the contributions into multiple reports each with 20 (or fewer) contributions, that would work for you -- is that correct?
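
The difference between the two options can be sketched as follows. This is an illustration only: the helper names `truncateToOneReport` and `splitIntoReports` are hypothetical, and the assumption that truncation keeps the first 20 contributions is mine, not something stated in this thread.

```javascript
const MAX_CONTRIBUTIONS_PER_REPORT = 20; // limit discussed in this thread

// Truncation (assumed here to keep the first `limit` contributions):
// anything beyond the limit is dropped.
function truncateToOneReport(contributions, limit = MAX_CONTRIBUTIONS_PER_REPORT) {
  return [contributions.slice(0, limit)];
}

// Splitting: emit as many reports as needed, each with at most `limit`.
function splitIntoReports(contributions, limit = MAX_CONTRIBUTIONS_PER_REPORT) {
  const reports = [];
  for (let i = 0; i < contributions.length; i += limit) {
    reports.push(contributions.slice(i, i + limit));
  }
  return reports;
}

// 45 queued contributions in one batching scope.
const queued = Array.from({ length: 45 }, (_, i) => ({
  bucket: BigInt(i),
  value: 1,
}));

const truncatedSizes = truncateToOneReport(queued).map((r) => r.length);
const splitSizes = splitIntoReports(queued).map((r) => r.length);

console.log(truncatedSizes); // one report of 20; the other 25 are dropped
console.log(splitSizes); // reports of 20, 20 and 5; nothing is dropped
```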

Can you also share more details around your specific use case?

Thanks for the feedback! Alex

xottabut commented 1 year ago

Hi @alexmturner, thank you very much for providing more details.

Yes, splitting them into multiple batches instead of truncating would work for us. An example use case is producing 6-month reports on a daily basis. With the current state of the API, this requires a separate key for each day. On any given day, an ad event will contribute to at most ~180 keys (the case where the given ad occurs for the first time for the given Chrome instance and Chrome profile).
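
As a rough illustration of where the ~180 keys come from, here is a sketch under stated assumptions. The bucket layout (campaign ID in the high bits, day index in the low 16 bits), the helper names, and the reading that one event is counted in each of 180 consecutive daily keys are all hypothetical; the thread only says that each day needs its own key.

```javascript
const DAYS_IN_WINDOW = 180; // ~6 months, per the use case described above
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Day index = whole UTC days since the Unix epoch.
function dayIndexOf(date) {
  return Math.floor(date.getTime() / MS_PER_DAY);
}

// Hypothetical bucket layout: high bits = campaign ID, low 16 bits = day index.
function dailyBucket(campaignId, dayIndex) {
  return (BigInt(campaignId) << 16n) | BigInt(dayIndex);
}

// Assumption: a first-time event contributes to one key per day in the window.
function contributionsForEvent(campaignId, eventDate) {
  const firstDay = dayIndexOf(eventDate);
  return Array.from({ length: DAYS_IN_WINDOW }, (_, offset) => ({
    bucket: dailyBucket(campaignId, firstDay + offset),
    value: 1,
  }));
}

const eventContributions = contributionsForEvent(7, new Date(Date.UTC(2023, 0, 1)));
const distinctBuckets = new Set(eventContributions.map((c) => c.bucket)).size;
console.log(eventContributions.length, distinctBuckets);
```

With a limit of 20 contributions per report, all 180 of these would need either splitting across several reports or a larger limit.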

Based on the calculations you provided in #32 (~1.2 kB for 50 contributions and ~500 B for 20 contributions), having 50 contributions per request instead of 20 would work better for us.
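
A back-of-the-envelope comparison for ~180 contributions, using the approximate sizes quoted from #32. This sketch assumes that contributions are split rather than truncated and that every report, including a partially filled one, is padded to the full size for its limit; both are assumptions, and the helper name `estimate` is hypothetical.

```javascript
// Approximate padded report sizes quoted from #32 in this thread, in bytes.
const APPROX_BYTES_PER_REPORT = { 20: 500, 50: 1200 };

// Assumes splitting (no truncation) and full-size padding of every report.
function estimate(totalContributions, limit) {
  const reports = Math.ceil(totalContributions / limit);
  return { reports, approxBytes: reports * APPROX_BYTES_PER_REPORT[limit] };
}

const with20 = estimate(180, 20); // 9 reports, ~4500 B in total
const with50 = estimate(180, 50); // 4 reports, ~4800 B in total

console.log(with20, with50);
```

Under these assumptions the total payload is similar in both cases, and the main gain from a limit of 50 is fewer requests (4 instead of 9).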

I have another question: since a report is scheduled to be sent with a delay, will there be any batching of reports that were created from different worklet executions (i.e. different ad events)?

To summarize, for our use case of daily reports covering a duration X:

Anatolii

yoavweiss commented 1 year ago

I opened #70 to tackle the "should we accept an array of contributions" question. @xottabut - If there are still open questions around batching, I think it'd be better to open a new issue specific to those questions.

Otherwise, I think we can close this to avoid confusion, as the OP issue was resolved with #48