Open alois-bissuel opened 3 years ago
Thanks for filing:
Feel free to add to the agenda if you want to discuss in the meeting (https://bit.ly/ara-meeting-notes).
Hi, thanks for those clarifications. About our use cases with a larger number of histogram contributions: we are currently working on some solutions to try to learn our ML models using aggregated data. Let me first state that this is not an easy problem. The current "best" (or maybe "least bad" would be more accurate) solutions we have are as follows:
As a side note, we are aware that it is possible to sample the keys to keep only "a small number of keys" on each display, and that the noise would be scaled up when the number of keys is large anyway. If I am not mistaken, with Laplace noise the signal-to-noise ratio is actually equivalent. But with Gaussian (instead of Laplace) noise, allowing a large number of keys should give a much improved signal-to-noise ratio.
To summarize, this "small number of histogram contributions" limit might cause yet another big performance drop for the models we could learn; I am personally not confident that at the end of the road the models would be "good enough" to sustain a viable ecosystem.
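To make the signal-to-noise remark above concrete, here is a minimal sketch, not part of the API. It assumes a single report whose L1 budget is spread evenly over `num_keys` keys, Laplace noise calibrated to the L1 sensitivity, and Gaussian noise calibrated to the L2 sensitivity with the textbook formula sigma = L2 * sqrt(2 ln(1.25/delta)) / epsilon. The function names and the epsilon/delta values are illustrative assumptions.

```python
import math

def laplace_noise_std(l1_sensitivity, epsilon):
    """Standard deviation of Laplace noise with scale b = L1 / epsilon."""
    return math.sqrt(2) * l1_sensitivity / epsilon

def gaussian_noise_std(l2_sensitivity, epsilon, delta):
    """Textbook Gaussian-mechanism sigma (the bound is usually stated for epsilon < 1)."""
    return l2_sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon

def per_key_snr(l1_budget, num_keys, epsilon, delta):
    """Return (laplace_snr, gaussian_snr) for one report spread evenly over num_keys."""
    signal = l1_budget / num_keys            # value written to each key
    l2 = l1_budget / math.sqrt(num_keys)     # L2 norm of num_keys entries of size signal
    return (signal / laplace_noise_std(l1_budget, epsilon),
            signal / gaussian_noise_std(l2, epsilon, delta))

# Illustrative parameters only (assumptions, not values from the API).
for k in (1, 10, 100):
    lap, gau = per_key_snr(l1_budget=1.0, num_keys=k, epsilon=0.5, delta=1e-6)
    print(f"keys={k:3d}  laplace_snr={lap:.5f}  gaussian_snr={gau:.5f}")
```

Under these assumptions the per-key ratio falls like 1/k with Laplace noise but only like 1/sqrt(k) with Gaussian noise, which is the gap the comment refers to. The absolute values depend on epsilon, delta, and how many reports are aggregated.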
Thanks @AlexandreGilotte ! Yes I agree the ML use-case is a clear case where it seems useful to contribute to many aggregate keys at once. I am happy to consider changes to the API to make sure that the use-case can be done in a performant and private way.
One thing we should think about is how to make sure the entire system knows how to scale the noise. Our current design only has a single L1 sensitivity, so some of the techniques necessary for more advanced DP composition don't really work assuming worst-case input. This has great benefits in terms of simplicity (you can allocate the L1 budget however you want, and the system can be oblivious to it), but it does not maximize utility for this use-case.
This is on the agenda for the meeting today, so let's try to get into these problems :)
We discussed this in the call yesterday (minutes).
I raised two concerns with supporting events contributing to many buckets (~hundreds):
Thanks for your answer. Regarding the two concerns you voiced, may I ask for some further clarifications?
On a roughly connected subject, it seems to me that using Gaussian noise (made worthwhile because of Linf) would be uniquely suited for MPC, as the two (or more) servers could independently add their noise, and the sum is then also Gaussian. A quick look at appendix E of the DPF paper shows they use a sum of Laplace noise, which does not have this summing property.
Got it! This definitely makes sense. Here, Linf is used to control the L2 sensitivity for the Gaussian noise we apply. In other words, if Linf=1, the L2 sensitivity is bounded by sqrt(171). If there were no Linf, the L2 sensitivity would be 171.
Yeah that's exactly right. The L2 sensitivity is just sqrt(L1*Linf).
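As a small worked check of the sqrt(L1*Linf) bound, the sketch below builds the worst-case contribution vector (as many buckets as possible filled to the Linf cap, with any remainder in one more bucket) and compares its L2 norm with the bound. The function names are illustrative, and the value 171 is simply the one quoted in this thread.

```python
import math

def l2_sensitivity_bound(l1, linf):
    """Upper bound on the L2 norm of any vector with the given L1 and Linf caps."""
    linf = min(linf, l1)  # a single bucket can never exceed the whole L1 budget
    return math.sqrt(l1 * linf)

def worst_case_l2(l1, linf):
    """L2 norm of the vector that fills buckets to the Linf cap until L1 is spent."""
    linf = min(linf, l1)
    full_buckets, remainder = divmod(l1, linf)
    return math.sqrt(full_buckets * linf ** 2 + remainder ** 2)

# With the Linf cap: 171 buckets holding 1 each.
print(worst_case_l2(171, 1), l2_sensitivity_bound(171, 1))
# Without an Linf cap the whole budget can land in one bucket.
print(worst_case_l2(171, 171), l2_sensitivity_bound(171, 171))
```

When L1 is a multiple of Linf the worst case meets the bound exactly; otherwise it sits slightly below it.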
I don't think the DPF paper is doing the optimal approach. If both servers sampled from the difference of two Polya RVs the sum would be distributed according to Discrete Laplace (or two-sided geometric). This link has more details. I don't think MPC constrains the design choices here.
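The Polya construction mentioned above can be checked numerically. The sketch below is a sanity check rather than a protocol implementation. It assumes each of `num_servers` servers adds X_i - Y_i with X_i and Y_i drawn independently from a Polya (negative binomial) distribution with shape 1/num_servers and parameter `p`. It then verifies, through exact PMF convolution on a truncated support, that the shares sum to a geometric variable and that the difference of two such sums follows the discrete Laplace PMF (1-p)/(1+p) * p^|d|. The helper names and parameter values are illustrative.

```python
import math

def polya_pmf(k, r, p):
    """P(X = k) = Gamma(k + r) / (k! * Gamma(r)) * (1 - p)^r * p^k, real-valued r > 0."""
    log_coeff = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log(1 - p) + k * math.log(p))

def convolve(a, b):
    """Convolution of two PMFs on 0..len(a)-1, truncated to the same support."""
    out = [0.0] * len(a)
    for i, ai in enumerate(a):
        for j in range(len(a) - i):
            out[i + j] += ai * b[j]
    return out

def sum_of_polya_pmf(num_servers, p, size):
    """PMF of X_1 + ... + X_n with X_i ~ Polya(1/n, p), on 0..size-1."""
    share = [polya_pmf(k, 1.0 / num_servers, p) for k in range(size)]
    total = share
    for _ in range(num_servers - 1):
        total = convolve(total, share)
    return total

def difference_pmf(pmf, d):
    """P(X - Y = d) for independent X, Y with the given (truncated) PMF."""
    d = abs(d)
    return sum(pmf[k + d] * pmf[k] for k in range(len(pmf) - d))

def discrete_laplace_pmf(d, p):
    """Two-sided geometric PMF."""
    return (1 - p) / (1 + p) * p ** abs(d)

p = 0.5
total = sum_of_polya_pmf(num_servers=3, p=p, size=200)
for d in range(-3, 4):
    print(d, difference_pmf(total, d), discrete_laplace_pmf(d, p))
```

The two printed columns should agree up to floating-point and truncation error, which is negligible for this support size.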
Revisiting this, I am wondering if it is feasible for parties to advertise via some global configuration what kind of sensitivity bounding they are interested in. This would have to be global because we'd need all users to obey the same constraints, and have noise applied in a uniform way downstream in the aggregation service.
This would introduce a lot of complications, but it seems like a technique that would generally work without just picking a place in the constraint-space that's a middle ground position.
I am not sure I follow your proposal. What do you mean by parties and users here?
Parties: reporting origins
Users: devices / browsers
Essentially I am thinking of a speculative new mechanism where e.g. criteo.com hosts a file saying "Please bound my contributions such that the L2 sensitivity <= xxx".
Then browsers have a mechanism to read these files and apply the appropriate sensitivity bounds on the user contributions, instead of the default L1 bounds we have in the API today. This would increase the contributions allowed per user in cases when you are guaranteed to "spread it out" across multiple buckets.
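A minimal sketch of what such a browser-side check could look like. Everything here is hypothetical: the config shape (`{"norm": ..., "bound": ...}`), the function names, and the decision to reject rather than rescale contributions over the bound are assumptions for illustration, not part of any proposal in this thread.

```python
import math

def l1_norm(contributions):
    """Sum of absolute bucket values."""
    return sum(abs(v) for v in contributions.values())

def l2_norm(contributions):
    """Euclidean norm of the bucket values."""
    return math.sqrt(sum(v * v for v in contributions.values()))

def within_bound(contributions, config):
    """Check contributions against a hypothetical per-origin sensitivity config."""
    norm = l2_norm(contributions) if config["norm"] == "l2" else l1_norm(contributions)
    return norm <= config["bound"]

# Hypothetical configs: a default L1 bound and an L2 bound advertised by the origin.
default_config = {"norm": "l1", "bound": 100}
advertised_config = {"norm": "l2", "bound": 100}

# 100 buckets with value 10 each: L1 norm is 1000, L2 norm is 100.
spread_out = {f"bucket-{i}": 10 for i in range(100)}
print(within_bound(spread_out, default_config))     # False
print(within_bound(spread_out, advertised_config))  # True
```

The same numeric bound is used for both norms only to show how spreading across buckets fits more total contribution under an L2 constraint. In practice the two bounds would be chosen together with the noise distribution applied in the aggregation service.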
Hi, thanks again for all the proposals and the interesting discussions. We believe the conversion measurement API has great potential. There are a few limits we would like to understand though, especially in the aggregation API. In the explainer for the aggregation API, there are limitations on the contributions which can be made to the histogram. While the cap on the L1 value is obviously needed for differential privacy to work, we do not understand the limit on the number of contributions that can be made (small, e.g. 3). Also, why should the aggregate report be scoped to the tuple (source_site, attribution_destination)? We believe that there are great benefits in aggregating reports from multiple source_site values (or attribution_destination values, depending on the use case) in a single request, to lower the overall level of noise. Thanks a lot for your answer, and we would be happy to discuss this live in the next meeting if needed.