patcg / docs-and-reports

Repository for documents and reports generated by this community group

Add private single events to areas of agreement #43

Closed csharrison closed 1 year ago

csharrison commented 1 year ago

Fixes #41

Per the call on May 4, I'll leave this PR open for some time to get feedback.

ShivanKaul commented 1 year ago

Given that people using multiple devices/browsers will now not get the same match key, we can't guarantee the contribution limit per-user. Right?

csharrison commented 1 year ago

> Given that people using multiple devices/browsers will now not get the same match key, we can't guarantee the contribution limit per-user. Right?

Yeah, currently there aren't any measurement proposals that can enforce a sensitivity bound across devices, although it's something that I think we should invest some time into trying to do (e.g. this is possible to some extent with vendor-provided match keys). I tried to keep this general so that things like tightening budgets based on estimates of devices per user remain a feasible mitigation.
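
As a rough illustration of that budget-tightening mitigation, here is a hypothetical sketch; the function name and the even split (justified by basic composition) are assumptions for illustration, not part of the proposal:

```python
# Hypothetical sketch: shrink each browser's budget by an estimated
# devices-per-user ratio so the combined per-user contribution stays within
# the intended bound (by basic composition). Names and the scaling rule are
# illustrative only.
def per_browser_epsilon(per_user_epsilon: float, est_devices_per_user: float) -> float:
    assert est_devices_per_user >= 1.0
    return per_user_epsilon / est_devices_per_user

# e.g. a per-user target of epsilon = 1.0 with an estimated 2 browsers per user
# leaves each browser with epsilon = 0.5.
print(per_browser_epsilon(1.0, 2.0))  # 0.5
```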

csharrison commented 1 year ago

> That is, my contribution is, to some extent, hidden by the noise of others' contributions. We don't have a strong formalism for that, but that's a useful intuition that we might be able to rely on. My guess is that this is why aggregated values perform worse: because the noise that the training system experiences is concretely higher when aggregated, even though the formal protections remain the same.

This is true for DP-SGD-style learning because the noise is applied to the entire gradient in order to keep both the features and labels private, and the gradient can be huge. There could very well be aggregate training techniques in the "label DP" setting which outperform single-event queries; we just don't know of an algorithm yet. Generally speaking, aggregation performs better than applying noise to every input because you don't have any noise in others' contributions: you just apply an O(1) noise share to the entire aggregate.
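
A minimal sketch of that last point, comparing noise applied to every input with a single noise draw on the aggregate; the counts and noise scale are illustrative, and a real mechanism would fix the sensitivity and epsilon accounting carefully:

```python
import numpy as np

rng = np.random.default_rng(0)
values = rng.integers(0, 2, size=10_000)   # per-user contributions in {0, 1}
sensitivity, epsilon = 1.0, 1.0
scale = sensitivity / epsilon

# Noise applied to every input (local-style): total error grows with the
# number of records, because every contribution carries its own noise.
per_input = (values + rng.laplace(0.0, scale, size=values.size)).sum()

# One O(1) noise share applied to the aggregate: error is independent of n.
aggregate = values.sum() + rng.laplace(0.0, scale)

print(abs(per_input - values.sum()), abs(aggregate - values.sum()))
```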

martinthomson commented 1 year ago

That sounds like the kernel of a solution to me. In the general case, maybe you can't ensure that the gradient (or even a given dimension of it) isn't exclusively the result of the contributions of a single user. But I would be surprised if it wasn't possible to bound the sensitivity to be less than the size of the entire gradient.
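
One established way to get such a bound is the DP-SGD technique of clipping each example's gradient to an L2 norm of at most C, so a single contribution can shift the summed gradient by at most C regardless of the gradient's dimensionality. A minimal numpy sketch, with illustrative batch size, dimension, and noise scale:

```python
import numpy as np

def clip_gradient(grad: np.ndarray, clip_norm: float) -> np.ndarray:
    # Rescale the gradient so its L2 norm is at most clip_norm.
    norm = np.linalg.norm(grad)
    return grad * min(1.0, clip_norm / max(norm, 1e-12))

rng = np.random.default_rng(0)
per_example_grads = rng.normal(size=(32, 1_000))   # batch of 32, 1000-dim gradients
clipped = np.stack([clip_gradient(g, 1.0) for g in per_example_grads])

# Each clipped contribution has sensitivity at most C = 1.0, so the noise is
# calibrated to C rather than to the full size of the gradient vector.
noisy_sum = clipped.sum(axis=0) + rng.normal(0.0, 1.0, size=1_000)
```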

AramZS commented 1 year ago

It looks like we've come to a settled place on this. I think it is reasonable to merge this in at the two-week point.