Open BasileLeparmentier opened 4 years ago
That's great, Basile, thank you — I believe those are all feasible, though of course there is work to be done in hammering out the details.
Hi Michael,
I'm not sure I was clear (I see now that it isn't obvious this is just an introduction to a document), but there is a full document detailing this proposal: https://github.com/BasileLeparmentier/SPARROW/blob/master/Reporting_in_SPARROW.md
Sorry for the lack of clarity. Best, Basile
Yikes, sorry, I completely missed that! I'll look at the details soon.
Your comment still stands, and even with the full document there will be details to be hammered out, though fewer than you might have thought when you took the introduction for the whole proposal^^.
Best
Sorry for the delay on a more detailed response!
At a high level, my reactions are:
I still think we will want the "privacy-preserving mechanisms" to include something like differential privacy rather than the k-anonymity that you use in your examples. But this is a pretty well-studied problem (the database being queried is all in one place, the number of queries is limited, etc), so we can implement a best-in-class solution here.
I'm very unsure about privacy relying on "we propose that access to the reports be conditioned to a legally binding agreement that those two sources of data are never crossed". I can't see how the Gatekeeper would be in a position to know whether two parties it gave reports to were sharing information, and I don't want to even imagine what this might imply if, say, one company purchased another. I think we will be on much safer ground if we keep the threat model where the Gatekeeper pessimistically assumes that everyone is sharing the reports it gives out, and make the reports private enough that we don't mind.
I'll open separate issues about more specific questions.
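For concreteness, here is a minimal sketch of the kind of differentially private noise addition being advocated above (the Laplace mechanism applied to a count query). The function names and parameters are illustrative only, not part of any proposal:

```python
import math
import random

def laplace_noise(scale):
    """Sample from a Laplace(0, scale) distribution via the inverse CDF."""
    u = random.random() - 0.5  # uniform in [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_count(true_count, epsilon, sensitivity=1.0):
    """Return a differentially private count: the true count plus Laplace
    noise with scale sensitivity / epsilon. A smaller epsilon means more
    noise and therefore stronger privacy."""
    return true_count + laplace_noise(sensitivity / epsilon)
```

The key property is that the noise distribution is calibrated to how much one user can change the answer (the sensitivity), so an adversary who sees the noisy report cannot confidently infer any individual's contribution, even across colluding recipients.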
Hi Michael,
Thanks a lot for your feedback.
On your first point, we did propose k-anonymity because we don't think differential privacy is well suited to the use cases of online advertising. This is a complex topic and we intend to publish a blog post explaining our position, but the gist is that differential privacy is ill-suited when:
In this new reporting scheme, the privacy leaks are very close to zero, so legal agreements are not necessary. We think that solving this multi-faceted problem using solely technical means would lead to an unsatisfying solution. While I, a techie, have the same tendency to try to engineer my way out first, sometimes the last mile is handled significantly better via other means. Therefore, we should not forget that other tools are available, and that other industries handle privacy issues with Chinese walls and compliance processes rather than sheer engineering power (by the way, Google also uses this type of scheme if I am not mistaken: despite belonging to the same company, Google Ads does not have access to the browsing ID and the full browsing history of the user for targeting purposes). We should adopt the same methods when the remaining corner cases are small enough and the cost of handling them through engineering outweighs the benefit.
This legal agreement / set of predefined rules would only be there to cover the last bits that were not handled technically.
Asking the DPAs to be in charge of auditing any legal points that might be used for TURTLEDOVE/SPARROW could be an option.
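For reference, a minimal sketch of the k-anonymity guarantee this thread contrasts with differential privacy: a report satisfies k-anonymity when every combination of identifying attributes appears at least k times. The field names and helper functions below are illustrative, not SPARROW's specified mechanism:

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k rows, i.e. no report row is rarer than k."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())

def suppress_rare_rows(rows, quasi_identifiers, k):
    """Drop rows whose quasi-identifier combination occurs fewer than
    k times -- one simple way a reporting endpoint could enforce
    k-anonymity before releasing data."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return [row for row in rows
            if groups[tuple(row[q] for q in quasi_identifiers)] >= k]
```

Unlike differential privacy, this releases exact values for the surviving rows; the protection comes entirely from each released row being indistinguishable among at least k individuals, which is why the two approaches trade off accuracy and privacy so differently.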
OK, I look forward to your blog post, and further discussion on differential privacy and other approaches to meet the privacy needs.
I am indeed very interested in both technical and policy approaches — and of course trusting a Gatekeeper is itself a policy choice, from the browser's point of view. But as policy-type solutions go, I don't particularly like giving out information with privacy properties that depend on forbidding collusion.
Hi Michael,
Sorry for the delay, I was on vacation. You can find our blog post on differential privacy, and why we think it has strong limitations in the case of online advertising, here: https://github.com/Pl-Mrcy/privacysandbox-reporting-analyses/blob/master/differential-privacy-for-online-advertising.md
Best, Basile
Hi,
Thanks to the much constructive feedback on the SPARROW proposal, we are happy to propose a new version of the reporting capabilities. We believe this proposal improves on securing users' privacy without much compromise on the advertising use cases that SPARROW aims to preserve. To do so, we replaced log-level reporting with three different levels of reporting, each trading off granularity and delay to serve different advertising use cases:
With this actionable proposal, which should be precise enough to be implemented, we believe we address the concerns about privacy attacks on SPARROW, satisfying the Privacy Sandbox requirements, preserving most of the ecosystem's current capabilities, and ultimately allowing for a fair, thriving, advertising-backed Open Web. Once again, we thank the community in advance for their feedback, as it will help bolster the SPARROW proposal.
Detailed document can be found here: https://github.com/BasileLeparmentier/SPARROW/blob/master/Reporting_in_SPARROW.md