privacycg / private-click-measurement

Private Click Measurement
https://privacycg.github.io/private-click-measurement/

Tracking of individuals #83

Open annevk opened 3 years ago

annevk commented 3 years ago

The proposal does not currently prevent the tracking of individuals (whereby targeted individuals are assigned their own bit). While it cannot be abused for mass surveillance due to the relatively small number of bits, this still seems like a problematic property of the current design.

michael-oneill commented 3 years ago

Any bits received would not be linkable to individuals though, other than by source IP address, because the reports carry no cookies and are sent to a fixed .well-known URL. There are several situations like this where IP blindness is an important issue.

Here is an interesting proposal for it: https://github.com/bslassey/ip-blindness/blob/master/willful_ip_blindness.md

csharrison commented 3 years ago

The typical solution to this is to use noise to achieve some level of differential privacy, where either bits are flipped locally with some probability, or a central curator does noisy aggregation (e.g. https://github.com/WICG/conversion-measurement-api/blob/main/AGGREGATE.md).
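A minimal sketch of the local-noise option, assuming classic randomized response on a single reported bit. The function names and the `epsilon` parameter are illustrative assumptions and are not taken from PCM or the Attribution Reporting API:

```python
import math
import random


def truth_probability(epsilon: float) -> float:
    """Probability of reporting the true bit under randomized response."""
    # Assumption: epsilon > 0; larger epsilon means less noise, weaker privacy.
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Return the true bit with probability p, otherwise the flipped bit."""
    return bit if rng.random() < truth_probability(epsilon) else 1 - bit


def estimate_true_rate(reports: list[int], epsilon: float) -> float:
    """Debias the observed rate of 1s to estimate the true rate of 1s.

    observed = p * t + (1 - p) * (1 - t)  =>  t = (observed - (1 - p)) / (2p - 1)
    """
    p = truth_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Hypothetical usage: 30% of users truly have the bit set.
rng = random.Random(0)
true_bits = [1 if rng.random() < 0.3 else 0 for _ in range(100_000)]
noisy_bits = [randomized_response(b, 1.0, rng) for b in true_bits]
print(estimate_true_rate(noisy_bits, 1.0))
```

Any single noisy report is deniable, which weakens the "one bit per targeted individual" attack, while the aggregate rate can still be estimated over many reports.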

annevk commented 3 years ago

Thanks @csharrison, something like that could indeed work as I understand it.

@michael-oneill, this isn't the IP address problem. This is if the involved parties agree to set aside an attribution source ID bit for a specific individual.

michael-oneill commented 3 years ago

How else would a report link to the individual? By link I mean how could you associate that individual with any other data point. There are no cookies and the URL path is constant.

The only piece of data you could use to link to reports would be the IP address.

johnwilander commented 3 years ago

The click source can assign each of its 256 possible source ID values to one user. That would allow them to associate 4 bits of entropy with those 256 users if they convert on the click destination site. This trade-off has been made clear ever since we announced the proposal and initial implementation in 2019, and it is the reason why we say PCM does not allow for web scale cross-site tracking.

As Charlie points out, noise is in the toolbox. However, in the case of Attribution Reporting API, the click source has ~18,440,000,000,000,000,000 possible values to assign to individual users, which is designed for web scale cross-site tracking. Hence the completely different analysis there, with only 3 bits of entropy and noise on the click destination side.
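The arithmetic behind this comparison can be sketched as follows. The 256 figure comes from the comment above; treating ~18,440,000,000,000,000,000 as a 64-bit ID space (2^64) is an assumption, and the helper names are hypothetical:

```python
import math

PCM_SOURCE_ID_VALUES = 256      # stated above for PCM
ARA_SOURCE_ID_VALUES = 2 ** 64  # assumption: the ~1.844e19 figure is 2^64


def bits_of_entropy(values: int) -> float:
    """Number of bits needed to distinguish `values` equally likely IDs."""
    return math.log2(values)


def can_label_uniquely(id_values: int, users: int) -> bool:
    """True if every user could be given their own distinct source ID."""
    return id_values >= users


print(bits_of_entropy(PCM_SOURCE_ID_VALUES))
print(bits_of_entropy(ARA_SOURCE_ID_VALUES))

# Hypothetical site with one billion users.
print(can_label_uniquely(PCM_SOURCE_ID_VALUES, 1_000_000_000))
print(can_label_uniquely(ARA_SOURCE_ID_VALUES, 1_000_000_000))
```

Under these assumptions, 256 IDs can single out at most 256 targeted users, whereas a 64-bit space is large enough to give every user of any site a distinct ID.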

Aggregation is another tool in the toolbox. It doesn’t provide event level data and requires an aggregator service. It’s a very different beast but nevertheless interesting.