privacycg / private-click-measurement

Private Click Measurement
https://privacycg.github.io/private-click-measurement/

Threat model and IP tracking #18

Open korg91 opened 5 years ago

korg91 commented 5 years ago

Hi, my name is Andrea Gadotti and I’m currently doing a PhD in the Computational Privacy Group at Imperial College London. Thanks @johnwilander for the exhaustive description of the mechanism! I’ve read the technical blog post with some colleagues and we find the idea very interesting.

In the blog post, you say that:

“Critically, our solution avoids placing trust in any of the parties involved — the ad network, the merchant, or any other intermediaries — and dramatically limits the entropy of data passed between them to prevent communication of a tracking identifier.”

However, if I understand correctly, conversions (through tracking pixels) and click attributions are sent directly by the user to search.example, revealing the user’s IP address to search.example. IPv4 addresses have up to 32 bits of entropy and are indeed often used for tracking. This means that search.example could still track the user across websites with fairly high probability.
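To make the entropy comparison concrete, here is a minimal sketch. The 6-bit campaign ID and 6-bit conversion value are illustrative assumptions and are not stated in this thread; the 32-bit figure is only an upper bound for IPv4, since shared or reassigned addresses reduce it in practice. The function names are hypothetical.

```python
import math


def identifier_bits(distinct_values: int) -> float:
    """Bits of information in an identifier that is uniform over `distinct_values`."""
    return math.log2(distinct_values)


def can_single_out(bits: float, population: int) -> bool:
    """True if, in the tracker's best case, an identifier with `bits` of
    entropy has enough distinct values to give everyone in `population`
    a unique one."""
    return 2 ** bits >= population


# Assumption (illustrative only): 6 bits of campaign ID + 6 bits of conversion data.
ATTRIBUTION_BITS = 6 + 6

# Upper bound: the full IPv4 address space. Real-world entropy is lower.
IPV4_BITS = identifier_bits(2 ** 32)

if __name__ == "__main__":
    users = 1_000_000
    print("attribution data can single out a user:", can_single_out(ATTRIBUTION_BITS, users))
    print("IPv4 address can single out a user:", can_single_out(IPV4_BITS, users))
```

Under these assumptions the attribution payload alone cannot distinguish a million users, while the address that accompanies the request in principle can.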

Is this something you have considered or did we miss something?

johnwilander commented 5 years ago

Hi! IP-based tracking is not explicitly covered by the threat model as of now. However, the 24-48 hour delay is intended to, in part, introduce unreliability in terms of network state tracking.
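As a rough illustration of the delay described above, the sketch below picks a send time between 24 and 48 hours after the conversion. The uniform distribution and the name `schedule_report` are assumptions made for illustration; this thread does not specify how the browser chooses the delay.

```python
import random
from datetime import datetime, timedelta, timezone

MIN_DELAY = timedelta(hours=24)
MAX_DELAY = timedelta(hours=48)


def schedule_report(conversion_time: datetime, rng=random) -> datetime:
    """Return a send time 24 to 48 hours after `conversion_time`.

    Assumption: the delay is drawn uniformly at random from that window.
    """
    delay_seconds = rng.uniform(MIN_DELAY.total_seconds(), MAX_DELAY.total_seconds())
    return conversion_time + timedelta(seconds=delay_seconds)


if __name__ == "__main__":
    converted_at = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
    print("report scheduled for:", schedule_report(converted_at))
```

By the time the report is sent, a device on a dynamic address may have a different IP than it had at click or conversion time, which is the unreliability the delay is meant to introduce. Devices with static addresses gain nothing from it.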

Some devices obviously have a static or near-static IP address, in which case other network privacy measures need to be deployed.

Since we currently don’t have a web-scale, trustworthy clearing house, we can’t anonymize the network traffic that way.

korg91 commented 5 years ago

Thanks for your answer! I'm wondering how much of a weakness this is for the threat model in practice. I think there are at least two aspects to consider: the legal one and the technical one. As far as I know, personal IP addresses qualify as personal data under the GDPR, so they might not be legally usable for tracking. This is an important deterrent, but it applies only to people in the EU and gives no technical guarantees. On the technical side, it would be interesting to see how effective IP tracking is over a 24-48 hour time span. Do you know of any such study? Or are you aware of other legal/technical complications that make IP tracking not a real risk for ad-click attribution?

johnwilander commented 5 years ago

I don’t think web standards take legal aspects into account. At least, I’ve never seen such reasoning.

We cannot point to any public study on IP address tracking combined with the 24-48 hour delay. However, as long as the legacy way of triggering a conversion remains, any IP-based tracking for the purposes of ad click attribution is a problem of the pixel requests, rather than something new.

johnwilander commented 5 years ago

What I’m trying to say is that IP address tracking is a serious issue and whenever we have a solution/mitigation for it, this feature should make use of it. I just don’t think this spec should provide the solution.

What we probably should do though is to bring up the two issues you raise in the privacy analysis section.

korg91 commented 5 years ago

> However, as long as the legacy way of triggering a conversion remains, any IP-based tracking for the purposes of ad click attribution is a problem of the pixel requests, rather than something new.

As far as I understand, you plan to replace the legacy tracking pixel with a JavaScript API. But this would not make IP tracking any harder. Am I missing something?

> What we probably should do though is to bring up the two issues you raise in the privacy analysis section.

I think that would be great! It would probably help clarify the threat model. Of course I'm happy to collaborate on that, so feel free to reach out.

> What I’m trying to say is that IP address tracking is a serious issue and whenever we have a solution/mitigation for it, this feature should make use of it. I just don’t think this spec should provide the solution.

I'm not an expert in web standards, but I think it totally makes sense to leave IP tracking out of the threat model for this specification. However, I think this is an important aspect to consider when it comes to large-scale deployment in browsers. In the blog post you say that you intend to have Privacy Preserving Ad Click Attribution on by default in Safari. Do you intend to do this before including a feature to prevent IP tracking?

hober commented 4 years ago

We should cover this in the Privacy Considerations section.

laughinghan commented 4 years ago

Hey @korg91, I think one piece of context you might have glossed over is that this is all about third-party cookies.

Until this year, every browser other than Safari by default voluntarily sent advertiser-chosen, unlimited-entropy tracking identifiers whenever it fetched a tracking pixel. Efforts to change this default meet huge resistance in part because of its use for ad conversion measurement, so this feature presents a way to measure conversions without third-party cookies. That’s why fixing involuntary, somewhat inaccurate network-based anonymity leakage isn’t really in scope here.

Large-scale deployment of this feature in browsers without any mitigation of IP tracking can only improve privacy, by reducing the resistance to blocking third-party cookies. So I don’t think IP tracking need be considered at all before deployment, other than to note that it’s unaddressed, after which I think this ticket can be closed.

(If in your research you figure out a way for browsers to obscure IPs that’s better than Tor though, that would definitely be awesome! IP “laundering” with a P2P Tor? NATs and CGNs in particular trip that up, but ironically, are also why IP tracking can be inaccurate, so mixed blessing there I guess.)