w3ctag / privacy-principles

https://w3ctag.github.io/privacy-principles/

"low-enough granularity" geolocation is naive #281

Closed martinthomson closed 1 year ago

martinthomson commented 1 year ago

As with many things privacy-related, the naive notion is often badly wrong.

See Section 2.4:

Precise location information can be extremely sensitive [...] or it might be low-enough granularity that it is much less sensitive for many people.

Section 13.5 of RFC 6772 is something I refer people to often in these cases. This highlights how an intuition about "precision" can be badly wrong and how apparent fuzziness in data can hide a great deal of information. (This is also true for de-identification, which is routinely done poorly; but I'll open other issues about that.)

A good example is a paper written at approximately the same time that was able to recover geolocation at near-GPS levels of precision, using only location data from cell towers combined with knowledge of the terrain. (I tried to dig up a reference, but was not successful; it's been a while. I wish that we'd managed to cite it.) I have no idea what modern machine learning techniques would make of that problem, by the way.
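
To make the point about apparent fuzziness concrete, here is a minimal sketch of one well-known failure mode: if a stationary position is reported many times with fresh random fuzz each time, averaging the reports recovers the position far more precisely than any single report suggests. This is an illustration only, not taken from the RFC text or from the paper mentioned above; the one-dimensional setup, the uniform noise model, and the names `fuzz` and `estimate_position` are all assumptions made for the example.

```python
import random


def fuzz(true_position_m: float, radius_m: float, rng: random.Random) -> float:
    """Return the position shifted by a fresh uniform offset in [-radius_m, radius_m].

    Hypothetical obscuring scheme, assumed only for this illustration.
    """
    return true_position_m + rng.uniform(-radius_m, radius_m)


def estimate_position(reports: list[float]) -> float:
    """An observer's estimate: the plain average of all fuzzed reports."""
    return sum(reports) / len(reports)


rng = random.Random(0)          # fixed seed so the run is repeatable
true_position_m = 12_345.0      # hypothetical position along a line, in metres
radius_m = 1_000.0              # each report looks "accurate to about 1 km"

reports = [fuzz(true_position_m, radius_m, rng) for _ in range(10_000)]
error_m = abs(estimate_position(reports) - true_position_m)

print(f"single-report uncertainty: +/- {radius_m:.0f} m")
print(f"error after {len(reports)} reports: {error_m:.1f} m")
```

Each individual report is only good to about a kilometre, but the averaged estimate lands within a few tens of metres at most, because the independent offsets cancel out.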

jyasskin commented 1 year ago

(I renamed the issue to avoid code-of-conduct problems, which I don't think Martin really intended)

Websites often need to know which legal jurisdiction a user is in, which is low-granularity geolocation. Am I still naive to think that that level of granularity is "less sensitive for many people"? We might be missing a general warning that it's harder than expected to be sure that information is actually gone when you try to obscure it, which would apply to de-identification too. For legal jurisdiction, for example, a transition between jurisdictions reveals that the person was precisely on the border at some point during the transition.
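
The jurisdiction-transition leak can be sketched in a few lines. This is a minimal illustration, assuming an observer who only ever sees timestamped jurisdiction codes; the observation format and the function name `border_crossing_windows` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical observation: (timestamp in seconds, jurisdiction code).
Observation = Tuple[int, str]


def border_crossing_windows(
    observations: List[Observation],
) -> List[Tuple[int, int, str, str]]:
    """Return (t_before, t_after, from_code, to_code) for each jurisdiction change.

    Between t_before and t_after the person must have been exactly on the
    border between the two jurisdictions, even though every individual
    observation was coarse.
    """
    ordered = sorted(observations)
    windows = []
    for (t0, j0), (t1, j1) in zip(ordered, ordered[1:]):
        if j0 != j1:
            windows.append((t0, t1, j0, j1))
    return windows


observations = [(0, "FR"), (60, "FR"), (120, "CH"), (180, "CH")]
print(border_crossing_windows(observations))
# prints [(60, 120, 'FR', 'CH')]
```

The more frequently the coarse value is sampled, the narrower the window becomes, so the time at which the person stood on the border is pinned down more tightly.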

npdoty commented 1 year ago

I don't know that websites need to know legal jurisdiction, but I do think there is a difference in sensitivity between, for example, a country and a precise position. We should also cite that RFC and warn that even low-granularity location can in many cases be used to infer something more precise.

jyasskin commented 1 year ago

@martinthomson, please reopen this if #339 didn't fully address it. We merged that as an improvement, but we're happy to continue to improve things.