privacy-tech-lab / gpc-optmeowt

Privacy browser extension for opting out from web tracking via GPC
https://www.privacytechlab.org
MIT License
152 stars, 16 forks

Implement US Privacy flag detection #159

Closed stanleymarkman closed 3 years ago

stanleymarkman commented 3 years ago

We only have a few privacy flags so far, but there will inevitably be more, possibly with complex settings or meanings (beyond truthy/falsy, etc.). We should have an extensible format for these flags so that adding new ones is easy and does not break our understanding of previous flags. Something like this (pseudocode, not meant to be actual info about GDPR):

    {
      flag = "gdpr",
      values: {
        "1": "meanings.protected",
        "0": "meanings.nonprotected",
        "": "meanings.unset"
      },
      variants: { "GDPR", "Gdpr", "gdpr%32" },
      jurisdiction = "eu",
      legislation = "General Data Protection Regulation",
      locations: { "urlparam", "header" },
      importance = 7
    }

The numerical 'importance' value might not be a good idea, but once we have a very large number of flags, prioritizing some of them could help. It would show how binding or useful we consider a certain flag for our analysis.

Of course, every flag has a slightly different meaning, but we need to generalize them into as small a number of categories as possible, maybe just "protected data" and "unprotected data".
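To make the idea concrete, here is a minimal sketch of how the pseudocode above could look as plain JavaScript data plus a lookup by variant spelling. The names MEANINGS, PRIVACY_FLAGS, and buildVariantIndex are hypothetical, and the GDPR values are still placeholders rather than real information about the flag.

```javascript
// Hypothetical names throughout; the values mirror the pseudocode and are illustrative only.
const MEANINGS = Object.freeze({
  PROTECTED: "protected",
  NONPROTECTED: "nonprotected",
  UNSET: "unset",
});

const PRIVACY_FLAGS = [
  {
    flag: "gdpr",
    values: {
      "1": MEANINGS.PROTECTED,
      "0": MEANINGS.NONPROTECTED,
      "": MEANINGS.UNSET,
    },
    variants: ["GDPR", "Gdpr", "gdpr%32"],
    jurisdiction: "eu",
    legislation: "General Data Protection Regulation",
    locations: ["urlparam", "header"],
    importance: 7,
  },
];

// Map the canonical name and every variant spelling to its spec entry,
// so new flags only require a new array element.
function buildVariantIndex(flags) {
  const index = new Map();
  for (const spec of flags) {
    index.set(spec.flag, spec);
    for (const variant of spec.variants) {
      index.set(variant, spec);
    }
  }
  return index;
}

const variantIndex = buildVariantIndex(PRIVACY_FLAGS);
```

Looking up variantIndex.get("Gdpr") would then return the same entry as variantIndex.get("gdpr"), and an unknown name returns undefined.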

stanleymarkman commented 3 years ago

This should interact closely with the data logging mechanism. For example, if while logging traffic we see a flag in a header that we only expect in the URL parameters, the data logging mechanism should notice that and pull it out of the stream for us to look at, so that we can reevaluate the flag.
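A rough sketch of that check, assuming each flag's spec lists the locations where we expect it. EXPECTED_LOCATIONS, findFlags, and unexpectedSightings are made-up names, and the header shape (a plain object of name/value pairs) is an assumption, not how the extension necessarily receives headers.

```javascript
// Hypothetical spec: where each flag is expected to appear.
const EXPECTED_LOCATIONS = {
  gdpr: ["urlparam"],
  us_privacy: ["urlparam", "header"],
};

// Return every flag sighting in a request, tagged with its location
// and whether the spec expected the flag there.
function findFlags(requestUrl, headers) {
  const sightings = [];
  const params = new URL(requestUrl).searchParams;
  for (const [flag, expected] of Object.entries(EXPECTED_LOCATIONS)) {
    if (params.has(flag)) {
      sightings.push({
        flag,
        location: "urlparam",
        value: params.get(flag),
        expected: expected.includes("urlparam"),
      });
    }
    // Header names are case-insensitive, so compare in lower case.
    for (const [name, value] of Object.entries(headers)) {
      if (name.toLowerCase() === flag) {
        sightings.push({
          flag,
          location: "header",
          value,
          expected: expected.includes("header"),
        });
      }
    }
  }
  return sightings;
}

// Sightings in an unexpected location are the ones to pull out for review.
function unexpectedSightings(requestUrl, headers) {
  return findFlags(requestUrl, headers).filter((s) => !s.expected);
}
```

For a request to https://example.com/?us_privacy=1YNN with a GDPR header, only the header sighting would be reported, because gdpr is listed as URL-parameter-only in this made-up spec.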

stanleymarkman commented 3 years ago

Right now, the privacy flags are just stored in a little list at the top of backgroundanalysis.js with some comments explaining what they do. The checkTruthy method will also need to be adapted to work with the JSON format and to return the new "meaningfulness" data types, i.e., meanings.protected, meanings.unprotected, etc.
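As a sketch of what the adapted check might look like (this is not the existing checkTruthy, whose code is not shown here), a spec-driven lookup could return a meaning instead of a boolean. The names flagSpecs and checkMeaning are hypothetical, and the extra meanings.unknown fallback is my assumption for flags or values the spec does not cover.

```javascript
// "unknown" is an assumed fallback, not part of the proposal above.
const meanings = Object.freeze({
  protected: "protected",
  unprotected: "unprotected",
  unset: "unset",
  unknown: "unknown",
});

// Hypothetical spec; the GDPR values are placeholders.
const flagSpecs = {
  gdpr: {
    values: {
      "1": meanings.protected,
      "0": meanings.unprotected,
      "": meanings.unset,
    },
  },
};

const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Translate a raw flag value into one of the meanings using only the spec.
function checkMeaning(flagName, rawValue) {
  if (!hasOwn(flagSpecs, flagName)) {
    return meanings.unknown;
  }
  // A missing value is treated like the empty string.
  const key = rawValue == null ? "" : String(rawValue);
  const { values } = flagSpecs[flagName];
  return hasOwn(values, key) ? values[key] : meanings.unknown;
}
```

With this shape, supporting a new flag means adding an entry to flagSpecs while checkMeaning stays unchanged.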

SebastianZimmeck commented 3 years ago

> We should have an extendable format for these flags so adding new ones is easy without breaking our understanding of previous flags.

> Right now, the privacy flags are just stored in a little list at the top of backgroundanalysis.js with some comments explaining what they do.

These are good points! One way to avoid breakage and improve OptMeowt's structure is to introduce a specification layer that is separate from backgroundanalysis.js. @stanleymarkman, take a look at an example from PrivacyFlash Pro of what such a file could look like. That one is in YAML format, but the same idea applies to JSON. With such an abstraction layer, if anything in the privacy flags changes, e.g., we need to add new ones or remove outdated ones, we can just change the JSON spec while the logic for processing the flags in backgroundanalysis.js stays the same.

SebastianZimmeck commented 3 years ago

Per our discussion, @stanleymarkman has done this already, and the new spec layer will come into the main branch once we merge (issue #163).

SebastianZimmeck commented 3 years ago

This file is currently called privacy_flags.js and is located in src/data.

SebastianZimmeck commented 3 years ago

I started working on the privacy flags and wrote up some comments.

@stanleymarkman, just to clarify, the format that you are using for the flags in privacy_flags.js does not always match the specs of the flags. For example, per its spec the us_privacy string takes values like 1NYN, and there is no single value "protected". I assume this is your approach to coming up with a uniform representation that we can use internally in our extension, right?
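To illustrate the gap between the spec and an internal representation, here is a small sketch that parses a us_privacy string and collapses it to a single meaning. The four positions (version, explicit notice, opt-out of sale, LSPA covered transaction) follow the IAB US Privacy String format. The function names and the rule that only the opt-out position decides the meaning are my assumptions, not what privacy_flags.js actually does.

```javascript
// Parse a us_privacy string such as "1NYN" into its four fields.
// Returns null when the string does not match the expected shape.
function parseUsPrivacy(value) {
  const match = /^(\d)([YN-])([YN-])([YN-])$/.exec(String(value));
  if (match === null) {
    return null;
  }
  const [, version, notice, optOutSale, lspa] = match;
  return { version: Number(version), notice, optOutSale, lspa };
}

// Assumed mapping to a uniform internal meaning:
// opted out of sale => "protected", not opted out => "unprotected",
// not applicable or unparseable => "unset".
function usPrivacyMeaning(value) {
  const parsed = parseUsPrivacy(value);
  if (parsed === null) {
    return "unset";
  }
  if (parsed.optOutSale === "Y") {
    return "protected";
  }
  if (parsed.optOutSale === "N") {
    return "unprotected";
  }
  return "unset";
}
```

Under this assumed mapping, 1NYN (no explicit notice, opted out of sale, not an LSPA transaction) would be reported internally as "protected", while the other positions are dropped from the uniform representation.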

SebastianZimmeck commented 3 years ago

With issue #163 being merged, this is done.