Closed stanleymarkman closed 3 years ago
This should interact closely with the data logging mechanism. For example, if while logging traffic we see a flag in a header that is only expected in the URL params, the data logging mechanism should notice that and pull it out of the stream for us to look at, so we can reevaluate the flag.
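To make the header vs. URL-param case concrete, here is a minimal sketch of the check. The names `gdprSpec`, `findUnexpectedLocation`, and the shape of the `observation` object are hypothetical and not taken from the OptMeowt code; the `locations` field is assumed to list where a flag is expected to appear.

```javascript
// Hypothetical spec entry: the flag is only expected in URL params.
const gdprSpec = { flag: "gdpr", locations: ["urlparam"] };

// `observation` is an assumed record produced by the data logger:
// { flag, location, value }, where `location` is where the flag was seen.
// Returns a report object when the flag shows up somewhere unexpected,
// or null when there is nothing to report.
function findUnexpectedLocation(spec, observation) {
  if (observation.flag !== spec.flag) return null;
  if (spec.locations.includes(observation.location)) return null;
  return {
    flag: spec.flag,
    expected: spec.locations,
    seenIn: observation.location,
    value: observation.value,
  };
}

// A flag seen in a header although the spec only lists URL params.
const report = findUnexpectedLocation(gdprSpec, {
  flag: "gdpr",
  location: "header",
  value: "1",
});
console.log(report);
```

Anything returned here could be set aside for manual review, after which the spec entry for the flag can be updated if the new location turns out to be legitimate.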
Right now, the privacy flags are just stored in a little list at the top of backgroundanalysis.js with some comments explaining what they do. The checkTruthy method will need to be adapted to work with the json format as well, and return the new "meaningfulness" datatypes, i.e. meanings.protected, meanings.unprotected, etc.
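As a rough idea of what a spec-driven replacement for checkTruthy could look like, here is a sketch. The function name `checkMeaning`, the `exampleSpec` entry, and the extra `meanings.unknown` fallback are assumptions for illustration only; the actual checkTruthy signature is not shown in this thread.

```javascript
// Assumed set of "meaningfulness" values; `unknown` is an addition
// for raw values that the spec does not list.
const meanings = Object.freeze({
  protected: "protected",
  unprotected: "unprotected",
  unset: "unset",
  unknown: "unknown",
});

// Hypothetical spec entry mapping raw flag values to meanings.
const exampleSpec = {
  flag: "gdpr",
  values: {
    "1": meanings.protected,
    "0": meanings.unprotected,
    "": meanings.unset,
  },
};

// Looks up the meaning of a raw value in the spec instead of
// testing truthiness. Missing values are treated as the empty string.
function checkMeaning(spec, rawValue) {
  const key = rawValue === undefined || rawValue === null ? "" : String(rawValue);
  return Object.prototype.hasOwnProperty.call(spec.values, key)
    ? spec.values[key]
    : meanings.unknown;
}

console.log(checkMeaning(exampleSpec, "1"));
console.log(checkMeaning(exampleSpec, undefined));
```

With this shape, adding a new raw value only means adding a key to `values`; the lookup logic itself does not change.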
These are good points! One way to avoid breakage and improve OptMeowt's structure is to introduce a specification layer separate from backgroundanalysis.js. @stanleymarkman, take a look at an example from PrivacyFlash Pro of what such a file could look like. That one is in YAML format, but the same idea applies to JSON. With such an abstraction layer, if anything in the privacy flags changes, e.g., we need to add new ones or remove outdated ones, we can just change the JSON spec while the logic for processing the flags in backgroundanalysis.js stays the same.
Per our discussion, @stanleymarkman has done this already, and the new spec layer will come into the main branch once we merge (issue #163).
This file is currently called privacy_flags.js and lives in src/data.
I started working on the privacy flags and wrote up some comments.
@stanleymarkman, just to clarify, the format that you are using for the flags in privacy_flags.js is not always reflected in the specs of the flags. For example, the us_privacy string takes values like 1NYN, and there is no single value protected. This is probably your approach of coming up with a uniform representation that we can use internally in our extension, right?
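To illustrate how a raw us_privacy string could map onto a uniform internal value, here is a sketch. It assumes the IAB US Privacy String layout of four characters (version, notice given, opt-out of sale, LSPA covered, each of the last three being Y, N, or -), and it assumes that an opt-out of sale of Y should count as "protected". The function name `interpretUsPrivacy` and the returned labels are hypothetical and may not match what privacy_flags.js actually does.

```javascript
// Maps a us_privacy string such as "1NYN" to one internal label.
// Assumption: only the third character (opt-out of sale) decides
// the label; the other positions are validated but not interpreted.
function interpretUsPrivacy(str) {
  if (typeof str !== "string" || !/^[0-9][YN-]{3}$/.test(str)) {
    return "unknown";
  }
  const optOutOfSale = str[2];
  if (optOutOfSale === "Y") return "protected";
  if (optOutOfSale === "N") return "unprotected";
  return "unset"; // "-" means not applicable
}

console.log(interpretUsPrivacy("1NYN"));
console.log(interpretUsPrivacy("1YNN"));
console.log(interpretUsPrivacy("1---"));
```

The spec's own representation stays untouched in the traffic; only the extension's internal view is collapsed into the smaller set of labels.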
With issue #163 being merged, this is done.
We only have a few privacy flags so far, but there will inevitably be more, possibly with complex settings/meanings (beyond truthy/falsy, etc.). We should have an extendable format for these flags so adding new ones is easy without breaking our understanding of previous flags. Something like this (pseudocode, not meant to be actual info about GDPR):
{
  "flag": "gdpr",
  "values": {
    "1": "meanings.protected",
    "0": "meanings.nonprotected",
    "": "meanings.unset"
  },
  "variants": ["GDPR", "Gdpr", "gdpr%32"],
  "jurisdiction": "eu",
  "legislation": "General Data Protection Regulation",
  "locations": ["urlparam", "header"],
  "importance": 7
}
(The numerical 'importance' value might not be a good idea, but once we have a very large number of flags, prioritizing some may help; it would show how binding/useful we consider a certain flag for our analysis.) Of course, every flag has a slightly different meaning, but we need to generalize them into as small a number of categories as possible, maybe just "protected data" and "unprotected data".
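If we do keep a numerical importance, using it could be as simple as sorting the specs before analysis. A small sketch, where the flag entries and their importance numbers are made up for illustration and `byImportance` is a hypothetical helper:

```javascript
// Made-up spec entries; the importance values are placeholders.
const flagSpecs = [
  { flag: "gdpr", importance: 7 },
  { flag: "us_privacy", importance: 9 },
  { flag: "example_flag", importance: 2 },
];

// Returns a new array ordered from most to least important,
// leaving the original list unchanged.
function byImportance(specs) {
  return [...specs].sort((a, b) => b.importance - a.importance);
}

console.log(byImportance(flagSpecs).map((spec) => spec.flag));
```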