patcg / docs-and-reports

Repository for documents and reports generated by this community group

threat model: (small) group privacy #9

Open npdoty opened 2 years ago

npdoty commented 2 years ago

The threat model currently considers leaks of information about individual users. However, leaks of information about groups of users may still pose very significant privacy threats.

In particular, for small groups, revelations that some subset of the group visited a particular site could be very sensitive. For example, if a teacher learns that X of their Y students have visited a webpage about a certain health condition or procedure, the students may be very surprised and concerned that this information was revealed, even if the teacher cannot determine which student visited which site. These threats are especially relevant where there is a power asymmetry: students may be compelled to reveal additional information about themselves once some information has been disclosed to teachers or administrators.

Group privacy also affects individual privacy. If the aggregator learns with high confidence that the vast majority of a certain annotated group has visited a site or taken an action, then the aggregator can also conclude that any given user in that group likely visited that site or took that action. Annotations about the group may not be visible in the measurement protocol, but could be additional information known to the aggregator through communication with the site or app, for example.
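
To make the group-to-individual inference above concrete, here is a minimal sketch. It assumes the aggregator knows only an aggregate count for an annotated group and has no other information distinguishing members, so every member is treated as equally likely to be among the visitors. The function name and the numbers are hypothetical and are not part of any measurement protocol.

```python
from fractions import Fraction


def individual_visit_probability(visitors: int, group_size: int) -> Fraction:
    """Chance that one given group member visited, knowing only the aggregate.

    Assumption: the observer has no side information about members, so each
    member is equally likely to be one of the `visitors`.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    if not 0 <= visitors <= group_size:
        raise ValueError("visitors must be between 0 and group_size")
    return Fraction(visitors, group_size)


# Hypothetical example: 19 of 20 students visited the page.
mostly = individual_visit_probability(19, 20)
print(float(mostly))  # 0.95

# Edge case: if everyone in the group visited, the aggregate reveals
# each individual's behavior with certainty.
everyone = individual_visit_probability(20, 20)
print(everyone == 1)  # True
```

Even without identifying anyone, the aggregate alone gives high confidence about each member when the fraction is large, and the effect is most visible when the group is small enough for an observer to know every member.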

bmayd commented 2 years ago

It is reasonable to assume that parties interested in gaining insight from a dataset would take a holistic approach, using both information about individuals to increase their understanding of the group and information about the group to inform their understanding of individuals. Focusing on individuals without considering the group seems insufficient.

I think we would also do well to keep in mind that information has value because it supports inferences about a broader context: information about a segment of a population allows meaningful assumptions to be made about the rest of the population. When we know that a subset of a population shares a particular characteristic, we also know the rest of the population does not share that characteristic.

To get a meaningful understanding of potential impacts I think we need to consider what is revealed about individuals, groups and the overall context.