twosixlabs / csl

Cooperative Secure Learning

Consider other privacy definitions (down sensitivity / down privacy) #40

Closed: davidslater closed this issue 2 years ago

davidslater commented 3 years ago

In most human-understandable domains, we can provide reasonable bounds on data features (like human height, age, etc.). These bounds dramatically reduce the overall sensitivity.
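As a rough illustration, for a mean over $n$ records of a single feature known to lie in $[a, b]$ (e.g., human height in meters), the global sensitivity under the replace-one neighboring relation is bounded by the feature range:

$$
\Delta f \;=\; \max_{D \simeq D'} \left| \frac{1}{n}\sum_{x \in D} x \;-\; \frac{1}{n}\sum_{x' \in D'} x' \right| \;\le\; \frac{b - a}{n}.
$$

For raw images, the analogous bound spans the full pixel range in every dimension, so the noise needed to hide any single example becomes much larger.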

However, in domains like imagery, we cannot provide such bounds except in raw pixel space. We are therefore protecting against the possibility of inferring whether a random noise image was in the training set. It would be ideal to constrain the data to lie closer to the data manifold.

From the privacy attack standpoint, it would be fine (in my view) for an adversary to be able to determine that data point x1 was not in the training data, so long as it was not possible to determine that data point x2 was in the data set. This violates the standard definition of differential privacy, but I think the relaxation is warranted for this type of problem.
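One way to sketch this asymmetry (a rough formalization, not a definition adopted by this project): standard $\varepsilon$-DP requires, for every pair of neighboring datasets $D \simeq D'$ and every output set $S$,

$$
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S],
$$

in both directions. The relaxation described above would keep only the direction that prevents an adversary from concluding a point is present, e.g.

$$
\Pr[M(D \cup \{x_2\}) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D) \in S] \quad \text{for all } D,\ x_2,\ S,
$$

with no corresponding lower bound, so outputs may still reveal that some $x_1$ was absent from the training data.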