opendp / smartnoise-core

Differential privacy validator and runtime
MIT License

Incorrect sensitivity calculations #356

Closed dkifer closed 3 years ago

dkifer commented 3 years ago

It appears that the sensitivity calculations in this document (and possibly others) are incorrect: https://github.com/opendp/smartnoise-core/blob/develop/whitepapers/sensitivities/mean/mean.pdf

Specifically, Section 2.1 covers the add/remove version of the sensitivity of the mean. In this setting, adding or removing a person changes n (the number of people in the database), so n is itself data-dependent and cannot be used in the sensitivity calculation. The bug in the proof is that it only considers neighbors of the current dataset (from which it picks out n), whereas the sensitivity must be taken over all possible pairs of datasets that are neighbors of each other.

A secondary problem is that the mean for an empty dataset is not defined, but has to be factored into the computation of sensitivity.
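To make both points concrete, here is a small brute-force sketch (not taken from the whitepaper). It assumes values bounded in a hypothetical range `[lower, upper]` and only checks datasets whose entries sit at the two bounds, which is where the change in the mean is largest. The helper names `mean` and `worst_case_change` are made up for illustration.

```python
from itertools import product


def mean(xs):
    # Plain mean; raises ZeroDivisionError on an empty dataset.
    return sum(xs) / len(xs)


def worst_case_change(n, lower=0.0, upper=1.0):
    """Largest |mean(x) - mean(x')| where x has n records at the bounds
    and x' is x with one extra record added (add/remove neighbors)."""
    worst = 0.0
    for x in product([lower, upper], repeat=n):
        for extra in (lower, upper):
            worst = max(worst, abs(mean(x) - mean(x + (extra,))))
    return worst


# The per-size bound behaves like (upper - lower) / (n + 1): it depends on n.
changes = {n: worst_case_change(n) for n in range(1, 6)}

# Taking the maximum over all sizes (i.e. over all neighboring pairs, not just
# neighbors of one fixed dataset) gives a bound that does not shrink with n.
global_lower_bound = max(changes.values())
```

Under these assumptions, the worst case over non-empty datasets occurs at n = 1, so the sensitivity is at least `(upper - lower) / 2` regardless of how large the actual dataset is. The size-1 dataset also neighbors the empty dataset, for which `mean(())` is undefined here, so a complete analysis would additionally have to fix a convention for that case.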

Typically, for unbounded differential privacy (the add/remove-a-person version), the mean is computed by adding noise to the numerator (the sum), adding noise to the denominator (the count), and then dividing. In practice, it is best to return not just the noisy mean but also the noisy numerator and noisy denominator, since the user already "paid" the privacy cost for them.
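A minimal sketch of that approach, assuming the Laplace mechanism, an even split of the privacy budget between sum and count, and clamping to a caller-supplied `[lower, upper]`. None of these choices come from smartnoise-core; the function names are hypothetical, and the sampler is for illustration only (it is not hardened against floating-point attacks).

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. Exp(1) draws, scaled, is Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_mean(data, lower, upper, epsilon, rng=None):
    """Add/remove-DP mean via a noisy sum divided by a noisy count."""
    if epsilon <= 0 or lower > upper:
        raise ValueError("need epsilon > 0 and lower <= upper")
    rng = rng or random.Random()

    # Clamp so one record changes the sum by at most max(|lower|, |upper|).
    clamped = [min(max(x, lower), upper) for x in data]
    sum_sensitivity = max(abs(lower), abs(upper))
    count_sensitivity = 1.0  # adding/removing one record changes the count by 1

    # Assumed budget split: half of epsilon for each query.
    eps_each = epsilon / 2
    noisy_sum = sum(clamped) + laplace(sum_sensitivity / eps_each, rng)
    noisy_count = len(clamped) + laplace(count_sensitivity / eps_each, rng)

    # Division is post-processing; guard against a non-positive noisy count.
    mean = noisy_sum / noisy_count if noisy_count > 0 else None

    # Return all three values: the sum and count are already paid for.
    return {"mean": mean, "sum": noisy_sum, "count": noisy_count}
```

Because neither sensitivity depends on n, the empty dataset poses no problem here: the sum and count are both well defined (zero), and only the final division needs a guard.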