Closed BismeetSingh closed 2 years ago
In differential privacy, we add noise proportional to the spread of the data to privatise it. The bounds give the model the information it needs to calculate the spread of the data and calibrate the noise accordingly.
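As a rough illustration of why the bounds matter, here is a minimal sketch of a Laplace-noised mean. It is not the library's implementation: the function names, the example data, and the choice of the mean as the statistic are all assumptions made for illustration. It assumes values are clipped to `[lower, upper]` and that neighbouring datasets differ by replacing one record, so the sensitivity of the mean is `(upper - lower) / n`.

```python
import numpy as np


def laplace_scale_for_mean(lower, upper, n, epsilon):
    """Laplace noise scale for the mean of n values clipped to [lower, upper].

    Assumption: replacing one record changes the mean by at most
    (upper - lower) / n, which is the sensitivity used here.
    """
    sensitivity = (upper - lower) / n
    return sensitivity / epsilon


def private_mean(x, lower, upper, epsilon, rng):
    """Clip the data to the bounds, then add Laplace noise to the mean."""
    x = np.clip(x, lower, upper)
    scale = laplace_scale_for_mean(lower, upper, len(x), epsilon)
    return x.mean() + rng.laplace(0.0, scale)


rng = np.random.default_rng(0)
ages = rng.integers(18, 90, size=1000).astype(float)  # hypothetical data

# Same data and epsilon, different bounds: wider bounds mean more noise.
tight_scale = laplace_scale_for_mean(0, 100, len(ages), epsilon=1.0)
loose_scale = laplace_scale_for_mean(0, 10_000, len(ages), epsilon=1.0)

noisy_mean = private_mean(ages, 0, 100, epsilon=1.0, rng=rng)
```

With the same epsilon, bounds that are 100 times wider give a noise scale that is 100 times larger, so overly loose bounds cost accuracy, while bounds that are too tight clip real values and bias the result.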
How do we calculate these bounds?
To preserve differential privacy, this needs to be done using knowledge of the domain of the data (i.e., without looking at the data itself). If this is not possible, then the bounds can be calculated on the data itself (even though this is a violation of differential privacy). To calculate the bounds on a dataset `X`, you can use `bounds=(np.min(X, axis=0), np.max(X, axis=0))`.
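To make the two options concrete, here is a small sketch comparing bounds chosen from domain knowledge with bounds computed from the data. The feature names (age, income), the numeric limits, and the toy dataset are hypothetical; only the `(lower, upper)` tuple shape follows the snippet above.

```python
import numpy as np

# Hypothetical dataset with two features: age (years) and income.
X = np.array([
    [23.0, 41_000.0],
    [35.0, 58_000.0],
    [61.0, 120_000.0],
])

# Option 1: bounds from domain knowledge, chosen without looking at X.
# The limits here are illustrative assumptions, not recommended values.
domain_bounds = (np.array([0.0, 0.0]), np.array([120.0, 500_000.0]))

# Option 2: bounds computed from the data itself. This leaks information
# about X, so it does not preserve differential privacy.
data_bounds = (np.min(X, axis=0), np.max(X, axis=0))

# Values outside the chosen bounds would be clipped to them.
X_clipped = np.clip(X, domain_bounds[0], domain_bounds[1])
```

Either tuple has one lower and one upper value per feature. The domain-knowledge bounds are wider, so they lead to more noise, but they reveal nothing about the individual records.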
I understand what epsilon is, but how do bounds affect the model?