awslabs / fast-differential-privacy

Fast, memory-efficient, scalable optimization of deep learning with differential privacy
Apache License 2.0

Privacy for Each Data Contributing User #27

Closed sms1097 closed 6 months ago

sms1097 commented 6 months ago

Let's say I want to fine-tune a model with multiple examples from the same user. In the typical DP guarantee, we make sure no observation distinctly contributes, but what if we want to make sure no user distinctly contributes? Can I reformat the usage of DP-SGD such that I can group the privacy by a set of observations rather than a single observation?

woodyx218 commented 6 months ago

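What you are describing is termed "user-level DP". One approach is to directly use the per-sample clipped DP-SGD as in this library, which also guarantees user-level DP (through group privacy, although the guarantee weakens as the number of examples per user grows). Another approach is to reformat DP-SGD and use per-user gradient clipping instead: aggregate each user's gradients first, then clip the aggregate. Per-user clipping is usually worse in accuracy but easier/faster to compute.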
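
To make the second approach concrete, here is a minimal sketch of one per-user clipped gradient step in plain Python. It is not this library's API: the function name `per_user_clipped_gradient` and its parameters (`per_sample_grads`, `user_ids`, `clip_norm`, `noise_multiplier`) are hypothetical and chosen for illustration. The sketch also assumes each user's gradients are summed before clipping and that the number of users in the batch can be treated as public.

```python
import math
import random
from collections import defaultdict


def per_user_clipped_gradient(per_sample_grads, user_ids, clip_norm,
                              noise_multiplier, rng=None):
    """Hypothetical helper: one noisy gradient estimate with per-user clipping."""
    rng = rng or random.Random()
    dim = len(per_sample_grads[0])

    # 1. Sum the per-sample gradients that belong to the same user.
    user_grads = defaultdict(lambda: [0.0] * dim)
    for grad, uid in zip(per_sample_grads, user_ids):
        acc = user_grads[uid]
        for i, g in enumerate(grad):
            acc[i] += g

    # 2. Clip each user's summed gradient to L2 norm <= clip_norm,
    #    then add the clipped gradients together.
    total = [0.0] * dim
    for grad in user_grads.values():
        norm = math.sqrt(sum(g * g for g in grad))
        scale = min(1.0, clip_norm / (norm + 1e-12))
        for i, g in enumerate(grad):
            total[i] += scale * g

    # 3. Add Gaussian noise scaled to the per-user bound clip_norm and
    #    average over users (assumption: the user count is public).
    sigma = noise_multiplier * clip_norm
    n_users = len(user_grads)
    return [(t + rng.gauss(0.0, sigma)) / n_users for t in total]


# Toy batch: user "a" contributes two examples, user "b" contributes one.
grads = [[3.0, 4.0], [3.0, 4.0], [0.3, 0.4]]
users = ["a", "a", "b"]
update = per_user_clipped_gradient(grads, users, clip_norm=1.0,
                                   noise_multiplier=0.0)
```

With the noise switched off as above, user "a" is scaled down to norm 1 no matter how many examples they contributed, while user "b" passes through unchanged. Because only one vector per user is clipped, the sensitivity of the summed gradient is bounded per user rather than per example, so the noise is calibrated to the user-level guarantee directly.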