Closed: sms1097 closed this 9 months ago
Let's say I want to fine-tune a model with multiple examples from the same user. The typical DP guarantee ensures that no single observation's contribution can be distinguished, but what if I want to ensure that no single user's contribution can be distinguished? Can I reformulate the usage of DP-SGD so that the unit of privacy is a set of observations rather than a single observation?

What you are describing is termed "user-level DP". One approach is to directly use the per-sample clipped DP-SGD as in this library, which also guarantees user-level DP via group privacy, with the guarantee degrading with the maximum number of examples per user. Another approach is to reformulate DP-SGD to clip per-user gradients instead of per-sample gradients. This is usually worse in accuracy but easier/faster to compute; a sketch of this second approach is below.
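As a rough illustration of the per-user clipping variant, here is a minimal sketch of one step in plain PyTorch. This is not this library's API; the function `user_level_dp_sgd_step`, the `batches_by_user` structure, and the parameter names are hypothetical, and the sketch assumes every trainable parameter receives a gradient.

```python
import torch

def user_level_dp_sgd_step(model, loss_fn, batches_by_user,
                           clip_norm=1.0, noise_multiplier=1.0, lr=0.1):
    # batches_by_user: list of (inputs, targets) pairs, one entry per user,
    # where each pair holds all of that user's examples in the batch.
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    for inputs, targets in batches_by_user:
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        # Clip the user's aggregate gradient, so each user's total
        # contribution to the step has L2 norm at most clip_norm.
        total_norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in params))
        scale = (clip_norm / (total_norm + 1e-6)).clamp(max=1.0)
        for s, p in zip(summed, params):
            s += scale * p.grad

    # Gaussian noise calibrated to the per-user sensitivity clip_norm;
    # removing one user changes the sum by at most clip_norm in L2 norm.
    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = noise_multiplier * clip_norm * torch.randn_like(s)
            p -= (lr / len(batches_by_user)) * (s + noise)
```

The key difference from standard DP-SGD is that clipping happens once per user, after summing that user's per-example gradients, so the sensitivity (and hence the noise) is defined with respect to adding or removing an entire user. Grouping the batch by user ID before each step is what changes the unit of privacy from one example to one user.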