pytorch / opacus

Training PyTorch models with differential privacy
https://opacus.ai
Apache License 2.0

Opacus uses unsafe floating-point noise generation, even with secure_mode=True #616

Closed TedTed closed 2 months ago

TedTed commented 6 months ago

🐛 Bug

The _generate_noise primitive has a secure_mode parameter. The documentation claims that when it is set to True, the noise distribution is secure against floating-point attacks. This is wrong for two reasons.

Solutions to this vulnerability include approaches based either on discretization (like what GoogleDP does) or on interval refining (like what Tumult Analytics does).
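To make the discretization idea concrete, here is a minimal, self-contained sketch in the style of the Canonne-Kamath-Steinke discrete Gaussian (the construction underlying GoogleDP's approach). The function names and structure are my own illustration, not Opacus or GoogleDP code: it samples a discrete Gaussian over the integers using only integer randomness and exact rational arithmetic, so no floating-point value is ever produced or compared during sampling.

```python
# Illustrative discretization-based sampler (not library code): exact discrete
# Gaussian over the integers, built from exact Bernoulli and geometric samplers.
import math
import secrets
from fractions import Fraction


def bernoulli(p: Fraction) -> bool:
    """Exact Bernoulli(p) for rational p in [0, 1], using integer randomness only."""
    assert 0 <= p <= 1
    return secrets.randbelow(p.denominator) < p.numerator


def bernoulli_exp_neg(gamma: Fraction) -> bool:
    """Exact Bernoulli(exp(-gamma)) for rational gamma >= 0."""
    # Reduce gamma > 1 via exp(-gamma) = exp(-1)**floor(gamma) * exp(-frac(gamma)).
    while gamma > 1:
        if not bernoulli_exp_neg(Fraction(1)):
            return False
        gamma -= 1
    # For gamma in [0, 1]: let K be the first k with Bernoulli(gamma / k) = 0;
    # then P(K is odd) = exp(-gamma).
    k = 1
    while bernoulli(gamma / k):
        k += 1
    return k % 2 == 1


def geometric(t: int) -> int:
    """Number of consecutive Bernoulli(exp(-1/t)) successes before the first failure."""
    g = 0
    while bernoulli_exp_neg(Fraction(1, t)):
        g += 1
    return g


def discrete_laplace(t: int) -> int:
    """Exact two-sided geometric sample with P(x) proportional to exp(-|x| / t)."""
    return geometric(t) - geometric(t)


def discrete_gaussian(sigma_sq: Fraction) -> int:
    """Exact discrete Gaussian N_Z(0, sigma^2), by rejection from a discrete Laplace."""
    t = math.isqrt(sigma_sq.numerator // sigma_sq.denominator) + 1  # roughly floor(sigma) + 1
    while True:
        y = discrete_laplace(t)
        # Accept with probability exp(-(|y| - sigma^2/t)^2 / (2 sigma^2)), computed exactly.
        gamma = (abs(y) - sigma_sq / t) ** 2 / (2 * sigma_sq)
        if bernoulli_exp_neg(gamma):
            return y
```

A mechanism built on this would first round the statistic to a fixed grid (e.g. multiples of some granularity), add the integer noise scaled by that granularity, and account for the rounding in the sensitivity analysis. The interval-refining approach used by Tumult Analytics avoids floating-point issues by a different route.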

Additional context

As an aside, I do not understand why it makes sense to even have a secure_mode parameter, especially if the default is False. In which context does it make sense to use a DP library but not want the output to be actually DP?

HuanyuZhang commented 4 months ago

Thanks a lot to @TedTed for raising this to us. We totally agree that the current mitigations in Opacus are just an initial step towards defending against the various types of attacks on differential privacy. Note that floating-point attacks are just one of these; there are many other types (e.g., timing attacks). We will try to make Opacus more robust when we have the bandwidth.

HuanyuZhang commented 4 months ago

@TedTed's second point is very interesting, and here is my POV: DP stands as a captivating theoretical construct. Nonetheless, its practical application necessitates certain concessions. For instance, the privacy amplification for DP-SGD only holds under "Poisson" subsampling. Despite this, many practical applications employ it with different subsampling techniques, resulting in an estimate of the theoretical epsilon that still serves as a reasonable approximation.
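To make the Poisson point concrete, here is a rough sketch of what the amplification analysis assumes (illustrative only; the function below is a simplification, not the sampler Opacus ships): each example joins a given batch independently with probability sample_rate, so batch sizes fluctuate from step to step, whereas many real pipelines use fixed-size shuffled batches.

```python
# Illustrative Poisson subsampling for a plain PyTorch setup; a simplified
# sketch, not Opacus's data-loading code.
import torch

def poisson_subsample(dataset_size: int, sample_rate: float, generator=None):
    """Indices of one batch: each example included independently with prob. sample_rate."""
    mask = torch.rand(dataset_size, generator=generator) < sample_rate
    return mask.nonzero(as_tuple=False).flatten().tolist()

# Expected batch size is sample_rate * dataset_size, but the realized size
# varies from step to step (it can even be zero).
indices = poisson_subsample(dataset_size=60_000, sample_rate=256 / 60_000)
```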

Moreover, Opacus primarily serves as a platform for rapid prototyping and experimentation with novel ideas and algorithms. While it's feasible to integrate all essential mitigations, doing so could lead to other trade-offs such as diminished running speed, increased memory usage, or strain on developers' bandwidth, which would slow down the development of other important features.

TedTed commented 4 months ago

Thanks for the comment. Two notes.

  1. Floating-point vulnerabilities are fundamentally different from timing attacks, in that timing attacks only have a practical impact when the attacker can measure computation time. This is only the case for interactive use cases (like someone sending DP queries to data they don't have access to), and I'm not sure this ever makes sense for machine learning use cases. Floating-point vulnerabilities, on the other hand, can lead to real-world impact even if the attacker has no control over the training process.

  2. I'm surprised to hear that Opacus is primarily meant for prototyping and experimentation, and that the (ε,δ) guarantees can be approximations and not upper bounds. None of this seems to be prominently indicated in the documentation, which even suggests otherwise in multiple places. The FAQ says "Importantly, (epsilon, delta)-DP is a conservative upper bound on the actual privacy loss." The introduction blog post mentions "Safety" as a core feature of Opacus. The version number, 1.0, suggests this is mature software that can run in production. As a result of all of this, Opacus ends up being used by other software libraries (like SmartNoise-Synth) and repackaged by vendors of synthetic data generation or private machine learning training software. What are your plans to make sure downstream users of your software are aware of its safety limitations?

HuanyuZhang commented 4 months ago

Thanks for your comments! I am not yet persuaded that the floating-point attack is the top thing we should worry about right now, or that there is a simple mitigation that incurs only minimal memory and QPS degradation. However, I do agree that better documentation is essential and would be helpful to the community. I will write a small post and add inline comments in the next version. Thanks again for raising this issue to us.

TedTed commented 2 months ago

I see this is now closed, but I'm not seeing any new post discussing this, the FAQ hasn't changed, and the documentation hasn't changed either. Was this closed by mistake?