pytorch / opacus

Training PyTorch models with differential privacy
https://opacus.ai
Apache License 2.0

Discovering the meaning of ε-DP. #26

Closed sergevanhaag2 closed 4 years ago

sergevanhaag2 commented 4 years ago

name: "Feature Request" about: Submit a proposal/request for a new feature


Feature

This tool can guarantee certain values of ε; however, the meaning of this ε depends on the context. Is there a way to understand what level of privacy a given ε actually provides? The original paper on ε-DP (which is a different notion from the Rényi DP implemented here) defines it as follows:

Given ε ≥ 0, a mechanism M is ε-differentially private if, for any two neighboring databases D and D′ and for any subset S ⊆ R of outputs: Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S].

As I said, this differs from Rényi DP and also from (ε, δ)-DP, but the tenor is similar. Is it possible to find out the values of Pr[M(D) ∈ S] and Pr[M(D′) ∈ S] using PyTorch-DP? If the bound e^ε · Pr[M(D′) ∈ S] exceeds 1, the ε seems to lose all meaning, since any probability is smaller than or equal to a bound that is greater than 1. Alternatively, is there a similar statistic that can be used to say something meaningful about what real level of privacy this ε ensures?

ilyamironov commented 4 years ago

Sorry for not coming back to your questions earlier.

Let me unpack. First, you ask about the guarantees offered by (α, ε)-RDP compared with ε-DP. Second, you ask what happens when ε is so large that e^ε · Pr[M(D′) ∈ S] > 1.

You are indeed correct that our privacy objective is different from ε-DP, the so-called pure differential privacy. Internally, we use the RDP accountant, which tracks (α, ε)-RDP for multiple values of α. Each such (α, ε)-RDP guarantee converts into an (ε′, δ)-DP guarantee: for any δ ∈ (0, 1), (α, ε)-RDP implies (ε + log(1/δ)/(α − 1), δ)-DP, and the accountant reports the smallest such ε′ over the tracked values of α.

In turn, (ε, δ)-DP is a relaxation of pure differential privacy to Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S] + δ. Does this help?
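
For concreteness, here is a minimal Python sketch of that conversion. It is purely illustrative, not the actual Opacus accountant code, and the (α, ε_α) values below are made-up numbers:

```python
import math

def rdp_to_dp(rdp_pairs, delta):
    """Convert a list of (alpha, eps_alpha) RDP guarantees into a single
    (epsilon, delta)-DP guarantee via eps = eps_alpha + log(1/delta) / (alpha - 1),
    minimized over the tracked orders alpha."""
    best_eps, best_alpha = float("inf"), None
    for alpha, eps_alpha in rdp_pairs:
        eps = eps_alpha + math.log(1.0 / delta) / (alpha - 1.0)
        if eps < best_eps:
            best_eps, best_alpha = eps, alpha
    return best_eps, best_alpha

# Hypothetical RDP values at a few orders alpha (illustrative only).
rdp_pairs = [(1.5, 0.05), (2.0, 0.08), (4.0, 0.20), (8.0, 0.45), (16.0, 1.00)]
eps, alpha = rdp_to_dp(rdp_pairs, delta=1e-5)
print(f"({eps:.2f}, 1e-5)-DP, achieved at alpha = {alpha}")
```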

Now, you ask about ε so high that e^ε · Pr[M(D′) ∈ S] > 1. Does that really make the privacy guarantee meaningless? The answer is no, because you should consider instead the complementary event M(D) ∉ S. Applying the definition to the complement set, with the roles of D and D′ swapped (the definition is symmetric in D and D′), gives Pr[M(D′) ∉ S] ≤ e^ε · Pr[M(D) ∉ S], i.e. Pr[M(D) ∉ S] ≥ e^(-ε) · Pr[M(D′) ∉ S]. Therefore, Pr[M(D) ∈ S] = 1 − Pr[M(D) ∉ S] ≤ 1 − e^(-ε) · (1 − Pr[M(D′) ∈ S]).

For instance, if ε = 1 and Pr[M(D′) ∈ S] = 0.5, the straightforward bound Pr[M(D) ∈ S] ≤ e · 0.5 ≈ 1.36 is indeed vacuous. The bound we just derived gives Pr[M(D) ∈ S] ≤ 1 − e^(-1) · (1 − 0.5) ≈ 0.82.
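
If you want to play with these numbers, here is a small snippet (again, just an illustration, not library code) that evaluates both bounds:

```python
import math

def dp_upper_bounds(eps, p_prime):
    """Two upper bounds on Pr[M(D) in S] implied by eps-DP,
    given p_prime = Pr[M(D') in S]."""
    forward = math.exp(eps) * p_prime                     # e^eps * p', may exceed 1
    complement = 1.0 - math.exp(-eps) * (1.0 - p_prime)   # 1 - e^(-eps) * (1 - p'), always < 1 for p' < 1
    return forward, complement

forward, complement = dp_upper_bounds(eps=1.0, p_prime=0.5)
print(f"forward bound    = {forward:.2f}")    # ~1.36, vacuous
print(f"complement bound = {complement:.2f}") # ~0.82
```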