dvadym commented 3 years ago

Context

The privacy budget accounting is an important feature of the DP aggregation system. On the issue BudgetAccountant was implemented. The BudgetAccountant uses the naive composition, namely the total budget is the sum of budgets:

(𝜺_1, 𝛿_1) + (𝜺_2, 𝛿_2) = (𝜺_1+𝜺_2,𝛿_1+𝛿_2).

The downside of using the naive composition is that the total budget is growing linearly with the number of DP aggregations. There are others, more advanced ways of composing budgets, such that the total budget is growing slower than linearly in the number of DP aggregations.

One of those ways is PLD (privacy loss distributions, math details), which is implemented in dp_accounting library.

Goals

Implementation class PLDBudgetAccountant, which uses dp_accounting library.

class MechanismType(Enum):
   LAPLACE = 1
   GAUSSIAN = 2
   # more mechanisms will be added later

class PLDBudgetAccountant:
   def __init__(self, total_eps : double, total_delta : double):
   def request_budget(self, weight : double, mechanism_type : MechanismType, sensitivity:float) -> Budget
   def compute_budgets(self):

class PLDBudget:
   @property def noise_parameter()

The idea is that PLDBudgetAccountant collects all requests on budget usage from different DP operations (the same way as BudgetAccountant). request_budget() returns a PLDBudget object which does not have yet noise_parameter. And then when the pipeline is constructed, (i.e. all DP operations are known) compute_budget is called and it computes the minimum noise for each DP aggregation, such that the total budget is not more than (total_eps, total_delta)

Note: for some DP operations it might be more important to get more accurate result, that is why weight parameter is used.

Notes on implementation

Note: In this task we can consider PLD as a blackbox, with the following properties

a PLD object corresponds to a DP mechanism
there is a way to compose PLD object and the result is a PLD object, which corresponds to composition of corresponding DP mechanisms
PLD allows to compute eps for given delta.

Using PLD for composing of DP mechanisms

Let input is 0<= delta < 1 and DP mechanisms (M_1, ..., M_n)

Example: M_1 is a gaussian mechanism with std = 2, M_2 is laplace mechanism with std = 1 etc

Goal: to compute minimal eps, such that composition of (M_1, ..., M_n) is (eps, delta)-DP.

With PLD it works the following way:

Convert M_i -> PLD_i (using methods from_laplace_mechanism and from_gaussian_mechanism ).
Use method compose to get pld which correspond to the composition
Compute eps = pld.get_epsilon_for_delta(delta)

Note: For PLD the language of (eps, delta) is not natural. There is not much sense to talk about (eps, delta) of a specific mechanism, it's better to think about (eps, delta) of the whole pipeline.

Numerically calibrating noise level

In contrast to the previous subsection, we don't have noise parameters. We would like to find them. Assume that in the setup of the previuos subsection:

we have mechanism (M_1, ..., M_n), with known types (Laplace or Gaussian for now) and a standard deviation of M_i is x/weight_i , where x > 0 is unknown.

The goal is to find minimum x, such that the composition of (M_1, ..., M_n) is (eps, delta)-DP.

Using approach of the previous subsection, for each x > 0 we can compute eps(x). Note, that *eps(x) is decreasing function. So we can do a binary search by x.

Links

dp_accounting library
(optional, more details on PLD) David M Sommer, Sebastian Meiser, and Esfandiar Mohammadi. Privacy loss classes: The central limit theorem in differential privacy. Proceedings on Privacy Enhancing Technologies, 2019.

jspacek commented 3 years ago

👋 @dvadym I'm interesting in working on this issue, and paired programming would be helpful 🍐 !

dvadym commented 3 years ago

Thanks!

dvadym commented 3 years ago

Note: This is a description of an extension of PLD-based accountant.

Support of generic (eps, delta)-dp mechanisms

Context

Now we have support of Laplace and Gaussian mechanism, but there are more DP mechanisms. This feature is about allowing to support, any mechanism, for which only (eps, delta) parameters are known.

The dp_accounting library supports creating PLD for (eps, delta) mechanisms (link).

Notes:

Usually PLD provides a better utility when DP mechanism type is known and supported by PLD (currently Laplace, Gaussian and Randomized response are supported by dp_accounting library). But it's not always the case.
Private partition selection is a good use case where support of (eps, delta)-dp mechanisms are needed.

Changes

Add a new element in MechanismType.
Add provide a way to return eps, delta in MechanismSpec (or maybe a new class?).
Incorporate generic (eps, delta) mechanisms into _find_minimum_noise_std - there should be some conversion between (eps, delta) and std of the noise (happy to chat about that).

Calibrating std of the noise vs (eps, delta) (new section)

The current calibration works that way:

The function compose_distributions maps std -> PLD, and
Then we get eps, from the obtained PLD.
That's we have the function std -> eps, which is decreasing and so we can do binary search

But now we'd like to introduce a generic (eps, delta)-dp mechanism, which doesn't have natural definition of std. There are multiple ways to do that. In principle any way, which maps "std" -> (eps, delta) in some decreasing way would be enough to build 1d optimization problem with a decreasing goal function. One of the possible ways is the following:

Pretend that (eps0, delta0) correspond to Laplace mechanism. We need to find (eps0, delta0) in compose_distribtuions from std. The parameter of this Laplace mechanism is
```
b = 1/eps0 => eps0 = 1/b
std = sqrt(2)b => b = std/sqrt(2), 
```
Hence eps0 = 1/b = sqrt(2)/std
And let's take delta0 is to be proportional to eps0, i.e. eps0/eps_total = delta0/delta_total => delta0 = eps0/eps_total*delta_total
In compose_distribtuions function let's use from_privacy_parameters to create PLD for (eps0, delta0)-dp mechanism.
The rest of the calibration is the same.

Note: As mentioned there are different ways how to map std -> (eps, delta), which provides different utility tradeoffs between Laplace/Gaussian vs (eps, delta)-dp generic mechanism.

dvadym commented 3 years ago

@jspacek FYI I've updated a link how to create PLD from (eps, delta) guarantees (step 3 in description)

OpenMined / PipelineDP

PLD based Budget Accountant #35

Context

Goals

Notes on implementation

Using PLD for composing of DP mechanisms

Numerically calibrating noise level

Links

Support of generic (eps, delta)-dp mechanisms

Context

Changes

Calibrating std of the noise vs (eps, delta) (new section)