Closed dvadym closed 2 years ago
@dvadym I'm interested in working on this issue, and pair programming would be helpful!
Thanks!
Note: This is a description of an extension of the PLD-based accountant.
We now support the Laplace and Gaussian mechanisms, but there are more DP mechanisms. This feature is about supporting any mechanism for which only the (eps, delta) parameters are known.
The dp_accounting library supports creating a PLD for (eps, delta) mechanisms (link).
Notes:
_find_minimum_noise_std
- there should be some conversion between (eps, delta) and the std of the noise (happy to chat about that). The current calibration works this way:
But now we'd like to introduce a generic (eps, delta)-DP mechanism, which doesn't have a natural definition of std. There are multiple ways to do that. In principle, any mapping "std" -> (eps, delta) in which eps and delta decrease as std grows is enough to build a 1D optimization problem with a decreasing goal function. One possible way is the following:
Pretend that (eps0, delta0) corresponds to a Laplace mechanism. We need to find (eps0, delta0) from std in compose_distributions. The parameter b of this Laplace mechanism (with sensitivity 1) satisfies
b = 1/eps0 => eps0 = 1/b
std = sqrt(2)b => b = std/sqrt(2),
Hence
eps0 = 1/b = sqrt(2)/std
And let's take delta0 to be proportional to eps0, i.e.
eps0/eps_total = delta0/delta_total => delta0 = eps0/eps_total*delta_total
In the compose_distributions function, let's use from_privacy_parameters to create a PLD for the (eps0, delta0)-DP mechanism.
The rest of the calibration is the same.
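The mapping above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the derivation (sensitivity 1, delta0 proportional to eps0); the helper name `std_to_eps_delta` is hypothetical and is not part of PipelineDP or dp_accounting.

```python
import math


def std_to_eps_delta(std, eps_total, delta_total):
    """Maps a pseudo-std to (eps0, delta0) as if the mechanism were Laplace.

    Hypothetical helper for illustration only.
    """
    if std <= 0:
        raise ValueError("std must be positive")
    b = std / math.sqrt(2)  # Laplace scale, from std = sqrt(2) * b
    eps0 = 1 / b  # from b = 1 / eps0 (sensitivity 1)
    # Keep the ratio delta0 / eps0 equal to delta_total / eps_total.
    delta0 = eps0 / eps_total * delta_total
    return eps0, delta0


eps0, delta0 = std_to_eps_delta(
    std=2 * math.sqrt(2), eps_total=1.0, delta_total=1e-6)
print(eps0, delta0)
```

Both eps0 and delta0 shrink as std grows, which is the monotonicity the calibration needs. The resulting pair would then be passed to from_privacy_parameters, as described above.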
Note: As mentioned, there are different ways to map std -> (eps, delta), which provide different utility tradeoffs between Laplace/Gaussian and the generic (eps, delta)-DP mechanism.
@jspacek FYI I've updated the link on how to create a PLD from (eps, delta) guarantees (step 3 in the description)
Context
Privacy budget accounting is an important feature of the DP aggregation system. On the issue BudgetAccountant was implemented. The BudgetAccountant uses naive composition, namely the total budget is the sum of the budgets:
(ε_1, δ_1) + (ε_2, δ_2) = (ε_1 + ε_2, δ_1 + δ_2).
The downside of naive composition is that the total budget grows linearly with the number of DP aggregations. There are other, more advanced ways of composing budgets, under which the total budget grows more slowly than linearly in the number of DP aggregations.
One of those ways is PLD (privacy loss distributions, math details), which is implemented in the dp_accounting library.
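As a small illustration of the linear growth, naive composition can be written as a plain sum over (eps, delta) pairs. The function name `naive_compose` is made up for this sketch and is not the BudgetAccountant API.

```python
def naive_compose(budgets):
    """Sums (eps, delta) pairs, as naive composition prescribes."""
    total_eps = sum(eps for eps, _ in budgets)
    total_delta = sum(delta for _, delta in budgets)
    return total_eps, total_delta


# Ten identical aggregations cost ten times the budget of one.
budgets = [(0.1, 1e-8)] * 10
total_eps, total_delta = naive_compose(budgets)
print(total_eps, total_delta)
```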
Goals
Implement a class PLDBudgetAccountant, which uses the dp_accounting library.
The idea is that PLDBudgetAccountant collects all budget usage requests from the different DP operations (the same way as BudgetAccountant). request_budget() returns a PLDBudget object which does not yet have a noise_parameter. Then, when the pipeline is constructed (i.e. all DP operations are known), compute_budget is called and computes the minimum noise for each DP aggregation such that the total budget is not more than (total_eps, total_delta).
Note: for some DP operations it might be more important to get a more accurate result, which is why the weight parameter is used.
Notes on implementation
Note: In this task we can consider PLD as a black box with the following properties
Using PLD for composing DP mechanisms
Let the input be 0 <= delta < 1 and DP mechanisms (M_1, ..., M_n)
Example: M_1 is a Gaussian mechanism with std = 2, M_2 is a Laplace mechanism with std = 1, etc.
Goal: compute the minimal eps such that the composition of (M_1, ..., M_n) is (eps, delta)-DP.
With PLD it works the following way:
Note: For PLD the language of (eps, delta) is not natural. It does not make much sense to talk about the (eps, delta) of a specific mechanism; it's better to think about the (eps, delta) of the whole pipeline.
Numerically calibrating noise level
In contrast to the previous subsection, we don't have the noise parameters and would like to find them. Assume that, in the setup of the previous subsection,
we have mechanisms (M_1, ..., M_n) with known types (Laplace or Gaussian for now), and the standard deviation of M_i is x/weight_i, where x > 0 is unknown.
The goal is to find the minimum x such that the composition of (M_1, ..., M_n) is (eps, delta)-DP.
Using the approach of the previous subsection, for each x > 0 we can compute eps(x). Note that eps(x) is a decreasing function, so we can do a binary search over x.
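The binary search can be sketched as follows. To keep the example self-contained, eps(x) here is a stand-in: the naive composition of Laplace mechanisms with std = x/weight_i, not a PLD computation. A PLD-based eps(x) would be plugged in instead. The names `naive_laplace_eps` and `find_minimum_x` are hypothetical.

```python
import math


def naive_laplace_eps(x, weights):
    """Placeholder for eps(x): naive composition of Laplace mechanisms.

    Mechanism i has std = x / weight_i, so eps_i = sqrt(2) * weight_i / x.
    """
    return sum(math.sqrt(2) * w / x for w in weights)


def find_minimum_x(eps_of_x, target_eps, lo=1e-6, hi=1.0, tol=1e-9):
    """Binary search for the smallest x with eps_of_x(x) <= target_eps.

    Assumes eps_of_x is decreasing in x.
    """
    # Grow the upper bound until it satisfies the target.
    while eps_of_x(hi) > target_eps:
        lo, hi = hi, hi * 2
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2
        if eps_of_x(mid) <= target_eps:
            hi = mid  # mid is enough noise, try less
        else:
            lo = mid  # mid is too little noise
    return hi


weights = [1.0, 2.0]
x = find_minimum_x(lambda v: naive_laplace_eps(v, weights), target_eps=1.0)
print(x, [x / w for w in weights])  # x and the per-mechanism std
```

The search returns the upper end of the final interval, so the returned x always satisfies the target budget.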
Links