google / differential-privacy

Google's differential privacy libraries.
Apache License 2.0

addGaussianInt64 from scale for C++ #76

Closed. degregat closed this issue 3 years ago

degregat commented 3 years ago

Dear privacy team,

I need a function like addGaussianInt64, but in C/C++, for use in a version of DPSGD.

Ideally, the interface would expose an additive mechanism that takes only an input value and a scale at execution time, and returns a noised value together with the epsilon spent (to track with an accountant). The L2 sensitivity (or stddev) and delta are known at initialization.

The closest I could find was AddInt64Noise, but this takes the privacy_budget as an argument.

Is there a way around this? If not, I'd be happy to port the Go function, if it is wanted for the C++ library. In that case, should it be integrated into the existing mechanism builder or added as a separate one?

I think it might be useful as a building block for iterated training procedures, but the above would be quite specific. It could be extended to double types and other distributions later though.

In any case, thanks for this amazing library! I learned a lot already from reading the code.

UPDATE: I started work on the wrapper. Once it's presentable, we can discuss whether generalizing it would make sense :)

dibakch commented 3 years ago

How about using a GaussianMechanism? You can set L2 sensitivity, epsilon, and delta in the GaussianMechanism::Builder to construct the mechanism. Once constructed, you can use AddNoise to noise your int64s (or any other type, but tests only cover int64 and double).
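To make the "sensitivity, epsilon, and delta in, noise out" idea concrete, here is a minimal standalone sketch of how a noise standard deviation can be derived from those three parameters. It uses the textbook bound sigma = L2 * sqrt(2 ln(1.25/delta)) / epsilon, which only holds for 0 < epsilon < 1. It does not call the library, and the library may calibrate its noise differently (and more tightly). The function name `ClassicGaussianStddev` is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of the differential-privacy library.
// Returns the standard deviation prescribed by the classic Gaussian-mechanism
// bound: sigma = l2_sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon.
// The bound is only valid for 0 < epsilon < 1 and 0 < delta < 1.
double ClassicGaussianStddev(double l2_sensitivity, double epsilon,
                             double delta) {
  assert(l2_sensitivity > 0.0);
  assert(epsilon > 0.0 && epsilon < 1.0);
  assert(delta > 0.0 && delta < 1.0);
  return l2_sensitivity * std::sqrt(2.0 * std::log(1.25 / delta)) / epsilon;
}
```

For example, `ClassicGaussianStddev(1.0, 0.5, 1e-5)` gives a standard deviation of roughly 9.69. Halving epsilon or doubling the sensitivity doubles the noise.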

degregat commented 3 years ago

I wanted to do that initially, but with DPSGD only the delta, the sensitivity (which is equivalent to the gradient norm bound), and the scale are known ahead of time. The total privacy budget spent over all iterations is then calculated by the moments accountant. So I guess that, in the end, all I want is to sample from the Gaussian in a safe way, following the granularity calculations, safe casts, and rounding from GaussianMechanism?
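The "safe casts and rounding" part of that request can be illustrated with a small standalone sketch: draw Gaussian noise for a given stddev, round it, and add it to an int64 without overflow or an undefined out-of-range conversion. All names here are hypothetical. The sketch assumes `std::normal_distribution` as the noise source, which is NOT a secure sampler and does not reproduce the library's granularity calculation, so it only shows the shape of the interface and is not a substitute for GaussianMechanism.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <random>

// Rounds to the nearest integer and saturates at the int64_t limits instead
// of performing an out-of-range conversion. NaN maps to 0.
int64_t SaturatingRoundToInt64(double value) {
  if (std::isnan(value)) return 0;
  // 2^63 is exactly representable as a double; INT64_MAX is not.
  constexpr double kTwoTo63 = 9223372036854775808.0;
  if (value >= kTwoTo63) return std::numeric_limits<int64_t>::max();
  if (value <= -kTwoTo63) return std::numeric_limits<int64_t>::min();
  return static_cast<int64_t>(std::llround(value));
}

// Adds two int64_t values, clamping to the type's limits on overflow.
int64_t SaturatingAdd(int64_t a, int64_t b) {
  constexpr int64_t kMax = std::numeric_limits<int64_t>::max();
  constexpr int64_t kMin = std::numeric_limits<int64_t>::min();
  if (b > 0 && a > kMax - b) return kMax;
  if (b < 0 && a < kMin - b) return kMin;
  return a + b;
}

// Hypothetical helper: adds rounded N(0, stddev^2) noise to `value`.
// ASSUMPTION: std::normal_distribution stands in for a secure sampler;
// it is not safe for production differential privacy.
template <typename Rng>
int64_t AddGaussianNoiseSketch(int64_t value, double stddev, Rng& rng) {
  assert(std::isfinite(stddev) && stddev > 0.0);
  std::normal_distribution<double> noise(0.0, stddev);
  return SaturatingAdd(value, SaturatingRoundToInt64(noise(rng)));
}
```

Rounding the noise first and then adding it in integer arithmetic avoids converting the input to double, which would lose precision for values above 2^53.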

dibakch commented 3 years ago

Yes. We currently do not provide interfaces for PLD (privacy loss distributions) in the DP library. This might need some more API changes.

dasmdasm commented 3 years ago

(Correct me if any of this is wrong)

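It sounds like you want two separate things: a way to sample Gaussian noise safely for a given standard deviation, and a way to account for the privacy loss accumulated over all iterations.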

For the first one, it sounds like you want something very similar to GaussianDistribution.Sample(). The method you want should probably be written by calling it. Note that it's currently package private, for reasons that I'll discuss more below.

For the second one, our accounting library covers that use case. It can probably substitute in fairly easily for your proposed moments accountant.
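For readers unfamiliar with what an accountant does, here is a deliberately naive standalone sketch. It applies basic sequential composition, where the epsilons and deltas of the individual steps simply add up. This is neither the library's accounting library nor the moments accountant, both of which give much tighter totals; the type and method names are hypothetical.

```cpp
#include <cassert>

// Hypothetical types, not part of the differential-privacy library.
struct PrivacyParams {
  double epsilon;
  double delta;
};

// Tracks total privacy loss under basic sequential composition:
// k mechanisms with (eps_i, delta_i) compose to (sum eps_i, sum delta_i).
// This is a loose upper bound compared to moments or PLD accounting.
class BasicCompositionAccountant {
 public:
  // Records the privacy cost of one noised release (e.g. one DPSGD step).
  void Record(const PrivacyParams& step) {
    assert(step.epsilon >= 0.0);
    assert(step.delta >= 0.0 && step.delta < 1.0);
    total_.epsilon += step.epsilon;
    total_.delta += step.delta;
    ++num_steps_;
  }

  PrivacyParams Total() const { return total_; }
  int num_steps() const { return num_steps_; }

 private:
  PrivacyParams total_{0.0, 0.0};
  int num_steps_ = 0;
};
```

Under this bound, 100 steps at (0.01, 1e-7) each cost a total of about (1.0, 1e-5). The linear growth in epsilon is what tighter accountants improve on for iterated training.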

I'd encourage you to think about whether this is really the interface you want to implement, though. Right now we don't have a way to add noise specified by a standard deviation and figure out the privacy guarantees afterward. That's because we think that's a bad interface. People who use a tool probably want to reason about privacy up front, rather than after the fact. Would your users be better served by an interface where they specify the privacy guarantees up front, and then you derive the noise standard deviation from that?

degregat commented 3 years ago

Thank you for this elaboration, what you have written is spot on! I will see how to best incorporate these pointers.

I also agree that this is not the optimal interface in the long run, but at the moment the resulting codebase will be mostly for research purposes anyway. Figuring out sensible ways of estimating the privacy guarantees one would want from such a system is one of the open problems. An interface like the one you describe (and this library provides) would be one of the practical consequences of a result of that kind.

If you want to know more about the context, I have a design draft here. I do not at all expect you to read it, as your answers were quite helpful already, but if you have any feedback, it would be welcome.

Thanks again!

degregat commented 3 years ago

And thanks to @dibakch for committing the C++ accountant :)