google / differential-privacy

Google's differential privacy libraries.
Apache License 2.0

Implementing a custom noisy threshold #212

Closed: ssabdb closed this issue 1 year ago

ssabdb commented 1 year ago

I am trying to use the Confident-GNMax mechanism from "Scalable Private Learning with PATE".

This enhanced variant of PATE computes an RDP bound for a noisy threshold; importantly, the bound depends on the data going in.

The only parameter available to me is the noise_multiplier of a GaussianDpEvent, which I think is targeted at the GaussianSampledMechanism.

What would be the expected way of implementing this? I suspect it would mean implementing another event type, say a NoisyThresholdGaussianDpEvent inheriting from DpEvent, but the changes required to the already complex _maybe_compose in the RdpAccountant seem quite intrusive.

At the moment I'm just extending the RdpAccountant and implementing a manual_compose function that lets me compose RDP values directly for a given range of orders, but I don't think this is the intended usage either.
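
For concreteness, here is a rough sketch of my current workaround. It leans on the accountant's private _orders/_rdp internals (present in the current implementation, but not a public API), and manual_compose is just my own name for the method:

```python
import numpy as np
from dp_accounting.rdp import rdp_privacy_accountant


class ManualRdpAccountant(rdp_privacy_accountant.RdpAccountant):
    """RdpAccountant extended to accept externally computed RDP values."""

    def manual_compose(self, rdp_values, count=1):
        # rdp_values: the RDP bound evaluated at each order in
        # self._orders. This pokes at the accountant's private state,
        # which is why it feels unintended.
        self._rdp += count * np.asarray(rdp_values, dtype=float)


accountant = ManualRdpAccountant()
# Stand-in curve (plain Gaussian mechanism with sigma = 40); the real
# values would come from the paper's data-dependent analysis.
stand_in = np.asarray(accountant._orders, dtype=float) / (2.0 * 40.0**2)
accountant.manual_compose(stand_in)
print(accountant.get_epsilon(1e-5))
```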

pritkamath commented 1 year ago

Thanks for your question. While it might be possible to define a DpEvent for this mechanism and have the RdpAccountant support it, given the specialized nature of the mechanism, we don't have it on our roadmap to support such a DpEvent.

There are two approaches that we can see that you could use with the library as it exists today:

  1. (Simple, but loose) You could simply compose the two steps in the Confident-GNMax Aggregator, each of which corresponds to a GaussianDpEvent. This approach does not exploit the data-dependent accounting given in the paper. (A sketch follows this list.)

  2. (Complex, but tighter) Write a standalone method to compute the RDP bound for any given order, as given in the paper, and use the compute_epsilon method in the library to translate the RDP bound to $(\varepsilon, \delta)$, bypassing the need to create any DpEvent. (See the second sketch below.)
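
If it helps, here is a minimal sketch of approach 1 using the Python dp_accounting package. The noise multipliers (sigma divided by the query's l2 sensitivity) and query count are placeholders:

```python
import dp_accounting
from dp_accounting import rdp

# Treat each answered query as two Gaussian mechanisms: one for the
# noisy threshold check and one for the noisy argmax.
per_query = dp_accounting.ComposedDpEvent([
    dp_accounting.GaussianDpEvent(noise_multiplier=150.0),  # threshold
    dp_accounting.GaussianDpEvent(noise_multiplier=40.0),   # argmax
])

accountant = rdp.RdpAccountant()
accountant.compose(per_query, count=1000)  # e.g. 1000 answered queries
print(accountant.get_epsilon(1e-5))
```

And a sketch of approach 2. This assumes the module-level compute_epsilon(orders, rdp, delta) helper in dp_accounting.rdp.rdp_privacy_accountant, which returns the epsilon together with the optimal order; the stand-in RDP curve below (a plain Gaussian mechanism) must be replaced by the paper's data-dependent bound:

```python
import numpy as np
from dp_accounting.rdp import rdp_privacy_accountant

orders = np.arange(2, 256, dtype=float)


def confident_gnmax_rdp(order):
    # Stand-in: RDP of a Gaussian mechanism with sigma = 40,
    # i.e. alpha / (2 * sigma^2). Replace with the data-dependent
    # Confident-GNMax bound from the paper.
    return order / (2.0 * 40.0**2)


rdp_curve = np.array([confident_gnmax_rdp(a) for a in orders])

# Translate the RDP curve into an (epsilon, delta) guarantee.
eps, opt_order = rdp_privacy_accountant.compute_epsilon(
    orders, rdp_curve, delta=1e-5)
print(eps, opt_order)
```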

We hope this helps. Please let us know if you would like any further clarification.

miracvbasaran commented 1 year ago

Closing this issue; please feel free to reopen if you have any questions or concerns!

ssabdb commented 1 year ago

Thanks for the response, and apologies for the slow reply over the holiday season.

I've decided to implement option 2 and it works well enough, thank you. I assume there's no way of "injecting" an already computed privacy loss as a DpEvent with known guarantees, so that I can continue to use other features of this library?

I understand the decision not to support a specialized DpEvent for mechanisms like this one, but the ability to inject a known, precomputed privacy loss feels like it could be useful in other, similar situations.

Otherwise, thank you for the pointers.