pytorch / opacus

Training PyTorch models with differential privacy
https://opacus.ai
Apache License 2.0

No documentation at all #557

Closed zedoul closed 1 year ago

zedoul commented 1 year ago

📚 Documentation

https://opacus.ai/api/privacy_engine.html

Hi,

I am considering comparing the performance of Opacus with that of TensorFlow Privacy.

Unfortunately, when it comes to documentation, Opacus has a lot that urgently needs improving.

For example, PrivacyEngine (https://opacus.ai/api/privacy_engine.html), which is supposed to be the most important class in Opacus, has literally no documentation at all.

TensorFlow Privacy, on the other hand, is very intuitive to use, with extensive, professionally written documentation.

Could you share with potential Opacus users 1) whether your team has any near-term plans to expand the documentation at least a little, and 2) what exactly the noise_multiplier and max_grad_norm parameters mean?

Thank you.

ffuuugor commented 1 year ago

Hi, thanks for reporting

I think we have an issue with the website rendering at the moment - in-code documentation is available and should be in a rather good state. For example, for PrivacyEngine: https://github.com/pytorch/opacus/blob/main/opacus/privacy_engine.py

We'll look into the website rendering.

zedoul commented 1 year ago

Thank you for the response. I have checked the comments but I still have some questions.

For example, "noise_multiplier - noise_generator: torch.Generator() object used as a source of randomness for the noise" does not really mean anything on its own. TensorFlow Privacy, by contrast, uses notions that map directly onto the DP formula, e.g., epsilon and delta, so that anyone can understand what they are doing without mistakes. Their documentation is also thoroughly written.

It would be much appreciated if someone working on Opacus could explain how "noise_multiplier" relates to the DP formula.

ffuuugor commented 1 year ago

First, please note that noise_multiplier and noise_generator are two different arguments: the former defines the scale of the noise, the latter defines the source of randomness (important for deterministic test runs or secure RNG).

From .make_private() docs:

noise_multiplier: The ratio of the standard deviation of the Gaussian noise to
                the L2-sensitivity of the function to which the noise is added
                (How much noise to add)

 noise_generator: torch.Generator() object used as a source of randomness for
                the noise

As to the DP formula, noise_multiplier is the direct equivalent of sigma in the original DP-SGD algorithm from the Abadi et al. paper:

[Screenshot: Algorithm 1 (DP-SGD) from Abadi et al., where sigma scales the Gaussian noise added to the clipped gradient sum]
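Concretely, in that algorithm each per-sample gradient is clipped to L2 norm at most max_grad_norm (C), and Gaussian noise with standard deviation noise_multiplier * max_grad_norm (sigma * C) is added to the clipped sum. A pure-Python sketch of that one step (the helper names here are hypothetical, not Opacus code):

```python
import math
import random

def clip_per_sample(grads, max_grad_norm):
    """Clip each per-sample gradient vector to L2 norm <= max_grad_norm."""
    clipped = []
    for g in grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, max_grad_norm / norm) if norm > 0 else 1.0
        clipped.append([x * scale for x in g])
    return clipped

def noisy_grad_sum(grads, noise_multiplier, max_grad_norm, rng):
    """Sum the clipped gradients and add N(0, (sigma * C)^2) noise per coordinate."""
    clipped = clip_per_sample(grads, max_grad_norm)
    std = noise_multiplier * max_grad_norm  # sigma * C, as in Abadi et al.
    dim = len(grads[0])
    total = [sum(g[i] for g in clipped) for i in range(dim)]
    return [t + rng.gauss(0.0, std) for t in total]

rng = random.Random(0)
grads = [[3.0, 4.0], [0.1, -0.2]]  # two per-sample gradients
out = noisy_grad_sum(grads, noise_multiplier=1.0, max_grad_norm=1.0, rng=rng)
```

With noise_multiplier=0 this degrades to plain clipped SGD; larger values buy a smaller epsilon at the cost of noisier updates.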

If you need a direct correspondence with epsilon and delta, you can use the get_noise_multiplier method provided in the utils package.

I will now close the issue, as the documentation bug reported here has been fixed.

If you have any further questions or need any help with DP training, the best place to ask them is the Opacus section on PyTorch forums: https://discuss.pytorch.org/c/opacus/29