Open anhnami opened 1 week ago
Good catch on the "strict" part; I'll make a patch to fix it.
Do you mind explaining a bit more what you mean by "it does not support several layers"? I believe the current implementation supports all the layers that were previously supported by the Opacus GradSampleModule.
It's BatchNorm and custom layers with buffers. I'm hoping the strict option will allow me to use them, since my use case is not privacy-related.
I see. I think you can unblock this by setting strict = False. I have never tested it myself, though, so you should run some quick tests to make sure the gradient norms are consistent (https://github.com/pytorch/opacus/blob/main/opacus/tests/gradient_accumulation_test.py).
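For anyone sanity-checking the norms: below is a minimal, dependency-free sketch of the per-sample clipping rule itself (each sample's gradient is rescaled so its L2 norm is at most a threshold C, then the clipped gradients are averaged). The function name and shapes are illustrative only, not Opacus APIs; Opacus computes the same quantity efficiently inside GradSampleModuleFastGradientClipping.

```python
import math

def clip_per_sample_grads(per_sample_grads, max_norm):
    """Illustrative per-sample gradient clipping.

    per_sample_grads: list of per-sample gradient vectors (lists of floats).
    max_norm: clipping threshold C.
    Returns (clipped per-sample gradients, their average).
    """
    clipped = []
    for g in per_sample_grads:
        norm = math.sqrt(sum(x * x for x in g))
        # Standard clipping factor: scale down only when the norm exceeds C.
        scale = min(1.0, max_norm / (norm + 1e-12))
        clipped.append([x * scale for x in g])
    n = len(clipped)
    avg = [sum(col) / n for col in zip(*clipped)]
    return clipped, avg
```

A quick check: with C = 1.0, a gradient [3.0, 4.0] (norm 5) is scaled by 0.2 to [0.6, 0.8], while [0.5, 0.0] (norm 0.5) is left untouched. Comparing these hand-computed norms against what the wrapped module reports is essentially what the linked Opacus test does.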
Per-sample gradient clipping has recently been reported to be useful for speech processing [1][2][3]. Implementing it from scratch is complicated, so I would like to use Opacus to do the job. However, since Opacus is privacy-focused, it does not support several layers. Furthermore, it seems we cannot turn off "strict" mode in GradSampleModuleFastGradientClipping. It would be nice to support this non-privacy use case.
https://github.com/pytorch/opacus/blob/9eed06a2fc785e94abc05e5eb7ef3ed0a5a5a909/opacus/grad_sample/grad_sample_module_fast_gradient_clipping.py#L113
[1] https://arxiv.org/pdf/2406.02004
[2] https://arxiv.org/pdf/2310.11739
[3] https://arxiv.org/pdf/2408.16204