pytorch / opacus

Training PyTorch models with differential privacy
https://opacus.ai
Apache License 2.0

ValueError: Per sample gradient is not initialized. Not updated in backward pass? Need solution #561

Closed Hafizamariaiqbal closed 1 year ago

Hafizamariaiqbal commented 1 year ago
Yes. Sometimes you need large batch sizes to make your model converge, but the GPU memory might be too small to fit all the per-sample gradients. This is why we distinguish the two: the physical batch size is what your GPU can fit, while the actual batch size is what your optimization needs. Typically, if you have a batch size of 512 and a physical batch size of 32, you will do forward/backward on physical batches of size 32, but optimizer.step() will do an actual step only once every 16 (= 512/32) forward/backward passes.

Originally posted by @alexandresablayrolles in https://github.com/pytorch/opacus/issues/502#issuecomment-1243618518
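The accumulation arithmetic described above can be sketched in plain Python. This is an illustrative sketch only, not the Opacus implementation (in practice Opacus wraps this logic in its BatchMemoryManager utility); the function and constant names here are made up for the example.

```python
# Illustrative sketch: a logical batch of 512 is processed as physical
# batches of 32, with a real optimizer step only once every
# 512 // 32 = 16 forward/backward passes. Names are hypothetical.

LOGICAL_BATCH = 512
PHYSICAL_BATCH = 32
ACCUMULATION_STEPS = LOGICAL_BATCH // PHYSICAL_BATCH  # 16

def count_optimizer_steps(num_samples: int) -> int:
    """Count how many actual optimizer steps occur for num_samples samples."""
    steps = 0
    num_physical_batches = num_samples // PHYSICAL_BATCH
    for i in range(num_physical_batches):
        # forward/backward on a physical batch of 32 would happen here;
        # per-sample gradients accumulate across physical batches
        if (i + 1) % ACCUMULATION_STEPS == 0:
            # optimizer.step() fires once per full logical batch of 512
            steps += 1
    return steps

print(count_optimizer_steps(1024))  # 1024 samples -> 32 physical batches -> 2 steps
```

So with these numbers, one pass over 1024 samples performs 32 forward/backward passes but only 2 optimizer steps, which is why per-sample gradients must be retained and accumulated between steps.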

alexandresablayrolles commented 1 year ago

Closing issue as duplicate of #560