[Bugfix] Fix QDropout

ucbrise / actnn

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training

MIT License


Closed cenyk1230 closed 3 years ago

cenyk1230 commented 3 years ago

Hi, I found some bugs in my earlier implementation of QDropout.

In the backward pass, the gradient should also be divided by the factor 1 - p, matching the scaling applied in the forward pass.
In the validation step (self.training = False), we can call the forward of nn.Dropout directly, since dropout behaves differently in training and validation: in evaluation mode it is the identity.
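
To illustrate the two points above, here is a minimal, framework-free sketch of inverted dropout. It is not the ActNN code: the function names (`make_mask`, `dropout_forward`, `dropout_backward`, `numerical_grad`) are hypothetical, plain Python lists stand in for tensors, and activation quantization is left out entirely. It only shows that the backward pass needs the same 1 / (1 - p) scaling as the forward pass, and that both passes reduce to the identity when `training` is false.

```python
import random


def make_mask(n, p, rng):
    """Keep each element with probability 1 - p (1.0 = kept, 0.0 = dropped)."""
    return [1.0 if rng.random() >= p else 0.0 for _ in range(n)]


def dropout_forward(x, mask, p, training=True):
    # In evaluation mode dropout is the identity: no mask, no scaling.
    if not training:
        return list(x)
    scale = 1.0 / (1.0 - p)  # assumes 0 <= p < 1
    return [xi * mi * scale for xi, mi in zip(x, mask)]


def dropout_backward(grad_out, mask, p, training=True):
    # The gradient is masked AND divided by 1 - p, mirroring the forward pass.
    if not training:
        return list(grad_out)
    scale = 1.0 / (1.0 - p)
    return [gi * mi * scale for gi, mi in zip(grad_out, mask)]


def numerical_grad(x, mask, p, eps=1e-6):
    """Central finite differences of sum(dropout_forward(x)) w.r.t. each x[i]."""
    grads = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        f_hi = sum(dropout_forward(hi, mask, p))
        f_lo = sum(dropout_forward(lo, mask, p))
        grads.append((f_hi - f_lo) / (2 * eps))
    return grads


rng = random.Random(0)
p = 0.5
x = [0.3, -1.2, 2.0, 0.7, -0.5, 1.1]
mask = make_mask(len(x), p, rng)

# Gradient of sum(y) with respect to x, computed analytically and numerically.
analytic = dropout_backward([1.0] * len(x), mask, p)
numeric = numerical_grad(x, mask, p)
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
```

If the division by 1 - p were dropped from `dropout_backward`, the analytic gradient would disagree with the finite-difference estimate by a factor of 1 - p on every kept element, which is the kind of mismatch the first fix addresses.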

Please help review the changes.