Darius-H opened this issue 2 years ago
This is a pretty serious bug. For example, https://github.com/cleverhans-lab/cleverhans/blob/master/tutorials/torch/cifar10_tutorial.py does not work with norm == 2 and adversarial training turned on. The loss.backward() call dies with something like:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [128, 3, 32, 32]], which is output 0 of MulBackward0, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
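For reference, the culprit seems to be the in-place scaling of eta on the norm == 2 path of clip_eta. Below is a minimal sketch of an autograd-safe variant, assuming the current code multiplies eta in place by the clipping factor; this is an illustration rather than the exact library code, and the name clip_eta_no_inplace is hypothetical:

```python
import numpy as np
import torch


def clip_eta_no_inplace(eta, norm, eps):
    """Sketch: clip the perturbation eta to an eps-ball without in-place ops.

    Illustrative only, not the cleverhans implementation. The point of the
    fix is the last line: `eta * factor` instead of `eta *= factor`, so the
    tensor autograd saved for backward is never mutated.
    """
    if norm not in [np.inf, 1, 2]:
        raise ValueError("norm must be np.inf, 1, or 2.")

    if norm == np.inf:
        return torch.clamp(eta, -eps, eps)
    if norm == 1:
        raise NotImplementedError("L1 clip is not implemented.")

    # norm == 2: compute the per-example L2 norm over all non-batch dims.
    reduc_ind = list(range(1, eta.dim()))
    avoid_zero_div = torch.tensor(1e-12, dtype=eta.dtype, device=eta.device)
    eta_norm = torch.sqrt(
        torch.max(avoid_zero_div, torch.sum(eta ** 2, dim=reduc_ind, keepdim=True))
    )
    # Scale down only the examples whose norm exceeds eps.
    factor = torch.clamp(eps / eta_norm, max=1.0)
    return eta * factor  # out-of-place, keeps the autograd graph intact


# Quick check that backward works through the clipped perturbation:
delta = torch.randn(8, 3, 32, 32, requires_grad=True)
clip_eta_no_inplace(delta, 2, eps=0.5).sum().backward()
assert delta.grad is not None
```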
I'd fix it myself, except, you know, my six-month-old PR for another bug hasn't been looked at :unamused:.
cleverhans.torch.utils.clip_eta causes GPU memory leaks because of its in-place operation when norm == 2