After `ModuleValidator.fix()` replaces BatchNorm with GroupNorm, the gradients of the affected parameters are `None`, and their `grad_sample` attributes are `None` as well: gradients do not flow to the replaced GroupNorm weights during the backward pass.
```python
for name, layer in model.named_parameters():
    if layer.grad is None:
        print(name)               # the layer changed by ModuleValidator.fix()
        print(layer.grad_sample)  # None
```
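For contrast, here is a minimal, Opacus-free sketch of the expected behavior. The `replace_bn_with_gn` helper is a hypothetical stand-in for what `ModuleValidator.fix()` does (swapping each BatchNorm for a GroupNorm over the same channels); after a well-formed swap, the backward pass does populate the GroupNorm weight's gradient:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for ModuleValidator.fix(): recursively swap each
# BatchNorm2d for a GroupNorm over the same number of channels.
def replace_bn_with_gn(module: nn.Module, groups: int = 8) -> None:
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            num_groups = min(groups, child.num_features)
            setattr(module, name, nn.GroupNorm(num_groups, child.num_features))
        else:
            replace_bn_with_gn(child, groups)

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU())
replace_bn_with_gn(model)

out = model(torch.randn(2, 3, 8, 8)).sum()
out.backward()

# After a correct replacement, the GroupNorm weight receives a gradient.
print(model[1].weight.grad is None)  # prints False
```

If the gradient is `None` even in a plain loop like this, the problem is in the replacement itself rather than in Opacus' per-sample gradient hooks.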
```shell
wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py
```
- PyTorch Version (e.g., 1.0):
- OS (e.g., Linux):
- How you installed PyTorch (conda, pip, source):
- Build command you used (if compiling from source):
Thanks for raising the issue. I need more information about the other classes (Encoder, Decoder, ...), or at least the module structure around the failing group_norm.
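A quick way to gather that module information (a sketch, assuming a standard `nn.Module`; the toy model below is a hypothetical stand-in for the actual Encoder/Decoder) is to print each normalization layer together with its qualified name:

```python
import torch.nn as nn

def dump_norm_layers(model: nn.Module) -> None:
    # Print the qualified name and repr of every normalization layer,
    # so the GroupNorm involved in the error can be located.
    norm_types = (nn.GroupNorm, nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)
    for name, module in model.named_modules():
        if isinstance(module, norm_types):
            print(name, "->", module)

# Toy model (hypothetical stand-in for the reporter's Encoder/Decoder):
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.GroupNorm(4, 8))
dump_norm_layers(model)
```

Pasting that output for the submodules around the error would make the report much easier to diagnose.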
🐛 Bug
Here is the model.
Please reproduce the issue in our [template Colab]() and post the link here.
To Reproduce
Expected behavior
Environment
Please copy and paste the output from our environment collection script (or fill out the checklist below manually).
You can get the script and run it with:
Additional context