## Checklist

- [x] I have checked the CHANGELOG and the commit log to find out if the bug was already fixed in the main branch.
- [x] I have included in the "Description" section below a traceback from any exceptions related to this bug.
- [x] I have included in the "Related issues or possible duplicates" section below all related issues and possible duplicate issues (If there are none, check this box anyway).
- [x] I have included in the "Environment" section below the name of the operating system and Python version that I was using when I discovered this bug.
- [x] I have included in the "Environment" section below the output of `pip freeze`.
- [x] I have included in the "Steps to reproduce" section below a minimally reproducible example.
## Description
I defined a parameter `field_p` with `requires_grad=False` in my model, and I use a `moving_average` in the trainer. Since `field_p` is never updated by the optimizer, it should not need a moving average during training, but the `apply` method of `ExponentialMovingAverage` does not check the `requires_grad` property and applies the update to all parameters:
```python
def apply(self, num_updates: Optional[int] = None) -> None:
    ...
    if num_updates is not None:
        decay = min(
            self._decay, (self._numerator + num_updates) / (self._denominator + num_updates)
        )
    else:
        decay = self._decay
    for name, parameter in self._parameters:
        self._shadows[name].mul_(decay).add_((1 - decay) * parameter.data)
```
If the dtype of `field_p` is `torch.long`, this raises a `RuntimeError`:

```
Traceback (most recent call last):
  File "/xxx/tests/train_local.py", line 254, in <module>
    train_pipeline()
  File "/xxx/train_local.py", line 249, in train_pipeline
    trainer.train()
  File "/xxx/ctr/trainer.py", line 216, in _train_epoch
    self._moving_average.apply(self._total_batches_completed + 1)
  File "/xxxtests/train_local.py", line 139, in apply
    self._shadows[name].mul_(decay).add_((1 - decay) * parameter.data)
RuntimeError: result type Float can't be cast to the desired output type Long
```
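The error itself is ordinary PyTorch type promotion rather than anything specific to the trainer: an in-place op on a `Long` tensor cannot store a `Float` result. A minimal demonstration outside AllenNLP:

```python
import torch

shadow = torch.zeros(3, dtype=torch.long)
# In-place multiplication by a Python float promotes the result to Float,
# which cannot be written back into the Long tensor in place:
shadow.mul_(0.9999)
# RuntimeError: result type Float can't be cast to the desired output type Long
```

A possible guard, sketched here as a suggestion rather than the library's actual code, would be to skip frozen parameters in the update loop, since a parameter with `requires_grad=False` never changes and its moving average is just itself:

```python
for name, parameter in self._parameters:
    if not parameter.requires_grad:
        # Frozen parameter: it never changes during training, so there is
        # nothing to average (and an integer dtype would crash the float
        # update below).
        continue
    self._shadows[name].mul_(decay).add_((1 - decay) * parameter.data)
```

Filtering `self._parameters` once at construction time would work as well.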
## Related issues or possible duplicates
## Environment
- OS: Linux
- Python version: 3.7.13
Output of `pip freeze`:

```
```
## Steps to reproduce
Example source:

```
```
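A minimal sketch that should reproduce the error, assuming AllenNLP's `ExponentialMovingAverage`; the `ToyModel` and its parameter names are hypothetical scaffolding, not code from my project:

```python
import torch
from allennlp.training.moving_average import ExponentialMovingAverage


class ToyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(4))
        # Frozen integer parameter: never trained, but still registered,
        # so named_parameters() yields it alongside the trainable weight.
        self.field_p = torch.nn.Parameter(
            torch.zeros(4, dtype=torch.long), requires_grad=False
        )


model = ToyModel()
ema = ExponentialMovingAverage(model.named_parameters())

# apply() updates the shadow of every parameter, including field_p, so the
# in-place float update crashes on the Long-typed shadow tensor:
ema.apply(num_updates=1)
# RuntimeError: result type Float can't be cast to the desired output type Long
```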