allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org
Apache License 2.0

`ExponentialMovingAverage` processes parameters whose `requires_grad` is `False` #5626

Closed Zessay closed 2 years ago

Zessay commented 2 years ago


Description

I defined a parameter `field_p` with `requires_grad=False` in my model, and I use `moving_average` in the trainer. The parameter `field_p` shouldn't need a moving average during training, but the `apply` method of `ExponentialMovingAverage` doesn't check the `requires_grad` property and applies the update to all parameters.
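For context, a minimal sketch of such a setup (the model and field names here are illustrative, not the actual code): `nn.Parameter` objects created with `requires_grad=False` are still yielded by `named_parameters()`, which is where the moving average collects its parameters.

```python
import torch

class MyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Frozen, integer-typed parameter: requires_grad=False, dtype=torch.long.
        # (A Parameter may be non-floating-point only when requires_grad=False.)
        self.field_p = torch.nn.Parameter(
            torch.zeros(10, dtype=torch.long), requires_grad=False
        )

# named_parameters() still includes field_p, so ExponentialMovingAverage
# creates a shadow for it and tries to update it on every apply() call.
print([name for name, _ in MyModel().named_parameters()])  # ['field_p']
```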

The relevant part of `apply`:

```python
def apply(self, num_updates: Optional[int] = None) -> None:
    ...
    if num_updates is not None:
        decay = min(
            self._decay, (self._numerator + num_updates) / (self._denominator + num_updates)
        )
    else:
        decay = self._decay

    for name, parameter in self._parameters:
        self._shadows[name].mul_(decay).add_((1 - decay) * parameter.data)
```
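A possible fix (a sketch against the loop above, not an actual patch from the repo) is to skip frozen parameters in the update:

```python
for name, parameter in self._parameters:
    # Frozen parameters never change during training, so there is nothing
    # to average; skipping them also avoids the Float/Long cast error below.
    if not parameter.requires_grad:
        continue
    self._shadows[name].mul_(decay).add_((1 - decay) * parameter.data)
```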

If the dtype of `field_p` is `torch.long`, this raises a `RuntimeError`:

```
result type Float can't be cast to the desired output type Long
  File "/xxx/tests/train_local.py", line 139, in apply
    self._shadows[name].mul_(decay).add_((1 - decay) * parameter.data)
  File "/xxx/ctr/trainer.py", line 216, in _train_epoch
    self._moving_average.apply(self._total_batches_completed + 1)
  File "/xxx/train_local.py", line 249, in train_pipeline
    trainer.train()
  File "/xxx/tests/train_local.py", line 254, in <module>
    train_pipeline()
```
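The error is easy to reproduce outside AllenNLP: an in-place op on a `torch.long` tensor must produce a `long` result, but multiplying by a float decay yields a `float`, which PyTorch refuses to cast back. A minimal repro:

```python
import torch

shadow = torch.zeros(3, dtype=torch.long)  # stands in for the long-dtype shadow tensor
decay = 0.999
# RuntimeError: result type Float can't be cast to the desired output type Long
shadow.mul_(decay).add_((1 - decay) * torch.ones(3, dtype=torch.long))
```

As a model-side workaround, registering such a tensor with `register_buffer` instead of wrapping it in `nn.Parameter` keeps it out of `named_parameters()`, so the moving average never touches it.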


Environment

OS: Linux

Python version: 3.7.13

Output of pip freeze:

```
```

Steps to reproduce

Example source:

```
```

github-actions[bot] commented 2 years ago

This issue is being closed due to lack of activity. If you think it still needs to be addressed, please comment on this thread 👇

Zessay commented 2 years ago

As above