Open dxqbYD opened 3 weeks ago
There seems to be an issue with the Adafactor optimizer when beta1 > 0. The implementation in question is here: https://github.com/facebookresearch/fairseq/blob/ecbf110e1eb43861214b05fa001eff584954f65a/fairseq/optim/adafactor.py#L66
Please find a detailed description here: https://github.com/huggingface/transformers/issues/34506