Add logic to distinguish between decoder-only and encoder-decoder models when passing inputs
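For illustration, the distinction could look roughly like the sketch below. It is not the code from this PR: `build_model_inputs`, `IGNORE_INDEX` and the two stand-in configs are hypothetical names. The only library detail relied on is the `is_encoder_decoder` attribute that HuggingFace model configs expose. Plain lists stand in for tensors to keep the example self-contained.

```python
from types import SimpleNamespace
from typing import Dict, List

# Label value conventionally ignored by the loss (assumed here: -100).
IGNORE_INDEX = -100


def build_model_inputs(
    config, prompt_ids: List[int], answer_ids: List[int]
) -> Dict[str, List[int]]:
    """Hypothetical helper: arrange token ids depending on the architecture."""
    if getattr(config, "is_encoder_decoder", False):
        # Encoder-decoder (e.g. LongT5): the encoder only sees the prompt,
        # and the answer is passed separately as labels for the decoder.
        return {
            "input_ids": list(prompt_ids),
            "attention_mask": [1] * len(prompt_ids),
            "labels": list(answer_ids),
        }
    # Decoder-only: prompt and answer form one sequence; prompt positions
    # are masked out of the loss so only the answer tokens are scored.
    input_ids = list(prompt_ids) + list(answer_ids)
    return {
        "input_ids": input_ids,
        "attention_mask": [1] * len(input_ids),
        "labels": [IGNORE_INDEX] * len(prompt_ids) + list(answer_ids),
    }


# Stand-ins for real model configs (assumption: only this attribute matters).
seq2seq_cfg = SimpleNamespace(is_encoder_decoder=True)
causal_cfg = SimpleNamespace(is_encoder_decoder=False)

seq2seq_inputs = build_model_inputs(seq2seq_cfg, [1, 2, 3], [4, 5])
causal_inputs = build_model_inputs(causal_cfg, [1, 2, 3], [4, 5])
```

With a real model, the same check would be made against `model.config.is_encoder_decoder`, and the resulting dictionary converted to tensors before being passed to the model.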
Breaking Changes
None intended
Checklist before submitting final PR
[x] My PR is minimal and addresses one issue in isolation
[x] I have merged the latest version of the target branch into this feature branch
[x] I have reviewed my own code w.r.t. correct implementation, missing type hints, proper documentation, etc.
[x] I have run a sample config for model training
[ ] I have checked that all tests run through (python tests/tests.py). Note: some tests appear to have been failing in upstream already, and multi-GPU was not tested.
[ ] I have updated the internal changelog (CHANGELOG_DEV.md)
What does this PR do?
This PR adds support for pretrained LongT5 encoder-decoder models from HuggingFace.
General Changes