This PR implements loss masking by introducing a collate function wrapper, and extends the apply_chat_template functionality.
General Changes
The existing collate function is now wrapped by a loss masking collate function.
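As a rough illustration of the wrapping idea, the sketch below masks every target token that lies outside a marked span. It is not the code from this PR: the names `with_loss_masking` and `simple_collate`, the begin/end marker ids, the `labels` key, and the use of plain Python lists instead of tensors are all assumptions made for the example. The ignore value -100 is chosen because it is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
from typing import Callable, Dict, List

Batch = Dict[str, List[List[int]]]
IGNORE_INDEX = -100  # assumed ignore value (PyTorch CrossEntropyLoss default)


def with_loss_masking(
    collate_fn: Callable[[List[List[int]]], Batch],
    begin_id: int,
    end_id: int,
    ignore_index: int = IGNORE_INDEX,
) -> Callable[[List[List[int]]], Batch]:
    """Wrap a collate function so that only targets between the
    (hypothetical) begin and end marker tokens contribute to the loss."""

    def wrapped(samples: List[List[int]]) -> Batch:
        batch = collate_fn(samples)
        masked_labels = []
        for targets in batch["labels"]:
            inside = False
            row = []
            for token in targets:
                if token == begin_id:
                    # The marker itself is never a training target.
                    inside = True
                    row.append(ignore_index)
                elif token == end_id:
                    inside = False
                    row.append(ignore_index)
                else:
                    row.append(token if inside else ignore_index)
            masked_labels.append(row)
        # Leave every other field of the batch untouched.
        return {**batch, "labels": masked_labels}

    return wrapped


def simple_collate(samples: List[List[int]]) -> Batch:
    """Toy base collator: shift each sample by one token to form targets."""
    return {
        "input_ids": [s[:-1] for s in samples],
        "labels": [s[1:] for s in samples],
    }


collate = with_loss_masking(simple_collate, begin_id=1, end_id=2)
batch = collate([[5, 1, 7, 8, 2, 9]])
```

Here only the tokens 7 and 8, which sit between the two markers, remain as targets. All other target positions are replaced by the ignore value.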
Breaking Changes
For apply_chat_template, we changed how the mask token information is propagated. Instead of writing it as the first line of the data file, we now hash the apply_chat_template config and append the hash as a suffix to the data file names.
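The hash-suffix scheme could look roughly like the sketch below. This is an assumption-laden illustration, not the PR's implementation: the helper names `config_hash` and `with_hash_suffix`, the choice of SHA-256, the 7-character truncation, the underscore separator, and the example config keys are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def config_hash(config: dict, length: int = 7) -> str:
    """Return a short, stable hash of a config dictionary.

    Keys are sorted before hashing so that two configs with the same
    content but different key order produce the same hash.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


def with_hash_suffix(path: Path, config: dict) -> Path:
    """Append the config hash to the file name, keeping the extension."""
    return path.with_name(f"{path.stem}_{config_hash(config)}{path.suffix}")


# Hypothetical config content, for illustration only.
example_config = {"b_mask_token": "^", "e_mask_token": "$"}
out_path = with_hash_suffix(Path("data/train.jsonl"), example_config)
```

With a scheme like this, a consumer can tell which config produced a data file from its name alone, and files generated with different configs do not overwrite each other.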
Checklist before submitting final PR
[X] My PR is minimal and addresses one issue in isolation
[X] I have merged the latest version of the target branch into this feature branch
[X] I have reviewed my own code w.r.t. correct implementation, missing type hints, proper documentation, etc.
[X] I have run a sample config for model fine-tuning
[X] I have checked that all tests run through (python tests/tests.py)