feat: support adding attention mask when not present for pretokenized data

foundation-model-stack / fms-hf-tuning

🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.

Apache License 2.0

9 stars 30 forks

feat: support adding attention mask when not present for pretokenized data #192

Open kmehant opened 2 weeks ago

kmehant commented 2 weeks ago

For pretokenized datasets, we should support computing the attention mask on the fly when it is not part of the provided dataset. This can be opinionated and is up for discussion.
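
To make the proposal concrete, here is a minimal sketch of one possible behavior, not an existing fms-hf-tuning API. The function name `add_attention_mask` and its `pad_token_id` parameter are hypothetical. It assumes each pretokenized example is a dict with an `input_ids` list, and that an unpadded example should attend to every token. When a pad token id is supplied, positions equal to it are masked out, which is exactly the opinionated part: if the pad token is also the EOS token, genuine EOS positions would be masked too.

```python
from typing import Dict, List, Optional


def add_attention_mask(
    example: Dict[str, List[int]],
    pad_token_id: Optional[int] = None,
) -> Dict[str, List[int]]:
    """Add an `attention_mask` to a pretokenized example that lacks one.

    Hypothetical helper for illustration only.
    """
    # Respect a mask that the dataset already provides.
    if "attention_mask" in example:
        return example

    input_ids = example["input_ids"]
    if pad_token_id is None:
        # Assumption: no padding is present, so every token is attended to.
        mask = [1] * len(input_ids)
    else:
        # Assumption: any token equal to pad_token_id is padding.
        mask = [0 if tok == pad_token_id else 1 for tok in input_ids]

    # Return a new dict so the input example is not mutated.
    return {**example, "attention_mask": mask}
```

With a Hugging Face `datasets.Dataset`, a function like this could presumably be applied per example through `Dataset.map`, passing `pad_token_id` via `fn_kwargs`.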