allenai / OLMo

Modeling, training, eval, and inference code for OLMo
https://allenai.org/olmo
Apache License 2.0
4.37k stars, 431 forks

Add support for document masking during training #661

Closed: epwalsh closed this 1 month ago

epwalsh commented 1 month ago

Adds support for document masking during training via flash-attn. This is activated when the flag --data.generate_doc_lengths is set. The code changes were adapted from https://github.com/yuzhaouoe/pretraining-data-packing.
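
As a rough illustration of what document masking means for a packed training sequence, here is a minimal pure-Python sketch. It is not the code from this PR. The helper names (`doc_lengths`, `cu_seqlens`, `document_causal_mask`), the EOS id of 0, and the rule that each document ends at its EOS token are all assumptions made for the example. The idea is that document lengths are turned into cumulative offsets, the form that flash-attn's variable-length attention interface accepts, so that each token attends only to earlier tokens in its own document.

```python
from itertools import accumulate


def doc_lengths(token_ids, eos_id):
    """Return the length of each document in a packed sequence.

    Assumption for this sketch: a document ends at, and includes, an EOS token.
    A trailing run without an EOS is counted as a final partial document.
    """
    lengths = []
    start = 0
    for i, tok in enumerate(token_ids):
        if tok == eos_id:
            lengths.append(i + 1 - start)
            start = i + 1
    if start < len(token_ids):
        lengths.append(len(token_ids) - start)
    return lengths


def cu_seqlens(lengths):
    """Cumulative document boundaries, starting at 0."""
    return [0] + list(accumulate(lengths))


def document_causal_mask(lengths):
    """Dense boolean mask: position i may attend to position j only if
    both are in the same document and j is not after i."""
    n = sum(lengths)
    doc_id = [d for d, length in enumerate(lengths) for _ in range(length)]
    return [[doc_id[i] == doc_id[j] and j <= i for j in range(n)] for i in range(n)]


# Hypothetical packed sequence with EOS id 0: three documents.
packed = [5, 6, 0, 7, 8, 9, 0, 4]
lengths = doc_lengths(packed, eos_id=0)
boundaries = cu_seqlens(lengths)
mask = document_causal_mask(lengths)
```

The dense mask is shown only to make the attention pattern explicit. A variable-length attention kernel works from the cumulative boundaries directly and never materializes the full mask.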