foundation-model-stack / fms-fsdp

🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash attention v2.
https://pytorch.org/docs/stable/fsdp.html
Apache License 2.0

More comprehensive dummy token handling #84

Closed daviswer closed 4 months ago

daviswer commented 4 months ago
  1. Expose the full set of available dummy tokens in the config (bos, eos, bol, eol).
  2. Allow the user to specify which dummy tokens to drop from the start/end of each doc via cfg.strip_tokens. bos, eos, bol, and eol are automatically added to this set.
  3. Handle all token stripping in the base reader layer, before user-specified dummy tokens are added.
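
The three points above can be illustrated with a minimal sketch. This is not the code from this PR: `DummyTokenConfig`, `full_strip_set`, `strip_doc`, and `read_doc` are hypothetical names, the token ids are made up, and the sketch assumes that stripping removes every leading/trailing token in the strip set (not just one) and that only bos/eos are re-added afterwards.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class DummyTokenConfig:
    # Hypothetical config object; field names follow the PR description only loosely.
    bos_token: Optional[int] = None
    eos_token: Optional[int] = None
    bol_token: Optional[int] = None
    eol_token: Optional[int] = None
    strip_tokens: Set[int] = field(default_factory=set)

    def full_strip_set(self) -> Set[int]:
        # Any configured dummy token is automatically added to the strip set.
        dummies = {self.bos_token, self.eos_token, self.bol_token, self.eol_token}
        return set(self.strip_tokens) | {t for t in dummies if t is not None}


def strip_doc(doc: List[int], drop: Set[int]) -> List[int]:
    # Drop tokens from both ends of the document; interior tokens are untouched.
    start, end = 0, len(doc)
    while start < end and doc[start] in drop:
        start += 1
    while end > start and doc[end - 1] in drop:
        end -= 1
    return doc[start:end]


def read_doc(doc: List[int], cfg: DummyTokenConfig) -> List[int]:
    # Strip first (the "base reader" step), then add the configured dummy tokens.
    body = strip_doc(doc, cfg.full_strip_set())
    if cfg.bos_token is not None:
        body = [cfg.bos_token] + body
    if cfg.eos_token is not None:
        body = body + [cfg.eos_token]
    return body


cfg = DummyTokenConfig(bos_token=1, eos_token=2, strip_tokens={0})
cleaned = read_doc([0, 1, 5, 6, 2, 0], cfg)
```

Stripping before adding means a document that already carries bos/eos (or user-listed padding) does not end up with duplicated dummy tokens after the reader re-adds them.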