microsoft / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2
Other
1.89k stars 344 forks

add support to run custom Hf tokenizer for training and dataset pre-processing #421

Closed by polisettyvarma 3 months ago