EleutherAI / gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
https://www.eleuther.ai/
Apache License 2.0
6.96k stars 1.02k forks

SFT improvements (labeling fixes, different packing implementations) #1240

Closed: dmahan93 closed this 3 months ago

dmahan93 commented 5 months ago