HuangLK / transpeeder

Train LLaMA on a single A100 80G node using 🤗 transformers and 🚀 DeepSpeed pipeline parallelism
Apache License 2.0
208 stars, 18 forks

fix dp issue and update convert script #44

Closed HuangLK closed 11 months ago

HuangLK commented 11 months ago
  1. fix the dp (data parallel) issue (dp was not effective when greater than 1)
  2. add a dialog data conversion script
  3. remove inv_freq from buffers
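
The PR description does not show how item 3 is implemented, so the following is only a minimal sketch of the general idea, under stated assumptions. The rotary `inv_freq` values are a fixed function of the head dimension and the base, so they can be recomputed at load time and need not be stored in a checkpoint. The helper names (`compute_inv_freq`, `strip_inv_freq`) and the checkpoint keys are hypothetical and are not taken from this repository.

```python
def compute_inv_freq(head_dim, base=10000.0):
    """Rotary inverse frequencies: base ** (-2i / head_dim) for i in [0, head_dim / 2)."""
    return [base ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def strip_inv_freq(state_dict):
    """Return a copy of state_dict without any entry whose key ends in 'inv_freq'."""
    return {k: v for k, v in state_dict.items() if not k.endswith("inv_freq")}


# Hypothetical checkpoint contents, for illustration only.
checkpoint = {
    "layers.0.self_attn.q_proj.weight": [[0.0]],
    "layers.0.self_attn.rotary_emb.inv_freq": compute_inv_freq(8),
}

# The cleaned checkpoint keeps the learned weights and drops the derived values.
cleaned = strip_inv_freq(checkpoint)
print(sorted(cleaned))
```

In PyTorch itself, a similar effect is available by registering the tensor with `register_buffer(name, tensor, persistent=False)`, which keeps it out of `state_dict()`. Whether this PR takes that route is not stated here.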