HuangLK/transpeeder
Train LLaMA on a single A100 80G node using 🤗 Transformers and 🚀 DeepSpeed pipeline parallelism
Apache License 2.0
208 stars, 18 forks
Refine dp #42 (Closed)
JY-Ren closed 11 months ago
JY-Ren commented 11 months ago
Fix the dp issue (dp had no effect when set greater than 1)
Add a dialog data conversion script
Remove inv_freq from buffers
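
For context on the first item, here is a minimal sketch of what a dp setting greater than 1 is normally expected to do. It assumes "dp" means the data-parallel degree and that the GPUs are arranged as pipeline stages times data-parallel replicas. The function names `data_parallel_degree` and `shard_for_rank` are hypothetical and are not taken from this repository or from DeepSpeed.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def data_parallel_degree(world_size: int, num_stages: int) -> int:
    """Return how many data-parallel replicas fit on `world_size` GPUs.

    Assumption: every replica is one full pipeline of `num_stages` GPUs,
    so world_size = num_stages * dp.
    """
    if num_stages <= 0 or world_size <= 0:
        raise ValueError("world_size and num_stages must be positive")
    if world_size % num_stages != 0:
        raise ValueError("world_size must be divisible by num_stages")
    return world_size // num_stages


def shard_for_rank(samples: Sequence[T], dp_rank: int, dp: int) -> List[T]:
    """Give each data-parallel replica a disjoint, strided slice of the data.

    If every replica read the same samples instead, raising dp would add
    no new data per step, which is one way dp can end up "not effective".
    """
    if not 0 <= dp_rank < dp:
        raise ValueError("dp_rank must be in [0, dp)")
    return list(samples[dp_rank::dp])


if __name__ == "__main__":
    dp = data_parallel_degree(world_size=8, num_stages=4)
    data = list(range(10))
    for rank in range(dp):
        print(rank, shard_for_rank(data, rank, dp))
```

With 8 GPUs and 4 pipeline stages the sketch yields two replicas, each reading a different half of the samples.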