investigate 4-gpu training of LLaMa

ArthurMinovsky commented 1 year ago

(https://twitter.com/realSharonZhou/status/1693744954143904102) (https://huggingface.co/learn/nlp-course/chapter5/4)

According to the tweet, training is possible even without using LoRA.

Check whether the training code contains anything that causes problems for multi-GPU training.
Try implementing what the tweet describes and see what changes once the config is changed.

boss-chanon commented 1 year ago

Should the tweet link be this one instead? (https://twitter.com/WenhuChen/status/1691846522462216372) The one you sent looks like a tweet about the Pile dataset.

ArthurMinovsky commented 1 year ago

With 2 nodes, batch size 1, seq-len 1024, and 8 GPUs, it is possible to train LLaMA 7B.
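
For anyone comparing the 4-GPU and 8-GPU cases, here is a rough back-of-the-envelope sketch of the memory arithmetic. It is only an estimate under assumptions that are not stated in this thread: mixed-precision Adam at roughly 16 bytes per parameter, model states sharded evenly across all GPUs (ZeRO-3 / FSDP style), a nominal 7e9 parameters, activation memory ignored, and "batch size 1" read as per-GPU. The function names are hypothetical and not part of the repo.

```python
# Rough estimate of per-GPU model-state memory for full (non-LoRA) training.
# Assumptions, not taken from the thread: mixed-precision Adam, states sharded
# evenly across GPUs, activations and framework overhead ignored.

# fp16 weights (2) + fp16 gradients (2) + fp32 master weights, momentum, variance (12)
BYTES_PER_PARAM = 2 + 2 + 12


def model_state_gib_per_gpu(num_params: float, num_gpus: int) -> float:
    """Model-state memory per GPU in GiB when states are fully sharded."""
    total_bytes = num_params * BYTES_PER_PARAM
    return total_bytes / num_gpus / 2**30


def tokens_per_step(per_gpu_batch: int, seq_len: int, num_gpus: int) -> int:
    """Tokens processed per optimizer step, without gradient accumulation."""
    return per_gpu_batch * seq_len * num_gpus


if __name__ == "__main__":
    num_params = 7e9  # nominal size of LLaMA 7B (assumed round number)
    for gpus in (4, 8):
        gib = model_state_gib_per_gpu(num_params, gpus)
        print(f"{gpus} GPUs: about {gib:.1f} GiB of model states per GPU")
    print("tokens per step:", tokens_per_step(1, 1024, 8))
```

Under these assumptions, halving the GPU count doubles the model-state memory each GPU has to hold, which is one thing to check when moving from 8 GPUs to 4. Real usage will be higher once activations and buffers are included.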

OpenThaiGPT / openthaigpt-pretraining

investigate 4-gpu training of LLaMa #295