
arcee-ai / DAM


Distributed Training with Pre-Merged Model and DeepSpeed #1

Closed: shamanez closed this 1 month ago

shamanez commented 2 months ago

We load the pre-merged DAM model directly.
Modified the Trainer loss to be compatible with the pre-merged model.
Modified dam.py to resolve the device errors.
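
The description above does not include the DeepSpeed configuration used for the distributed run. As a rough illustration only, a minimal config could look like the sketch below. The ZeRO stage, the `"auto"` values (which the Hugging Face Trainer fills in from its own arguments), the helper `write_ds_config`, and the file name `ds_config.json` are all assumptions, not taken from this PR.

```python
import json

# Hypothetical minimal DeepSpeed config; the PR does not show the one it used.
# "auto" values are resolved by the Hugging Face Trainer from TrainingArguments.
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "bf16": {"enabled": "auto"},
    # ZeRO stage 2 (shards optimizer state and gradients) is an assumption.
    "zero_optimization": {"stage": 2},
}


def write_ds_config(path="ds_config.json"):
    """Write the config dict to `path` as JSON and return the path."""
    with open(path, "w") as f:
        json.dump(ds_config, f, indent=2)
    return path
```

A file written this way could then be passed to the Trainer through `TrainingArguments(deepspeed="ds_config.json")`. Whether stage 2 or stage 3 fits better depends on model size and available GPU memory, which this PR does not state.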