arcee-ai / DAM


Distributed Training with PreMerged model and Deepspeed #1

Closed shamanez closed 1 month ago

shamanez commented 2 months ago
  1. We load the pre-merged DAM model directly.
  2. Modified the Trainer loss to be compatible with the new pre-merged model.
  3. Modified dam.py to fix the device errors.