Closed: filopedraz closed this issue 5 months ago
Did you use DeepSpeed to train the model?
The backend is simply FSDP (Fully Sharded Data Parallel), not DeepSpeed; see https://github.com/LLM360/amber-train/blob/main/main.py#L129