microsoft / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2
Other
1.9k stars 345 forks

Fix ParallelMLP and enable accelerator test #403

Closed xinyu-intel closed 5 months ago

xinyu-intel commented 5 months ago

cc @polisettyvarma