ReaLLMASIC / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.
MIT License

Fix multigpu training for train.py script #149

Closed. gkielian closed this 3 months ago

gkielian commented 4 months ago

This should make train.py compatible with torchrun.

gkielian commented 3 months ago

A simpler fix is in the latest PR: https://github.com/ReaLLMASIC/nanoGPT/pull/170

Fixing this line and passing the gradient_accumulation_steps flag with a value of 16 does the trick.
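
For readers wondering why the gradient_accumulation_steps value might matter under torchrun, here is a minimal sketch of one plausible reason. It assumes train.py follows the upstream nanoGPT convention of splitting the global accumulation count evenly across DDP ranks; the thread does not confirm this, and the helper name `per_rank_accumulation_steps` is hypothetical. The only outside fact relied on is that torchrun exports `WORLD_SIZE` to each worker process.

```python
import os


def per_rank_accumulation_steps(total_steps, env=None):
    """Split a global gradient-accumulation count evenly across ranks.

    Hypothetical helper for illustration; not taken from the PR.
    """
    env = os.environ if env is None else env
    # torchrun sets WORLD_SIZE for every worker; fall back to 1 so a
    # plain `python train.py` run behaves like a single-process job.
    world_size = int(env.get("WORLD_SIZE", "1"))
    if total_steps % world_size != 0:
        raise ValueError(
            f"gradient_accumulation_steps={total_steps} is not divisible "
            f"by world size {world_size}"
        )
    return total_steps // world_size


if __name__ == "__main__":
    # Assumed example: 16 accumulation steps shared by 4 GPUs.
    print(per_rank_accumulation_steps(16, {"WORLD_SIZE": "4"}))
```

Under that assumption, a value of 16 divides evenly for 1, 2, 4, 8, or 16 processes, which would make it a safe choice for common GPU counts.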