ai-forever / ru-gpts

Russian GPT3 models.
Apache License 2.0

How to use data parallelism #61

Closed drunkinlove closed 3 years ago

drunkinlove commented 3 years ago

Is it possible to train with data parallelism (DP)?

MSDProj commented 3 years ago

Let's go

king-menin commented 3 years ago

The model already trains with data parallelism. In the sh file you can set the NUM_GPUS_PER_WORKER parameter (for example, NUM_GPUS_PER_WORKER=1). To use 16 GPUs, just set it to 16.
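As a rough illustration of the setting described above, here is a minimal sketch of how such a variable might feed a launcher. Only `NUM_GPUS_PER_WORKER` comes from this thread; the `torchrun` launcher and the `train.py` script name are assumptions for illustration, and the repository's actual sh files may launch training differently.

```shell
#!/bin/bash
# Hypothetical excerpt of a launch script.
# Only NUM_GPUS_PER_WORKER is taken from the thread; everything else is assumed.
NUM_GPUS_PER_WORKER=16   # one training process per GPU on this machine

# Assumed launcher and placeholder script name, shown for illustration only.
LAUNCH_CMD="torchrun --nproc_per_node=${NUM_GPUS_PER_WORKER} train.py"

# Print the command instead of running it, so the sketch is safe to execute as-is.
echo "${LAUNCH_CMD}"
```

With data parallelism, each of the launched processes holds a full copy of the model and works on a different shard of each batch, so raising the value generally increases the number of replicas rather than splitting the model.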