NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0

How to do multi-node training with multi-GPU? #306

Closed: GabrielLin closed this issue 5 years ago

GabrielLin commented 5 years ago

Hi, could you please tell me how to do multi-node training with multi-GPU? Thanks.

borisgin commented 5 years ago

Take a look: https://nvidia.github.io/OpenSeq2Seq/html/installation.html More details: https://nvidia.github.io/OpenSeq2Seq/html/distr-training.htm

GabrielLin commented 5 years ago

@borisgin thanks for your answer. But I meant multi-machine training; those two links only cover multi-GPU training on a single machine.

borisgin commented 5 years ago

https://github.com/uber/horovod/blob/master/docs/running.md

GabrielLin commented 5 years ago

I think there must be some additional configuration for the project, and it cannot be an easy task. Could you share your configuration experience?

borisgin commented 5 years ago

Multi-node training is supported only with Horovod; there is no need to change the config file.
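
For readers who want something concrete, here is a minimal sketch of how a multi-node launch command could be assembled, following the `horovodrun -np <total> -H <host>:<slots>,...` form described in the Horovod running guide linked above. The host names (`node1`, `node2`), the GPU counts, the config path, and the `run.py` arguments are all assumptions for illustration and are not taken from this thread; check them against the OpenSeq2Seq documentation for your version. The helper only builds and prints the command; it does not start training.

```python
import shlex


def build_horovodrun_command(hosts, script_args):
    """Build a horovodrun command line for a multi-node, multi-GPU job.

    hosts: dict mapping hostname -> number of GPUs (slots) on that host.
    script_args: list of arguments passed to `python` on every worker.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    # Horovod starts one process per GPU, so -np is the sum of all slots.
    total_processes = sum(hosts.values())
    # -H expects a comma-separated list of host:slots pairs.
    host_spec = ",".join(f"{name}:{slots}" for name, slots in hosts.items())
    return ["horovodrun", "-np", str(total_processes), "-H", host_spec,
            "python", *script_args]


# Hypothetical cluster: two machines with four GPUs each.
hosts = {"node1": 4, "node2": 4}

# Assumed OpenSeq2Seq invocation; the config path is a placeholder.
script_args = [
    "run.py",
    "--config_file=path/to/config.py",
    "--mode=train",
    "--use_horovod=True",
]

cmd = build_horovodrun_command(hosts, script_args)
print(shlex.join(cmd))
```

The printed command would be run from one machine that can reach the others over passwordless SSH, with the same code and data paths available on every node. This matches the reply above in that the model config file itself stays unchanged; only the launcher changes.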