Closed by xiangyue9607 4 years ago
I tested torch DataParallel; however, training was slower on two GPUs than on a single GPU.
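For reference, this is a minimal sketch of what the `torch.nn.DataParallel` attempt typically looks like in plain PyTorch. It is illustrative only: the model here is a toy stand-in, and `SentenceTransformer.fit()` runs its own training loop, so it does not accept a `DataParallel`-wrapped model out of the box.

```python
# Minimal DataParallel sketch with a toy model (stand-in for the real network).
import torch
import torch.nn as nn

model = nn.Linear(128, 1)
if torch.cuda.device_count() > 1:
    # Replicates the model on every visible GPU and splits each batch across
    # them; outputs and gradients are gathered back on the default device at
    # every step, which is why small models can end up slower than on one GPU.
    model = nn.DataParallel(model)
model = model.cuda()

x = torch.randn(64, 128).cuda()  # this batch gets scattered across the GPUs
out = model(x)                   # outputs are gathered back on cuda:0
print(out.shape)
```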
I think the best way would be to use distributed training; however, it is not yet supported by the training script.
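In case it helps, here is a minimal sketch of the distributed approach with `torch.nn.parallel.DistributedDataParallel`. This assumes a custom training loop (again, `model.fit()` would not do this for you), and the model and dataset are toy stand-ins; you would replace them with a `SentenceTransformer` and your STS data.

```python
# Minimal DistributedDataParallel (DDP) sketch; launch with e.g.:
#   torchrun --nproc_per_node=2 train_ddp.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def main():
    # torchrun sets the env vars (RANK, WORLD_SIZE, LOCAL_RANK) used here.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model/data; swap in a SentenceTransformer and a real dataset.
    model = torch.nn.Linear(128, 1).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    data = TensorDataset(torch.randn(1024, 128), torch.randn(1024, 1))
    sampler = DistributedSampler(data)  # shards the data across processes
    loader = DataLoader(data, batch_size=32, sampler=sampler)

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()
    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle the shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()  # DDP all-reduces the gradients
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Unlike `DataParallel`, each process keeps its own model replica and only gradients are synchronized, which usually scales much better across GPUs.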
Thank you!
Hey,
I'm trying the sts training example script (https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/sts/training_stsbenchmark.py).
Any ideas on how to run the training in parallel across multiple GPUs?
Thank you!