rossumai / keras-multi-gpu

Multi-GPU data-parallel training in Keras

a thought #7

Open Duncanswilson opened 6 years ago

Duncanswilson commented 6 years ago

hey guys,

first I wanna say that it's so nice to see people sharing their thoughts and work like this.

I just wanted to ask, with regard to the Keras distributed tests: are you scaling the batch size with the number of GPUs? Keras just splits the given batch size across the cards, so for a batch size of 256 on 4 cards the real batch size is 64 per card. (I honestly think this should be changed, but c'est la vie.)
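For what it's worth, here's a minimal sketch of what I mean, assuming Keras 2.x's `keras.utils.multi_gpu_model`; `build_model`, `x_train`, and `y_train` are placeholders for your own model and data:

```python
# Minimal sketch (assumes Keras 2.x with keras.utils.multi_gpu_model).
# build_model, x_train, y_train are placeholders for your own model and data.
from keras.utils import multi_gpu_model

n_gpus = 4
per_gpu_batch_size = 64

model = build_model()  # ordinary single-GPU model definition
parallel_model = multi_gpu_model(model, gpus=n_gpus)
parallel_model.compile(optimizer='sgd', loss='categorical_crossentropy')

# Keras splits the given batch across the GPUs, so pass the *global*
# batch size (per-GPU batch size times number of GPUs) to keep each
# card working on the batch size you actually intended.
parallel_model.fit(x_train, y_train,
                   batch_size=per_gpu_batch_size * n_gpus,
                   epochs=10)
```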

That shrinking per-card batch may be why you see lower efficiency on the cards.

Here's a plot from my tests that shows quasi-linear speedups on EC2 instances.

[Image: speedup plot, pasted 2017-11-13]

hope this helps!!