entity-neural-network / incubator

Collection of in-progress libraries for entity neural networks.
Apache License 2.0

Implement data parallelism #218

Closed: cswinter closed this 2 years ago

vwxyzjn commented 2 years ago

Does this use multiple GPUs during training? How much of a performance benefit does it offer?

cswinter commented 2 years ago

It splits everything across multiple processes/GPUs. In principle, performance can scale close to linearly with the number of GPUs, though you'll need sufficiently many parallel environments and a large enough batch size. I haven't seen it scale very well in practice yet; possibly there are still some other perf issues preventing that: https://wandb.ai/entity-neural-network/enn-ppo/reports/Data-parallel-perf-test--VmlldzoxODE4NjAx The ML results match, though: https://wandb.ai/entity-neural-network/enn-ppo/reports/Data-parallelism-test--VmlldzoxODE4MzMx
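
For readers unfamiliar with the idea, here is a minimal sketch of why data parallelism can leave the ML results unchanged. It is not the code from this PR: the workers are simulated sequentially in a single process, the model is a toy 1-D linear regression, and every name (`grad_mse`, `shard`, `all_reduce_mean`, `data_parallel_step`) is hypothetical. Assuming equally sized shards, averaging the per-worker gradients gives the same update as one process using the full batch.

```python
from typing import List, Tuple

Batch = List[Tuple[float, float]]  # (x, y) pairs


def grad_mse(w: float, batch: Batch) -> float:
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def shard(batch: Batch, world_size: int) -> List[Batch]:
    """Split a batch into equally sized contiguous shards, one per worker."""
    assert len(batch) % world_size == 0, "batch must divide evenly across workers"
    size = len(batch) // world_size
    return [batch[i * size:(i + 1) * size] for i in range(world_size)]


def all_reduce_mean(values: List[float]) -> float:
    """Stand-in for a cross-process all-reduce that averages one value per worker."""
    return sum(values) / len(values)


def data_parallel_step(w: float, batch: Batch, world_size: int, lr: float) -> float:
    """One SGD step where each (simulated) worker only sees its own shard."""
    local_grads = [grad_mse(w, s) for s in shard(batch, world_size)]
    return w - lr * all_reduce_mean(local_grads)


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_single = data_parallel_step(0.0, batch, world_size=1, lr=0.01)
w_multi = data_parallel_step(0.0, batch, world_size=2, lr=0.01)
print(w_single, w_multi)
```

In a real multi-GPU setup the shards would be processed concurrently and the averaging would be a collective communication step, which is where the potential speedup and the communication overhead both come from.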