DifferentiableUniverseInitiative / horovod

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
https://eng.uber.com/horovod/
Other

Adds grouping to collective ops #1

Closed EiffL closed 3 years ago

EiffL commented 3 years ago

This WIP PR adds the modifications proposed in https://github.com/horovod/horovod/pull/1130 in order to test the grouping API. The goal is to perform the various collective ops on 2D meshes, as required by Mesh TensorFlow.
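To illustrate what grouped collectives on a 2D mesh mean, here is a minimal pure-Python sketch. It does not use the Horovod API or the API from the linked PR. The helper names `mesh_groups` and `grouped_allreduce_sum` are hypothetical, and the row-major assignment of ranks to mesh coordinates is an assumption made for the example. It simulates a sum-allreduce restricted to each row or each column of the mesh.

```python
from typing import Dict, List, Tuple


def mesh_groups(n_rows: int, n_cols: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Split ranks 0..n_rows*n_cols-1 into row groups and column groups.

    Assumes ranks are laid out on the mesh in row-major order (illustrative choice).
    """
    row_groups = [[r * n_cols + c for c in range(n_cols)] for r in range(n_rows)]
    col_groups = [[r * n_cols + c for r in range(n_rows)] for c in range(n_cols)]
    return row_groups, col_groups


def grouped_allreduce_sum(
    values: Dict[int, float], groups: List[List[int]]
) -> Dict[int, float]:
    """Simulate a sum-allreduce where each rank only reduces with its own group."""
    result: Dict[int, float] = {}
    for group in groups:
        total = sum(values[rank] for rank in group)
        for rank in group:
            # Every member of the group receives the group's total.
            result[rank] = total
    return result


# A 2 x 3 mesh: 6 ranks, each holding its own rank id as a value.
row_groups, col_groups = mesh_groups(2, 3)
values = {rank: float(rank) for rank in range(6)}

row_sums = grouped_allreduce_sum(values, row_groups)  # reduce along each mesh row
col_sums = grouped_allreduce_sum(values, col_groups)  # reduce along each mesh column
```

In a real distributed run each rank would hold only its own value and the reduction would happen over the network; the dictionary here just stands in for the per-rank state.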