pytorch / hydra-torch

Configuration classes enabling type-safe PyTorch configuration for Hydra apps
MIT License

Single-node distributed processing with Hydra #42

Open · briankosw opened this issue 3 years ago

briankosw commented 3 years ago

Distributed processing with Hydra in a single-node, multi-GPU setting, as mentioned here.

This will serve as an introductory example for #38.
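For a sense of scale, here is a minimal sketch of the kind of example under discussion (not the actual PR code): a `@hydra.main` entry point that spawns one process per device and initializes a process group. The `worker` function, the hard-coded `world_size`, the address/port values, and the `gloo` backend are all illustrative assumptions.

```python
import os

import hydra
import torch.distributed as dist
import torch.multiprocessing as mp
from omegaconf import DictConfig


def worker(rank: int, world_size: int) -> None:
    # Rendezvous details for the default env:// init method; these
    # values are placeholders for a single-node run.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    # gloo lets the sketch run without GPUs; nccl would be the usual
    # choice for an actual multi-GPU job.
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    print(f"rank {dist.get_rank()} of {dist.get_world_size()} initialized")
    dist.destroy_process_group()


@hydra.main(config_path=None)
def main(cfg: DictConfig) -> None:
    # One process per device on a single node; in a fuller example,
    # world_size would come from the config instead of a constant.
    world_size = 2
    mp.spawn(worker, args=(world_size,), nprocs=world_size)


if __name__ == "__main__":
    main()
```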

briankosw commented 3 years ago

@romesco would love your feedback on this!

romesco commented 3 years ago

Sounds great! What do you think about using the MNIST example as a base? Or did you have something even simpler in mind?

I want to make sure we don't overcomplicate things on this one. For example, I would say we can start without using the configs directly, since they're somewhat orthogonal to demonstrating how Hydra and DDP interact. If you make a draft PR, I'll run everything and provide feedback, of course =].

omry commented 3 years ago

I think the idea here is to not actually train but just demonstrate basic primitives.
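As a rough illustration of what "basic primitives" could mean inside each worker, here is a hedged sketch of a single collective op. The `demo_all_reduce` helper is hypothetical and assumes the process group from the sketch above has already been initialized.

```python
import torch
import torch.distributed as dist


def demo_all_reduce(rank: int, world_size: int) -> None:
    # Assumes init_process_group() has already run, e.g. inside the
    # worker() sketch above. Each rank contributes its own rank value.
    tensor = torch.tensor([float(rank)])
    dist.all_reduce(tensor, op=dist.ReduceOp.SUM)
    # After the collective, every rank holds 0 + 1 + ... + (world_size - 1).
    print(f"rank {rank}: all_reduce sum = {tensor.item()}")
```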

briankosw commented 3 years ago

> Sounds great! What do you think about using the MNIST example as a base? Or did you have something even simpler in mind?

If you check out this PR, you'll see a basic distributed processing setup using Hydra, along with distributed communication primitives between multiple processes. This is about as simple as it gets, and much simpler than MNIST.

> I want to make sure we don't overcomplicate things on this one. For example, I would say we can start without using the configs directly, since they're somewhat orthogonal to demonstrating how Hydra and DDP interact. If you make a draft PR, I'll run everything and provide feedback, of course =].

So this PR/example will be about how Hydra helps set up distributed processes without using configs? Should the configs aspect be implemented in the other PR?

> I think the idea here is to not actually train but just demonstrate basic primitives.

In that case, I will only demonstrate how Hydra can be used to set up distributed processing.