The multiprocessing route is more difficult than expected. See the `mp_sampling` branch.
There is a native distributed module in PyTorch: https://pytorch.org/tutorials/intermediate/dist_tuto.html
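For reference, the basic setup in that tutorial looks roughly like the sketch below (gloo backend over localhost, one spawned process per rank); `run` is only a placeholder for the actual per-rank work:

```python
import os
import torch.distributed as dist
import torch.multiprocessing as mp

def run(rank, size):
    # Placeholder for the per-rank work (sampling, training, ...).
    print(f"hello from rank {rank} of {size}")

def init_process(rank, size, fn, backend="gloo"):
    # Every rank joins the same process group before doing any work.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group(backend, rank=rank, world_size=size)
    fn(rank, size)

if __name__ == "__main__":
    size = 2
    mp.set_start_method("spawn")
    processes = []
    for rank in range(size):
        p = mp.Process(target=init_process, args=(rank, size, run))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
```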
Instead of parallelizing the sampler, it might be best to use DistributedDataParallel to directly parallelize the training (https://pytorch.org/docs/stable/nn.html#torch.nn.parallel.DistributedDataParallel). A minimal sketch is given below.
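A minimal sketch of that idea, assuming `model`, `sampler` and `optimizer` stand in for the repo's actual classes (the loss below is only a placeholder for the local-energy loss):

```python
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def train_ddp(model, sampler, optimizer, nepoch=10):
    # The launcher (torchrun / torch.distributed.launch) is expected to set
    # RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT; gloo works on CPU-only nodes.
    dist.init_process_group(backend="gloo")
    ddp_model = DDP(model)

    for _ in range(nepoch):
        pos = sampler()                  # each rank propagates its own walkers
        loss = ddp_model(pos).mean()     # placeholder loss
        optimizer.zero_grad()
        loss.backward()                  # DDP averages the gradients across ranks here
        optimizer.step()

    dist.destroy_process_group()
```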
We could also split the number of walkers in the training function and average the gradients, as done in https://pytorch.org/tutorials/intermediate/dist_tuto.html (see the sketch below).
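The gradient-averaging helper from that tutorial is essentially this; it would be called after `loss.backward()` and before `optimizer.step()`, with each rank having worked on its own subset of the walkers:

```python
import torch.distributed as dist

def average_gradients(model):
    """All-reduce and average the gradients of `model` across all ranks."""
    size = float(dist.get_world_size())
    for param in model.parameters():
        if param.grad is not None:
            dist.all_reduce(param.grad.data, op=dist.ReduceOp.SUM)
            param.grad.data /= size
```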
The average-gradients method seems to be the best fit for us. However, an MPI backend would simplify things: https://medium.com/intel-student-ambassadors/distributed-training-of-deep-learning-models-with-pytorch-1123fa538848
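With an MPI backend the rank/world-size bookkeeping is handled by the MPI launcher instead of environment variables. A rough sketch (this requires a PyTorch build compiled with MPI support and would be launched with something like `mpirun -np 4 python train.py`):

```python
import torch
import torch.distributed as dist

dist.init_process_group(backend="mpi")
rank, size = dist.get_rank(), dist.get_world_size()

# Quick sanity check: sum a tensor across all ranks.
t = torch.ones(1)
dist.all_reduce(t, op=dist.ReduceOp.SUM)
print(f"rank {rank}/{size}: {t.item()}")
```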
The DistributedDataParallel module of PyTorch is too hard to use. I'll switch to Horovod instead.
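The Horovod pattern would look roughly like this, again with `model` and `sampler` as stand-ins for the repo's classes and a placeholder loss; it would be launched with `horovodrun -np 4 python train.py`:

```python
import torch
import horovod.torch as hvd

def train_horovod(model, sampler, nepoch=10, lr=1e-3):
    hvd.init()
    torch.manual_seed(hvd.rank())    # different walkers on each rank

    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    # Horovod wraps the optimizer so that step() all-reduces the gradients.
    optimizer = hvd.DistributedOptimizer(
        optimizer, named_parameters=model.named_parameters())

    # Every rank starts from the same parameters.
    hvd.broadcast_parameters(model.state_dict(), root_rank=0)

    for _ in range(nepoch):
        pos = sampler()              # each rank propagates its own walkers
        loss = model(pos).mean()     # placeholder loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()             # gradients averaged across ranks here
```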
The MC sampling is now done on a single process only. This is trivial to parallelize, but we could use different options: mpi4py, multiprocess, ...
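For the mpi4py option, a minimal sketch could look like this; `sample_walkers(n, ndim)` is a hypothetical stand-in for the sampler's actual stepping routine, not something that exists in the repo:

```python
import numpy as np
from mpi4py import MPI

def parallel_sampling(sample_walkers, nwalkers=1000, ndim=6):
    """Each rank samples its share of the walkers; positions are gathered on rank 0."""
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # Even split of the walkers, with the remainder handled by the last rank.
    nlocal = nwalkers // size + (nwalkers % size if rank == size - 1 else 0)
    local_pos = sample_walkers(nlocal, ndim)      # shape (nlocal, ndim)

    # Collect all walker positions on rank 0.
    gathered = comm.gather(local_pos, root=0)
    return np.concatenate(gathered) if rank == 0 else None
```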