TianhongDai / hindsight-experience-replay

This is the PyTorch implementation of Hindsight Experience Replay (HER), with experiments on all Fetch robotic environments.
MIT License

Why MPI.sum in sync_grad (utils.py) #4

sritee closed this issue 4 years ago

sritee commented 5 years ago

Why do you sum rather than average the gradients in sync_grads? Won't this result in different effective learning rates when you run a different number of processes?

TianhongDai commented 5 years ago

@sritee Yes, the only effect is a different effective learning rate. I tried both sum and average and found that sum achieves better results. In my own opinion (which may not be correct), when we sum the gradients from all MPI workers we get a "strong" update direction (you can also think of it as a denoising process). In that case, we can use a "large" learning rate to accelerate training.
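
To make the sum-versus-average point concrete, here is a minimal sketch that simulates the workers with NumPy instead of MPI. It assumes a plain SGD update; the names `sgd_step`, `num_workers`, and `worker_grads` are hypothetical and are not taken from the repository's `utils.py`. Under that assumption, summing the gradients from N workers gives the same step as averaging them with a learning rate N times larger.

```python
import numpy as np

def sgd_step(params, grad, lr):
    """Plain SGD update (an assumption; not the repository's optimizer)."""
    return params - lr * grad

rng = np.random.default_rng(0)
num_workers = 4                      # hypothetical number of MPI processes
params = np.zeros(3)
# One simulated gradient per worker, standing in for what each rank computes.
worker_grads = rng.normal(size=(num_workers, 3))

summed = worker_grads.sum(axis=0)    # what a sum-reduce would produce
averaged = worker_grads.mean(axis=0) # what an average would produce

lr = 0.001
after_sum = sgd_step(params, summed, lr)
after_avg = sgd_step(params, averaged, lr)
after_avg_scaled = sgd_step(params, averaged, lr * num_workers)

# Summing is equivalent to averaging with the learning rate scaled by num_workers.
sum_matches_scaled_avg = np.allclose(after_sum, after_avg_scaled)
print(sum_matches_scaled_avg)
```

The equivalence shown here holds for plain SGD only. Optimizers that rescale gradients adaptively would not follow it exactly, so the effect of summing on such optimizers should be checked empirically.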