Closed yangysc closed 5 years ago
Hi, the distributed synced mode implements Hogwild (https://people.eecs.berkeley.edu/~brecht/papers/hogwildTR.pdf), which runs only on CPU. PyTorch does not support on GPU the lock-free shared-memory mechanism that Hogwild needs, see issues:
However, the shared mode, which works with GPU, is currently an experimental feature in SLM Lab. Please try the spec:
python run_lab.py slm_lab/spec/benchmark/a3c/a3c_nstep_pong.json gpu_a3c_nstep_pong train
Also, to answer your question about the performance difference: Hogwild lets the algorithm make use of more CPUs when GPUs are not available, and it also helps by diversifying the sample trajectories collected across workers. The expected improvements are faster training and a better policy due to more diverse training data, at least in theory.
Thanks for your reply, @lgraesser. Now I understand it.
Describe the bug
Thanks for your excellent library. I think it is the best one in PyTorch so far, and that PPO should be the default algorithm to try. So I'm wondering why DPPO is not GPU-supported. I thought the distributed version, if combined with GPU support, would be the best PPO implementation. Could you tell me the performance difference between DPPO and PPO (with GPU support)? I want to make sure which one I should use.
Thanks in advance!
To Reproduce
Run dppo_pong.json in GPU mode
Error logs