kengz / SLM-Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
https://slm-lab.gitbook.io/slm-lab/
MIT License
1.25k stars, 264 forks

Why does dppo not support GPU #382

Closed yangysc closed 5 years ago

yangysc commented 5 years ago

Describe the bug

Thanks for your excellent library. I think it is the best one in PyTorch so far. I think PPO should be the default algorithm to try, so I'm wondering why DPPO does not support GPU. I thought the distributed version, combined with GPU support, would be the best PPO implementation. Could you tell me the performance difference between DPPO and PPO (with GPU support)? I want to make sure I use the best one.

Thanks in advance!

To Reproduce

Run dppo_pong.json in GPU mode.

Error logs

[2019-07-13 15:07:44,444 PID:8795 INFO run_lab_script.py read_spec_and_run] Running lab spec_file:slm_lab/spec/benchmark/dppo/dppo_pong.json spec_name:dppo_pong in mode:train
Traceback (most recent call last):
  File "/home/noone/Documents/New_torch/SLM-Lab/run_lab_script.py", line 87, in <module>
    main()
  File "/home/noone/Documents/New_torch/SLM-Lab/run_lab_script.py", line 73, in main
    read_spec_and_run(*args)
  File "/home/noone/Documents/New_torch/SLM-Lab/run_lab_script.py", line 51, in read_spec_and_run
    spec = spec_util.get(spec_file, spec_name)
  File "/home/noone/Documents/New_torch/SLM-Lab/slm_lab/spec/spec_util.py", line 160, in get
    check(spec)
  File "/home/noone/Documents/New_torch/SLM-Lab/slm_lab/spec/spec_util.py", line 98, in check
    raise e
  File "/home/noone/Documents/New_torch/SLM-Lab/slm_lab/spec/spec_util.py", line 95, in check
    check_compatibility(spec)
  File "/home/noone/Documents/New_torch/SLM-Lab/slm_lab/spec/spec_util.py", line 80, in check_compatibility
    assert ps.get(spec, 'agent.0.net.gpu') == False, f'Distributed mode "synced" works with CPU only. Set gpu: false.'
AssertionError: Distributed mode "synced" works with CPU only. Set gpu: false.
[2019-07-13 15:07:44,450 PID:8795 ERROR spec_util.py check] spec dppo_pong fails spec check
Traceback (most recent call last):
  File "/home/noone/Documents/New_torch/SLM-Lab/slm_lab/spec/spec_util.py", line 95, in check
    check_compatibility(spec)
  File "/home/noone/Documents/New_torch/SLM-Lab/slm_lab/spec/spec_util.py", line 80, in check_compatibility
    assert ps.get(spec, 'agent.0.net.gpu') == False, f'Distributed mode "synced" works with CPU only. Set gpu: false.'
AssertionError: Distributed mode "synced" works with CPU only. Set gpu: false.

kengz commented 5 years ago

Hi, the distributed synced mode implements Hogwild (https://people.eecs.berkeley.edu/~brecht/papers/hogwildTR.pdf), which runs only on CPU. PyTorch on GPU does not support the lock-free mechanism needed for Hogwild; see issues:

However, shared mode, which works with GPU, is currently an experimental feature in SLM Lab. Please try this spec:

python run_lab.py slm_lab/spec/benchmark/a3c/a3c_nstep_pong.json gpu_a3c_nstep_pong train
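For reference, the constraint behind the AssertionError in the logs above can be sketched roughly as follows. This is not SLM Lab's actual code. The `agent.0.net.gpu` path comes from the traceback, while the `meta.distributed` key and the function name `check_distributed_gpu` are assumptions made for illustration.

```python
def check_distributed_gpu(spec):
    """Reject specs that combine 'synced' distributed mode with GPU.

    Assumed layout (hypothetical except for agent.0.net.gpu):
      spec["meta"]["distributed"] -> False, "synced", or "shared"
      spec["agent"][0]["net"]["gpu"] -> bool
    """
    distributed = spec.get("meta", {}).get("distributed", False)
    use_gpu = spec["agent"][0]["net"].get("gpu", False)
    # Hogwild-style "synced" mode is CPU only; "shared" mode may use GPU.
    if distributed == "synced" and use_gpu:
        raise AssertionError(
            'Distributed mode "synced" works with CPU only. Set gpu: false.'
        )
    return True


if __name__ == "__main__":
    # A minimal spec fragment that passes the check once gpu is false.
    spec = {"meta": {"distributed": "synced"}, "agent": [{"net": {"gpu": False}}]}
    print(check_distributed_gpu(spec))
```

So for a synced spec such as dppo_pong, setting gpu to false in the agent's net section should clear the error, as the message suggests.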

kengz commented 5 years ago

Also, to answer your question about the performance difference: Hogwild lets the algorithm use more CPUs when GPUs are not available, and it also diversifies the sample trajectories collected across workers. The expected improvements are faster training and a better policy due to more diverse training data, at least in theory.
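As a rough illustration of the lock-free update pattern that Hogwild relies on, here is a minimal standard-library sketch. It is a toy, not SLM Lab's implementation: Python threads stand in for CPU worker processes, the objective is a simple quadratic, and the name `hogwild_sgd` and all parameters are hypothetical.

```python
import random
import threading


def hogwild_sgd(target, num_workers=4, steps_per_worker=5000, lr=0.1):
    """Minimize 0.5 * sum((w_i - target_i)^2) with lock-free parallel SGD."""
    # Shared parameter vector. No lock guards it, which is the Hogwild idea.
    weights = [0.0] * len(target)

    def worker(seed):
        # Each worker has its own RNG, so workers sample different coordinates.
        rng = random.Random(seed)
        for _ in range(steps_per_worker):
            i = rng.randrange(len(target))  # sparse update: one coordinate
            grad = weights[i] - target[i]   # may be based on a stale read
            weights[i] -= lr * grad         # unsynchronized write

    threads = [
        threading.Thread(target=worker, args=(seed,))
        for seed in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return weights


if __name__ == "__main__":
    print(hogwild_sgd([1.0, -2.0, 3.0, 0.5]))
```

Workers occasionally overwrite each other's progress, but because each update touches only one coordinate, the shared weights still settle near the target. This lock-free sharing of parameters in CPU memory is the part that, per the comment above, is not available on GPU.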

yangysc commented 5 years ago

Thanks for your reply, @lgraesser. Now I understand.