openai / requests-for-research

A living collection of deep learning problems
https://openai.com/requests-for-research

Our solution to DQN+RAM #10

Closed sygi closed 8 years ago

sygi commented 8 years ago

A few words, including a link to our work, which describes Q-learning on Atari RAM.

ilyasu123 commented 8 years ago

Did you need to do anything special in order to get Q-learning working on the Atari RAM? Also, any chance you could link to your implementation?

sygi commented 8 years ago

(rewriting the statements from the paper here) In general, Q-learning worked reasonably well (though with high variance) without any changes. In the case of Seaquest (and Bowling, I think), increasing the frameskip (the number of frames for which an action is repeated; sketched below) caused the score to flatten out at our best results. There's a link to the implementation in the paper, but I can put it on the page itself. It is based on https://github.com/spragunr/deep_q_rl, with small changes.

Let me know which of these you'd like to include in the solution for the page.
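
For readers unfamiliar with the term, here is a minimal sketch of the action-repeat ("frameskip") idea mentioned above, written against the 2016 Gym API; the class and variable names are hypothetical and this is not the PR's actual code.

import gym

# Minimal sketch (not the PR's code) of action repetition: the same action is
# applied for `skip` consecutive environment steps and the rewards are summed.
class ActionRepeat(object):
    def __init__(self, env, skip=10):
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward, done, info = 0.0, False, {}
        obs = None
        for _ in range(self.skip):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done, info

# Hypothetical usage with a RAM-observation Atari environment:
env = ActionRepeat(gym.make('Seaquest-ram-v0'), skip=10)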

sygi commented 8 years ago

OK, that turned out not to be the case. We got the best results for Seaquest when we applied dropout of 0.5; the above is just an interesting observation.
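
For context, deep_q_rl is built on Theano/Lasagne, where dropout is just another layer. The snippet below is only an illustrative sketch with a made-up layer size and helper name, not the architecture from the paper.

import lasagne

# Illustrative only: a hidden layer followed by dropout with p=0.5, the
# setting reported above as working best for Seaquest. The layer size is a
# made-up example, not the paper's architecture.
def hidden_with_dropout(incoming, num_units=512):
    dense = lasagne.layers.DenseLayer(
        incoming, num_units=num_units,
        nonlinearity=lasagne.nonlinearities.rectify)
    return lasagne.layers.DropoutLayer(dense, p=0.5)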

ilyasu123 commented 8 years ago

And did you try DQN on only three Atari games, or on all of them?

sygi commented 8 years ago

We only ran on the 3 described games (Seaquest, Breakout, Bowling) due to the limited computational power we had. I have now run the evaluation on the next 4 (SpaceInvaders, Q*Bert, Enduro, Beam Rider) -- do you have suggestions for more?

ilyasu123 commented 8 years ago

You've run on enough games to make it convincing. Are you able to create a version of your code that also runs on Gym with relatively little effort? This way, other people will be able to run your code on games that you haven't run on yet.

sygi commented 8 years ago

I think so, but I am not sure -- I will try to do it tomorrow (European time).

sygi commented 8 years ago

edit: Nvm, I have found the bug in my code.

I have adapted the code to use Gym, but I am getting:

{"timestamps": [], "initial_reset_timestamp": null, "episode_lengths": [], "episode_rewards": []}

in the openaigym.episode_batch.0.13849.stats.json file. The code is here, commit 082e579 (it's a bit messy right now). I'm running it as:

python run_gym.py --env-name Seaquest-ram-v0 --network-type big_ram -e 10 --max-history 100000 --frame-skip 10

All calls to the monitor are in the deep_q_rl/launcher.py file (in the launch function). The inner evaluation seems to indicate that the code runs just fine. Do you have a clue what the problem could be here? Could it be that I am saving it to the user's home directory instead of /tmp?
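
For reference, the Gym monitor of that era was used roughly as below (the output directory is a placeholder). An empty stats file usually means the monitored environment's reset/step calls were bypassed or monitor.close() was never reached; this is a hedged sketch, not the repository's code.

import gym

# Sketch of 2016-era Gym monitor usage; '/tmp/seaquest-eval' is a placeholder
# output directory.
env = gym.make('Seaquest-ram-v0')
env.monitor.start('/tmp/seaquest-eval', force=True)
for _ in range(10):
    obs, done = env.reset(), False
    while not done:
        obs, reward, done, info = env.step(env.action_space.sample())
env.monitor.close()  # stats stay empty if this is never reached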

sygi commented 8 years ago

I have changed my code to use Gym and submitted a few (4 or 5) evaluations. The gist is here: https://gist.github.com/sygi/8c2e26d692e5dc03ee2f0d68f4395a5c#file-dqn-ram-v0-md
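
For anyone reproducing this: at the time, a monitored run was submitted to the Gym leaderboard with gym.upload. The directory and API key below are placeholders, and the call assumes the run was recorded with the monitor as in the earlier sketch.

import gym

# Placeholder directory and API key; assumes a monitor-recorded run.
gym.upload('/tmp/seaquest-eval', api_key='YOUR_OPENAI_GYM_API_KEY')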

ilyasu123 commented 8 years ago

Are you able to remove ALE as a dependency from the branch? It's a bit hard to install on a Mac, and it's also not necessary for the Gym solution. Otherwise, LGTM!

sygi commented 8 years ago

Done. It was not used.

ilyasu123 commented 8 years ago

Tried running the code, got "OSError: [Errno 13] Permission denied: '/icm'" (on Mac OS)

ilyasu123 commented 8 years ago

To be precise:

(tensorflow) Ilyas-MBP:deep_q_rl Sutskever$ python run_gym.py --env-name Breakout-ram-v0 --network-type big_ram -e 100
[2016-06-13 12:26:29,907] Making new env: Breakout-ram-v0
saving evaluation to: /icm/home/sygnowsk/sygi-q-rl/results/Breakout-ram-v0-64
[2016-06-13 12:26:29,932] Creating monitor directory /icm/home/sygnowsk/sygi-q-rl/results/Breakout-ram-v0-64
Traceback (most recent call last):
  File "run_gym.py", line 63, in <module>
    launcher.launch(sys.argv[1:], Defaults, __doc__)
  File "/Users/Sutskever/research/sfwtare/deeq_q_rl_only_ram/deep_q_rl/launcher.py", line 189, in launch
    gym_env.monitor.start(log_path, lambda v: False, force=True)
  File "/Users/Sutskever/research/gym/gym/monitoring/monitor.py", line 101, in start
    os.makedirs(directory)
  File "/Users/Sutskever/tensorflow/bin/../lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/Users/Sutskever/tensorflow/bin/../lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/Users/Sutskever/tensorflow/bin/../lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/Users/Sutskever/tensorflow/bin/../lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/Users/Sutskever/tensorflow/bin/../lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/Users/Sutskever/tensorflow/bin/../lib/python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/icm'

sygi commented 8 years ago

Ah, right -- corrected. I had the directory for saving the output hardcoded to something other than /tmp, sorry.
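
A minimal sketch of that kind of fix, assuming the goal is simply to avoid a hardcoded absolute path; the environment variable and helper name are hypothetical and not necessarily what the repository does.

import os
import tempfile

# Hypothetical helper: default to the system temp directory instead of a
# hardcoded cluster path such as /icm/home/..., overridable via an env var.
def default_results_dir(env_name):
    base = os.environ.get('DQN_RESULTS_DIR', tempfile.gettempdir())
    return os.path.join(base, 'dqn-ram-results', env_name)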

ilyasu123 commented 8 years ago

Great. I am successfully running the code now on Breakout. Any idea how long training should take on a good MacBook?

sygi commented 8 years ago

It probably depends on the GPU, but I'd guess about 24h; it takes around that long on a GTX 480. You can also try a lower number of epochs (e.g. -e 10) to see if it runs correctly.

sygi commented 8 years ago

@ilyasu123, did you manage to run it/review other evaluations? Thanks for your work!

gdb commented 8 years ago

(You're now live!)