muupan / async-rl

Replicating "Asynchronous Methods for Deep Reinforcement Learning" (http://arxiv.org/abs/1602.01783)
MIT License

[WIP] Fixups #14

Open BlGene opened 8 years ago

BlGene commented 8 years ago

This is a collection of my fixups.

BlGene commented 8 years ago

Hi,

Thanks for posting this!

I was wondering if you were going to push an updated version? I would be interested to know if running on GPU substantially increases performance for Atari. Also, is running 36 processes always better than running 16? I don't recall the paper addressing this.

(If you have any ideas for how to improve the code that you haven't had time to try yourself I would be interested in these.)

BR, Max

muupan commented 8 years ago

Thanks for the nice fixes! I'll merge it after checking.

> I was wondering if you were going to push an updated version?

Yes, I have done some refactoring and implemented training for gym environments and continuous tasks (not so successful so far), but I haven't had enough time to push them.

> I would be interested to know if running on GPU substantially increases performance for Atari.

I'm also interested in that, but using a GPU would be tricky with my multi-process implementation. I don't think there's a way to share GPU memory among different processes.

> Also, is running 36 processes always better than running 16? I don't recall the paper addressing this.

I didn't compare scores of 16 vs. 36 processes. My implementation is apparently slower than DeepMind's, so to complete the same number of training steps in one day I needed to use more processes.

BlGene commented 8 years ago

Hi @muupan,

I updated the PR to fix the issue that all processes were starting with the same random seed (and a bit more).

BR, Max