dgriff777 / rl_a3c_pytorch

A3C LSTM Atari with Pytorch plus A3G design
Apache License 2.0
562 stars 119 forks

Performance of Breakout #18

Closed yhcao6 closed 6 years ago

yhcao6 commented 6 years ago

Could I ask how long it takes to train Breakout from scratch to reach the desired score (859.57 for Breakout-v3)?

Have you tried BreakoutNoFrameskip? This is a version without repetition and randomness.

Thanks!

dgriff777 commented 6 years ago

Hi @yhcao6, that was a different version than v4, so it is not directly comparable. On Breakout-v4 it takes about 7 hours to average over 800 reward, but it often gets a score over 800 after just an hour of training. I'll be adding some training graphs shortly for training with the new A3G model.

yhcao6 commented 6 years ago

Thanks! When I specify gpu_ids, it reports an error like this:

[Screenshot: 2018-01-16 9 08 20]

So I ran in CPU mode, and the performance is still very surprising. Does A3G mean the A3C-GPU version? I also wonder whether you have trained the model on a more difficult game like Super Mario. Thanks!

dgriff777 commented 6 years ago

Looks like you are using Python 2. I have only run the GPU version on Python 3; sorry, I hadn't checked yet whether it runs OK with Python 2. Can you run on Python 3, or try just commenting out this line and see if it works:

#mp.set_start_method('spawn')
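
As an alternative to commenting the line out by hand, the call can be guarded so it is skipped when the interpreter has no start-method API. This is a minimal sketch, not the repo's actual code. It assumes the error came from `set_start_method` being unavailable on Python 2, and it uses the standard-library `multiprocessing` module as `mp` (in the repo, `mp` may refer to a different module). The helper name `configure_start_method` is hypothetical.

```python
import multiprocessing as mp


def configure_start_method(method="spawn"):
    """Try to select a multiprocessing start method.

    Returns True if `method` is the active start method afterwards,
    False otherwise (hypothetical helper, for illustration only).
    """
    setter = getattr(mp, "set_start_method", None)
    if setter is None:
        # Python 2 has no start-method API; the platform default is used.
        return False
    try:
        setter(method)
    except RuntimeError:
        # The start method was already fixed earlier in this process.
        pass
    return mp.get_start_method() == method


if __name__ == "__main__":
    print("spawn active:", configure_start_method("spawn"))
```

On Python 3 the helper normally sets and confirms the requested method. On Python 2 it returns False and leaves the default behaviour untouched, which matches the effect of commenting the line out.
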
dgriff777 commented 6 years ago

Yes, A3G means the A3C-GPU version. It is a new method I made for a GPU version of A3C that gives you the benefits of both GPU and CPU to speed up training. It is much faster for Atari: it trains games as fast as or faster than neuro-evolution on 700-1400 CPUs, and overall performance is far greater.

dgriff777 commented 6 years ago

No, I haven't tried Super Mario, but I have trained on some harder envs. Atari can be hard if you are going for real superhuman game scores, not just the mediocre scores seen on most benchmarks. I have also used it successfully on many real-world applications, and I did a continuous-action version to solve the BipedalWalkerHardcore env. See repo here: a3c_continuous

yhcao6 commented 6 years ago

Thanks! I will give it a try.