dgriff777 / rl_a3c_pytorch

A3C LSTM Atari with Pytorch plus A3G design
Apache License 2.0
563 stars 119 forks

Why would you end training episode if any lives are lost? #1

Closed ethancaballero closed 7 years ago

ethancaballero commented 7 years ago

Why is this in `train.py`?

            if args.count_lives:
                if lives > info['ale.lives']:
                    done = True
ppwwyyxx commented 7 years ago

https://github.com/muupan/async-rl/issues/9

dgriff777 commented 7 years ago

Just getting rid of extra data to speed things up, as nothing valuable is learned by the agent during the transition from life to life. It helps for a few games, mostly games where time is a factor.
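The quoted check from `train.py` can be sketched in a minimal rollout loop. This is illustrative only: the toy environment below and all variable names beyond the quoted snippet (`count_lives`, `lives`, `info['ale.lives']`) are assumptions standing in for the real Gym/ALE Atari environment, which reports remaining lives under the `'ale.lives'` info key.

```python
# Minimal sketch of life-loss episode termination (illustrative;
# the toy env is an assumption, not the repo's actual environment).

class FakeALEEnv:
    """Toy stand-in for a Gym Atari env that loses a life at step 3."""
    def __init__(self):
        self.t = 0
        self.lives = 3

    def step(self, action):
        self.t += 1
        if self.t == 3:
            self.lives -= 1  # simulate losing a life
        done = self.lives == 0  # env itself only ends at 0 lives
        return None, 0.0, done, {'ale.lives': self.lives}

env = FakeALEEnv()
count_lives = True  # corresponds to args.count_lives
lives = env.lives
steps = 0
done = False
while not done:
    _, reward, done, info = env.step(0)
    steps += 1
    if count_lives and lives > info['ale.lives']:
        # Treat any lost life as the end of the episode, discarding
        # the life-to-life transition frames nothing is learned from.
        done = True
    lives = info['ale.lives']
```

With `count_lives` on, the rollout ends at the first life loss instead of running until all lives are gone, which is the speedup described above.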

ethancaballero commented 7 years ago

Did it help for Breakout?

dgriff777 commented 7 years ago

No, it shouldn't, as the agent needs to learn to fire once a new life starts.

dgriff777 commented 7 years ago

For SpaceInvaders it helps a lot. For BeamRider I think it helped too. But for any game where the agent has to learn to start the game up again after the loss of a life, it's going to be detrimental.

dgriff777 commented 7 years ago

This is done in DeepMind's alewrap as well, as you can see in the code here: https://github.com/deepmind/alewrap/blob/master/alewrap/GameEnvironment.lua

nina124 commented 6 years ago

Does `args.count_lives` help Seaquest-v0? How do you get the pre-trained Seaquest-v0 model? Run `python main.py --env Seaquest-v0 --workers 32` for 3 days?