Grzego / async-rl

Variation of "Asynchronous Methods for Deep Reinforcement Learning" with multiple processes generating experience for the agent (Keras + Theano + OpenAI Gym). Implements 1-step Q-learning, n-step Q-learning, and A3C.
MIT License

Update train.py #2

Open pavitrakumar78 opened 7 years ago

pavitrakumar78 commented 7 years ago

Added Python 2.x compatibility. Added a workaround for Breakout and Space Invaders to correct the number of moves. Note: this workaround just tells the acting methods that the action space will be 4 instead of the default 6. So the network will be predicting among 0, 1, 2, 3, which correspond to ['NOOP', 'FIRE', 'RIGHT', 'LEFT'] - thus we ignore RIGHTFIRE and LEFTFIRE, which are not needed for Breakout.
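The idea described above can be sketched roughly as follows. This is not the actual PR code, just a minimal illustration (function names are hypothetical) of capping the action count so the agent only ever selects from the first four Atari actions:

```python
# Hypothetical sketch of the described workaround, not the PR's actual code:
# cap the action space at 4 so the agent only predicts among
# NOOP, FIRE, RIGHT, LEFT and never emits RIGHTFIRE / LEFTFIRE.

def restricted_num_actions(action_meanings, limit=4):
    """Number of actions the network should predict over."""
    return min(limit, len(action_meanings))

def choose_action(q_values, num_actions):
    """Greedy action taken only over the restricted prefix of Q-values."""
    return max(range(num_actions), key=lambda i: q_values[i])

# Breakout's full action space has 6 entries; we restrict it to the first 4.
meanings = ['NOOP', 'FIRE', 'RIGHT', 'LEFT', 'RIGHTFIRE', 'LEFTFIRE']
n = restricted_num_actions(meanings)                     # 4
a = choose_action([0.1, 0.5, 0.9, 0.2, 1.3, 0.0], n)    # 2 -> 'RIGHT'
print(n, meanings[a])
```

Even though index 4 (RIGHTFIRE) has the highest Q-value here, the restricted argmax never considers it.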

Grzego commented 7 years ago

I think it's useful to have Python 2 support, but the workarounds for action_space are not necessary. If you remove them, I will merge this PR. Thanks! :)

pavitrakumar78 commented 7 years ago

That is just one line at the top! :+1:
I removed the workarounds. You can test the code again if you want! :)