gsurma / atari

AI research environment for the Atari 2600 games 🤖.
https://gsurma.github.io
MIT License

No learning during training #2

Open oldcask opened 4 years ago

oldcask commented 4 years ago

Hi @gsurma,

Thanks for sharing the code, great work there.

I was trying to recreate the Breakout model by running the training step. However, even after 7000 runs there seems to be no learning happening: the average score stays constant between 1 and 1.5, even though the accuracy increases and the loss decreases.

The code I am using is an almost exact clone of the current repo. Could you please let me know whether you made any changes before running the training step for this game?

I tried the same for Pong and again saw no learning.

Happy to share more details; any help would be appreciated. Thank you!

gsurma commented 4 years ago

Hi,

Can you share your hyperparams and loss plots? It's hard to debug your issue without them.

oldcask commented 4 years ago

I am trying to reproduce results similar to what you have shared. Below are the hyperparameters and loss plots.

[Attached: hyperparameters screenshot and plots of loss, score, q, step, and accuracy]

Also, should ClippedRewardsWrapper(env) be used to clip the rewards? It seems to be commented out in your code.
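
For reference, this is the kind of reward clipping I mean: mapping every reward to its sign, as in the original DQN paper. A minimal sketch, assuming the standard gym.RewardWrapper interface and an installed Atari environment (the repo's actual ClippedRewardsWrapper may differ in detail):

```python
import gym
import numpy as np


class SignClippedRewards(gym.RewardWrapper):
    """Hypothetical stand-in for ClippedRewardsWrapper: clips rewards to {-1, 0, +1}."""

    def reward(self, reward):
        # np.sign maps positive rewards to +1, negative to -1, and zero to 0.
        return np.sign(reward)


# Usage sketch: wrap the environment before training (env id is just an example).
env = SignClippedRewards(gym.make("BreakoutDeterministic-v4"))
```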

Thanks.