ageron / handson-ml2

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
Apache License 2.0

[Chapter 18] breakout.gif only shows a loop of the agent sitting still as a life is lost #117

Closed · EliHarper closed this 3 years ago

EliHarper commented 4 years ago

When I save the animation of the trained agent playing Breakout, it only shows a loop of the agent sitting still while the ball passes it. The logging values output during training look comparable to the outputs provided in the solutions on this GitHub page.

The thumbnail shows two broken bricks, but when I play the GIF, it loops on the frame with the header "000 51".

I would be willing to bet this is due to some issues I've had with conflicting package versions. Has anybody else had this issue?
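For context, saving such an animation typically amounts to collecting rendered frames and writing them out as a looping GIF. A minimal sketch, assuming the frames were gathered as a list of RGB NumPy arrays (the function name, file name, and use of PIL are assumptions, not the notebook's exact code):

```python
# Hypothetical sketch: write a list of HxWx3 uint8 RGB frames to a looping
# GIF. The names here are illustrative, not the notebook's actual code.
import numpy as np
from PIL import Image

def save_frames_as_gif(frames, path="breakout.gif", fps=30):
    images = [Image.fromarray(np.asarray(frame)) for frame in frames]
    images[0].save(
        path,
        save_all=True,             # write every frame, not just the first
        append_images=images[1:],
        duration=int(1000 / fps),  # display time per frame, in ms
        loop=0,                    # 0 = loop forever
    )
```

If only one frame is effectively saved, or every frame is identical, a viewer shows exactly the symptom described above: a GIF that loops on a single frame.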

ageron commented 4 years ago

Hi @EliHarper ,

Thanks for your question, and sorry for the late response. Did you manage to find the cause of this issue? If not, could you please provide more details?

lebaste77 commented 3 years ago

I have a similar issue, but only on the first loop when the GIF is played with the Windows default viewer; after the first loop, the image shows no broken bricks (and the score/lives differ). Maybe it is only a caching problem? (Screenshots of the first loop and of the second and later loops were attached.)

However, I have a separate, broader issue (#397), so this may just be a coincidence.

lebaste77 commented 3 years ago

A video version of the GIF shows that the first frames created already correspond to the end of a game: https://user-images.githubusercontent.com/73541689/109433383-563f8180-7a10-11eb-9355-b7da285f2bad.mp4 (download via "save as")
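A quick way to verify this without converting to video is to step through the GIF's frames directly. A minimal sketch with PIL (the file name is an assumption):

```python
# Hypothetical sketch: dump each frame of the GIF to a PNG so the frames
# can be inspected individually. The file name is an assumption.
from PIL import Image, ImageSequence

with Image.open("breakout.gif") as gif:
    for i, frame in enumerate(ImageSequence.Iterator(gif)):
        frame.convert("RGB").save(f"frame_{i:03d}.png")
```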

ageron commented 3 years ago

I believe the issue was caused by the fact that the Breakout game requires the player to press the FIRE button (action 1) at the beginning of the game (and after each life lost), otherwise the ball never appears. The agent may actually take a very long time to learn this, because pressing FIRE initially causes the agent to lose faster than not pressing it at all.

I tweaked the code to ensure that the FIRE button is pressed automatically at the beginning of the game (and after each life lost). Now the agent learns much faster (it can catch the ball at least a few times after 50,000 training iterations), and the ball is almost always visible (except when the agent is just about to lose).

Could you please update your code to the latest version, make sure you have the latest libraries (TensorFlow 2.4 and TF-Agents 0.7, see environment.yml for details), and try running the code again? Alternatively, you can just run the Colab notebook. Feel free to reopen this issue if the problem persists.

Thanks again for your feedback, it really helped. 👍
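For readers hitting the same problem in their own setup, the standard workaround is an "auto-fire" wrapper that presses FIRE whenever the environment resets or a life is lost. Below is a minimal sketch using the Gym API of that era (4-tuple `step`); the wrapper name, the environment ID, and the reliance on `env.unwrapped.ale.lives()` are illustrative assumptions, and the notebook's actual fix (built on TF-Agents' Atari preprocessing) may differ in detail:

```python
# Hypothetical sketch: press FIRE (action 1) on every reset and after every
# lost life so the ball always appears. Names are illustrative, not the
# notebook's exact code. Uses the older Gym API (step returns a 4-tuple).
import gym

class AutoFireWrapper(gym.Wrapper):
    def reset(self, **kwargs):
        obs = self.env.reset(**kwargs)
        obs, _, _, _ = self.env.step(1)  # press FIRE to launch the ball
        self.lives = self.env.unwrapped.ale.lives()
        return obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        lives = self.env.unwrapped.ale.lives()
        if lives < self.lives and not done:
            obs, _, _, _ = self.env.step(1)  # relaunch after losing a life
        self.lives = lives
        return obs, reward, done, info

env = AutoFireWrapper(gym.make("BreakoutNoFrameskip-v4"))
```

Because the agent no longer has to discover FIRE on its own, reward signal appears from the very first episodes, which is what makes training converge so much faster.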