ThomasLecat closed this issue 7 years ago
I have just watched the video created by gym-eval.py: the agent kills all the enemies perfectly, but has never learned to surface! Here's a gif:
Any idea why it stays at the bottom like that?
Yup, it takes a long time for it to learn to surface correctly. If I remember right, it took nearly 24hrs before the score started to shoot up, and that game took nearly 3 days in total. Just let it keep training and it will sort itself out. As you can see it can already shoot well, so once the surfacing skill is learned the score should rise quickly after that.
Got it, thanks! Back to training then. Do you think increasing the coefficient in front of the entropy term in the loss (currently 0.01) would make the agent discover the surface faster by encouraging exploration?
I think it's good to leave the entropy term as is, since it's robust and effective across all games. Massive exploration is one of the most important advantages of AI over humans, so it should not be neglected.
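For reference, this is a minimal sketch of what the entropy bonus being discussed does (illustrative code, not the repo's actual implementation; the function and parameter names here are made up):

```python
import numpy as np

def entropy_bonus(logits, beta=0.01):
    """Entropy of a softmax policy, scaled by the entropy coefficient beta.

    In an actor-critic loss, beta * entropy is added to the objective
    (equivalently, subtracted from the loss), which pushes the policy
    toward more uniform action probabilities, i.e. more exploration.
    beta=0.01 matches the default coefficient mentioned above.
    """
    # Numerically stable softmax over the action logits.
    z = logits - logits.max()
    probs = np.exp(z) / np.exp(z).sum()
    entropy = -(probs * np.log(probs + 1e-8)).sum()
    return beta * entropy

# A near-uniform policy has high entropy (big bonus, keeps exploring);
# a peaked policy has low entropy (small bonus, commits to actions).
uniform_bonus = entropy_bonus(np.zeros(18))  # Seaquest has 18 actions
peaked_bonus = entropy_bonus(np.array([10.0] + [0.0] * 17))
```

Raising beta makes the "stay random" pressure stronger, which is the exploration effect asked about, at the cost of slower convergence to a decisive policy.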
Added trained seaquest-v0 model to trained_models folder.
OK for the entropy, and thanks for the trained model!
I have tried twice to train the agent on Seaquest-v0 with 32 workers on a server, but after 13 hours of training, the score seems to be stuck around 2700-2800 maximum.
Here's the log file : log.txt
I'm using gym 0.8.1 and atari-py 0.0.21, and left all the hyperparameters at their default values. Any idea why the score obtained is much lower than the one you obtained (>50000)? Would you have the trained model for Seaquest-v0? Thanks!