Closed kazuki-irie closed 3 years ago
@JesseFarebro
It would be really helpful if you could post the list of actions and the random seed that reproduce this behaviour. This is most likely an issue with score wrapping. If you can provide more info to debug this issue, feel free to post it in this thread: https://github.com/mgbellemare/Arcade-Learning-Environment/issues/262 or create a new issue at https://github.com/mgbellemare/Arcade-Learning-Environment
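A minimal sketch of how the seed and action list could be captured for a report like this. The names `record_episode` and `save_trace` are hypothetical, and the code assumes the classic Gym interface (`env.seed`, `env.reset` returning an observation, `env.step` returning a 4-tuple); newer Gym or Gymnasium versions use different signatures and would need adjusting.

```python
import json


def record_episode(env, policy, seed):
    """Run one episode and return the seed, actions, and per-step rewards.

    Assumption: classic Gym API, i.e. env.seed(seed), obs = env.reset(),
    and env.step(action) -> (obs, reward, done, info).
    `policy` is any callable that maps an observation to an integer action.
    """
    env.seed(seed)
    obs = env.reset()
    actions, rewards = [], []
    done = False
    while not done:
        action = int(policy(obs))
        obs, reward, done, _info = env.step(action)
        actions.append(action)
        rewards.append(float(reward))
    return {"seed": seed, "actions": actions, "rewards": rewards}


def save_trace(trace, path):
    """Write the recorded trace to a JSON file that can be attached to an issue."""
    with open(path, "w") as f:
        json.dump(trace, f)
```

Replaying the saved actions with the same seed should then reproduce the episode, provided the environment is deterministic under that seed.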
Hello,
In some Atari games, I observe that well-trained models sometimes achieve very large negative scores, such as -969000. So far, I've seen this issue for BattleZoneNoFrameskip-v4 and UpNDownNoFrameskip-v4.
For example, in UpNDownNoFrameskip-v4, if I evaluate my model on five test episodes, I get the following scores: 310760.0, 26270.0, -919890.0, 364960.0, 156270.0, where -919890.0 looks buggy to me. Is such a score possible at all? Or is this some bug?
Thank you.
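One way to narrow this down would be to log the per-step rewards of an episode and look for single steps with an implausibly large negative reward. The sketch below is only an illustration: `find_suspicious_steps` is a hypothetical helper, and the default threshold of -100000 is an arbitrary assumption, not a value taken from the emulator.

```python
def find_suspicious_steps(rewards, threshold=-100000.0):
    """Return (step_index, reward) pairs whose reward is at or below `threshold`.

    `rewards` is the list of per-step rewards from one episode.
    `threshold` is an arbitrary cutoff chosen for illustration.
    """
    return [(i, r) for i, r in enumerate(rewards) if r <= threshold]


# Made-up rewards, only to show the call; not data from a real episode.
example_rewards = [100.0, 200.0, -990000.0, 50.0]
print(find_suspicious_steps(example_rewards))
```

If a single step accounts for almost all of the negative return, the step index together with the seed and action list makes the episode much easier to reproduce.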