openai / gym

A toolkit for developing and comparing reinforcement learning algorithms.
https://www.gymlibrary.dev

Very high negative scores in some of Atari environments #2233

Closed kazuki-irie closed 3 years ago

kazuki-irie commented 3 years ago

Hello,

In some Atari games, I observe that well-trained models sometimes achieve very large negative scores, such as -969000.

So far, I've seen this issue for BattleZoneNoFrameskip-v4 and UpNDownNoFrameskip-v4.

For example, in UpNDownNoFrameskip-v4, if I evaluate my model on five test episodes, I get the following scores: 310760.0, 26270.0, -919890.0, 364960.0, 156270.0. The -919890.0 looks buggy to me.
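One way to narrow this down is to look at per-step rewards instead of the episode total: a single step with a huge negative reward would suggest a discontinuity in the score rather than a gradual loss. Below is a minimal sketch, assuming the per-step rewards of an episode have been collected into a list. The helper name `find_reward_jumps`, the threshold, and the reward trace are all hypothetical and not taken from the actual runs.

```python
def find_reward_jumps(rewards, threshold=-100000.0):
    """Return (step_index, reward) pairs whose reward is below `threshold`."""
    return [(i, r) for i, r in enumerate(rewards) if r < threshold]


# Illustrative reward trace only; these are not values from the environment.
rewards = [100.0, 200.0, -999990.0, 400.0]
jumps = find_reward_jumps(rewards)
print(jumps)
```

If the negative episode return comes from one such step, the step index tells you where in the episode to look.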

Is such a score possible at all? Or is this some bug?

Thank you.

jkterry1 commented 3 years ago

@JesseFarebro

JesseFarebro commented 3 years ago

It would be really helpful if you could post a list of actions and the random seed that reproduce this behaviour. This is most likely an issue with score wrapping. If you can provide more info to debug this issue, feel free to post it in this thread: https://github.com/mgbellemare/Arcade-Learning-Environment/issues/262 or create a new issue at https://github.com/mgbellemare/Arcade-Learning-Environment
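To make the request above concrete, here is a rough sketch of how a seed and action list could be logged and replayed, and of how score wrapping could produce a large negative reward. Everything in it is an assumption for illustration: `WrappingScoreEnv` is a hypothetical stand-in and not the gym or ALE API, the six-digit counter width is made up, and the reward is assumed to be the difference between consecutive score readings.

```python
import json
import random


class WrappingScoreEnv:
    """Hypothetical stand-in for an emulator-backed environment.

    The score lives in a fixed-width counter (six decimal digits here,
    an assumption) and the reward is the change in that counter.
    """

    MAX_SCORE = 1_000_000  # assumed rollover point of the counter

    def __init__(self):
        self.score = 0

    def reset(self):
        self.score = 0
        return self.score

    def step(self, action):
        previous = self.score
        # Each action adds action * 1000 points; the counter wraps around.
        self.score = (self.score + action * 1000) % self.MAX_SCORE
        reward = float(self.score - previous)
        return self.score, reward


def record_episode(env, seed, num_steps):
    """Run a seeded random policy and log what is needed to replay it."""
    rng = random.Random(seed)
    env.reset()
    actions, rewards = [], []
    for _ in range(num_steps):
        action = rng.randint(1, 50)
        _, reward = env.step(action)
        actions.append(action)
        rewards.append(reward)
    return {"seed": seed, "actions": actions, "rewards": rewards}


def replay(env, actions):
    """Feed a recorded action list back in and return the rewards."""
    env.reset()
    return [env.step(a)[1] for a in actions]


log = record_episode(WrappingScoreEnv(), seed=0, num_steps=1000)
payload = json.dumps(log)  # text that could be attached to a bug report
```

In this toy setup the counter is guaranteed to roll over within 1000 steps, and the step where it does yields a reward near -1,000,000. With a real environment, seeding and the `step` return values would follow whichever gym version is installed.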