Bug in the gym_env module and trackers

michaelnny / deep_rl_zoo

A collection of Deep Reinforcement Learning algorithms implemented with PyTorch to solve Atari games and classic control tasks like CartPole, LunarLander, and MountainCar.

Apache License 2.0


Bug in the gym_env module and trackers #7

Closed michaelnny closed 1 year ago

michaelnny commented 1 year ago

The wrapper that collects 'raw_reward' in the info dict should be applied after frame skip and frame stack. The current code applies it before frame skip and frame stack, so it re-counts the same reward multiple times.
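To illustrate why the wrapper order matters, here is a minimal toy sketch. It is not the repository's actual code: `ToyEnv`, `FrameSkip`, `RecordRawReward`, and `episode_return` are hypothetical stand-ins written in plain Python, and frame stack is left out because it only changes observations. The sketch compares a tracker that sums `info['raw_reward']` under both orderings.

```python
class ToyEnv:
    """Hypothetical stand-in for an Atari env: replays a fixed per-frame reward sequence."""

    def __init__(self, rewards):
        self._rewards = list(rewards)
        self._t = 0

    def reset(self):
        self._t = 0
        return 0

    def step(self, action):
        reward = self._rewards[self._t]
        self._t += 1
        done = self._t >= len(self._rewards)
        return self._t, reward, done, {}


class Wrapper:
    """Bare-bones wrapper base class (assumption: old-style 4-tuple step API)."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)


class FrameSkip(Wrapper):
    """Repeats the action `skip` times, sums the rewards, keeps the last frame's info."""

    def __init__(self, env, skip):
        super().__init__(env)
        self.skip = skip

    def step(self, action):
        total = 0.0
        for _ in range(self.skip):
            obs, reward, done, info = self.env.step(action)
            total += reward
            if done:
                break
        return obs, total, done, info


class RecordRawReward(Wrapper):
    """Copies the reward seen at this level of the stack into info['raw_reward']."""

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        info = dict(info)
        info['raw_reward'] = reward
        return obs, reward, done, info


def episode_return(env):
    """Toy tracker: sums info['raw_reward'] over one episode."""
    env.reset()
    total, done = 0.0, False
    while not done:
        _, _, done, info = env.step(0)
        total += info['raw_reward']
    return total


rewards = [0, -1, 0, 0, -1, 0, 0, 0]  # true episode return is -2

# Recorder applied after frame skip: it sees the summed per-step reward.
after_skip = episode_return(RecordRawReward(FrameSkip(ToyEnv(rewards), skip=4)))

# Recorder applied before frame skip: the agent-level step only sees the
# info dict of the last skipped frame.
before_skip = episode_return(FrameSkip(RecordRawReward(ToyEnv(rewards)), skip=4))

print(after_skip)   # -2.0
print(before_skip)  # 0.0
```

In this toy version the misordered stack reports a wrong total by dropping rewards. The exact symptom in the real code depends on how its wrappers and trackers combine the values, which this sketch does not try to reproduce. The point is only that the tracked return matches the true return when the recorder sits outside frame skip.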

The issue can be reproduced by running any agent on the Atari Pong game. Early in training the episode return should be around -20 or -21, but with the current code the reported episode returns are seemingly random values.