Open Ashutosh-Adhikari opened 6 years ago
Hi, is it because of clipping the rewards?
I am not sure whether I understand your question correctly. Different games have different reward mechanisms. Some games have dense reward signals while others have sparse ones (for example, in Space Invaders the agent only gets a reward when it hits an enemy).
During training, the scale of the summed clipped rewards that an agent gets is much higher than what it gets for Boxing, and much lower for games like Qbert and Space Invaders. Any idea why?
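
To make the effect of clipping concrete, here is a minimal sketch. It assumes the common DQN-style scheme in which every reward is clipped to its sign (-1, 0, or +1); the thread does not say which scheme is actually used. The function names and both reward sequences are hypothetical, made up for illustration rather than taken from real game logs.

```python
def clip_reward(r):
    """Clip a raw reward to its sign: -1, 0, or +1 (assumed clipping scheme)."""
    return (r > 0) - (r < 0)


def episode_returns(raw_rewards):
    """Return (sum of raw rewards, sum of clipped rewards) for one episode."""
    raw_total = sum(raw_rewards)
    clipped_total = sum(clip_reward(r) for r in raw_rewards)
    return raw_total, clipped_total


# Hypothetical episode from a game that pays a few large rewards.
large_rewards = [0, 25, 0, 0, 25, 100, 0, 25]

# Hypothetical episode from a game that pays small rewards of magnitude 1.
small_rewards = [1, -1, 1, 1, 0, 1]

print(episode_returns(large_rewards))
print(episode_returns(small_rewards))
```

When the raw rewards are large, the clipped sum only counts how many rewarding events happened, so it ends up far below the game score. When the raw rewards already have magnitude 1, clipping changes nothing and the two sums match. That alone could explain why the clipped return sits on a very different scale from the score depending on the game.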