We have the Monitor wrapper to get the episode rewards and episode lengths, but sometimes I need to log some stats before the episode is done, so I was wondering whether there is a way to get the rewards per timestep instead.
For instance, I want to log training stats every 100 timesteps. My episodes run without a time limit, so they can last longer than 100 timesteps. This means I cannot log the episode_rewards from the Monitor, because no episode may have finished yet. If I had the rewards per timestep, or the cumulative reward so far, I could log that instead. Is there a way to do this?
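To make the request concrete, here is a minimal sketch of the behaviour I am after. It assumes the environment follows the classic Gym convention where `step` returns `(obs, reward, done, info)`. `RunningRewardWrapper`, `ConstantRewardEnv`, and `run` are hypothetical names made up for this illustration and are not part of any library.

```python
class RunningRewardWrapper:
    """Hypothetical wrapper (not a library class) that tracks the rewards
    of the episode currently in progress."""

    def __init__(self, env):
        self.env = env
        self.step_rewards = []        # per-timestep rewards of the current episode
        self.cumulative_reward = 0.0  # sum of step_rewards so far
        self.total_steps = 0          # timesteps taken across all episodes

    def reset(self):
        # A new episode starts, so clear the per-episode statistics.
        self.step_rewards = []
        self.cumulative_reward = 0.0
        return self.env.reset()

    def step(self, action):
        # Assumes the classic 4-tuple step API: (obs, reward, done, info).
        obs, reward, done, info = self.env.step(action)
        self.step_rewards.append(reward)
        self.cumulative_reward += reward
        self.total_steps += 1
        return obs, reward, done, info


class ConstantRewardEnv:
    """Toy stand-in environment: never terminates, reward is 1.0 every step."""

    def reset(self):
        return 0

    def step(self, action):
        return 0, 1.0, False, {}


def run(n_steps=250, log_every=100):
    """Step the wrapped toy env and record the running reward periodically."""
    env = RunningRewardWrapper(ConstantRewardEnv())
    env.reset()
    logged = []
    for _ in range(n_steps):
        _, _, done, _ = env.step(0)
        if env.total_steps % log_every == 0:
            # The episode has not finished, yet a running total is available.
            logged.append((env.total_steps, env.cumulative_reward))
        if done:
            env.reset()
    return logged


if __name__ == "__main__":
    print(run())
```

With something like this, the running total is readable at any timestep even though the episode never ends. What I would like to know is whether the Monitor (or something built in) already exposes this, so I do not have to maintain my own wrapper.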