hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License

[question] retrieve rewards before the episode is done. #1023

Closed. JessicaBorja closed this issue 4 years ago

JessicaBorja commented 4 years ago

We have the Monitor wrapper to get the episode rewards and episode lengths, but sometimes I need to log some "stats" before the episode is done and I was wondering if there's a way to get the rewards per timestep instead.

For instance, I want to log training stats, let's say every 100 timesteps. But I'm running episodes without a time limit, so they can last longer than 100 timesteps. This means I cannot log the episode_rewards from the Monitor, because no episode has finished yet. If I had the rewards per timestep, or the cumulative reward so far, I could log that instead. Is there a way to do this?

araffin commented 4 years ago

Hello, you should take a look at callbacks (cf. the documentation) and gym wrappers (like the Monitor).
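
To make the wrapper suggestion concrete, here is a minimal sketch of the idea, not the library's own implementation. It assumes the classic Gym step API, where `step()` returns `(obs, reward, done, info)`. The names `RunningRewardWrapper`, `ConstantRewardEnv`, `log_interval`, and `logged` are hypothetical, and the wrapper is written as a plain Python class so the example runs without any dependency; in a real project it would more likely subclass `gym.Wrapper`.

```python
class RunningRewardWrapper:
    """Hypothetical wrapper: tracks the return of the current, unfinished episode."""

    def __init__(self, env, log_interval=100):
        self.env = env
        self.log_interval = log_interval
        self.total_steps = 0        # timesteps seen across all episodes
        self.episode_return = 0.0   # cumulative reward of the ongoing episode
        self.logged = []            # (total_steps, episode_return) snapshots

    def reset(self):
        # A new episode starts, so the running return goes back to zero.
        self.episode_return = 0.0
        return self.env.reset()

    def step(self, action):
        # Assumes the classic Gym API: (obs, reward, done, info).
        obs, reward, done, info = self.env.step(action)
        self.total_steps += 1
        self.episode_return += reward
        # Record a snapshot every `log_interval` timesteps, even mid-episode.
        if self.total_steps % self.log_interval == 0:
            self.logged.append((self.total_steps, self.episode_return))
        return obs, reward, done, info


class ConstantRewardEnv:
    """Toy never-ending environment, used only for this illustration."""

    def reset(self):
        return 0

    def step(self, action):
        # Always reward 1.0 and never signal the end of the episode.
        return 0, 1.0, False, {}


env = RunningRewardWrapper(ConstantRewardEnv(), log_interval=100)
env.reset()
for _ in range(250):
    env.step(action=None)

print(env.logged)  # [(100, 100.0), (200, 200.0)]
```

The snapshots are taken although the episode never ends, which is the case the Monitor's per-episode statistics do not cover. Instead of appending to a list, the same spot in `step()` could write to whatever logger is already in use.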