Extracting additional variables from the environment?

tensorflow / agents

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

Apache License 2.0

Extracting additional variables from the environment? #581

Open JohnBurden opened 3 years ago

JohnBurden commented 3 years ago

Hi,

I have been adapting the DQN tutorial file for a custom environment. It seems to learn fine, but I have an additional metric, other than reward, that I want to extract and plot. It corresponds to "safety" in this environment. Is there a standard way of extracting this during the evaluation stage? Do I need to extend the TimeStep class, or can I do something else?

Thanks.

kbanoop commented 3 years ago

Yes, one option is to extend the TimeStep class, e.g. by adding an info field. Alternatively, you can add a property to the environment that stores the latest safety value each time you take a step.
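A rough sketch of the second option (a property on the environment), written in plain Python so it runs without TF-Agents installed. Everything here is hypothetical and for illustration only: `ToySafetyEnv`, `last_safety`, `evaluate`, and the safety rule itself are not part of TF-Agents. In a real custom environment the same idea would presumably go in your `PyEnvironment` subclass, updating the stored value inside `_step`.

```python
class ToySafetyEnv:
    """Hypothetical stand-in for a custom environment (not a TF-Agents class)."""

    def __init__(self):
        self._position = 0
        self._last_safety = None  # no step taken yet

    @property
    def last_safety(self):
        """Safety value recorded by the most recent call to step()."""
        return self._last_safety

    def reset(self):
        self._position = 0
        self._last_safety = None
        return self._position

    def step(self, action):
        self._position += action
        reward = 1.0 if self._position >= 3 else 0.0
        # Assumed safety rule: 1.0 while the agent stays within [-2, 2].
        self._last_safety = 1.0 if abs(self._position) <= 2 else 0.0
        done = abs(self._position) >= 3
        return self._position, reward, done


def evaluate(env, policy, num_episodes=5, max_steps=20):
    """Return (average episode return, average per-step safety)."""
    total_return = 0.0
    safety_values = []
    for _ in range(num_episodes):
        observation = env.reset()
        for _ in range(max_steps):
            observation, reward, done = env.step(policy(observation))
            total_return += reward
            # Read the extra metric from the environment after each step.
            safety_values.append(env.last_safety)
            if done:
                break
    avg_safety = sum(safety_values) / len(safety_values)
    return total_return / num_episodes, avg_safety


# Usage: a trivial policy that always moves right.
avg_return, avg_safety = evaluate(ToySafetyEnv(), lambda obs: 1)
print(avg_return, avg_safety)
```

If the evaluation environment is wrapped (as in the DQN tutorial, where the Python environment is wrapped in a `TFPyEnvironment`), the property would likely need to be read from the underlying Python environment object you kept a reference to, rather than from the wrapper.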