araffin / rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
https://stable-baselines.readthedocs.io/
MIT License

Retrieving Q-values of trained agents. (Question) #74

Closed: yotamitai closed this issue 4 years ago

yotamitai commented 4 years ago

Hi there! Really loving the trained models - kudos to you guys.

Is there a way for me to retrieve the Q-values of these trained agents? Meaning - I'd like to obtain the probabilities of choosing each action from a given state. Currently, when setting the stochastic parameter to True, the model.predict method (as shown in enjoy.py) only returns the predicted action.

Is there a configuration I can use in order to obtain the full action probabilities for a given state?

Thanks in advance for your help.

araffin commented 4 years ago

Hello,

Is there a configuration I can use in order to obtain the full action probabilities for a given state?

Please read the documentation of stable-baselines; there is an action_probability() method.
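
For reference, a minimal sketch of how that method can be called (the load path "dqn-CartPole-v1" is a hypothetical placeholder for whatever trained zoo agent you are using):

    import gym
    from stable_baselines import DQN

    env = gym.make("CartPole-v1")
    # Hypothetical path: substitute the agent you trained or downloaded
    model = DQN.load("dqn-CartPole-v1")

    obs = env.reset()
    # For discrete action spaces this returns one probability per action
    probs = model.action_probability(obs)
    print(probs)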

For the Q-values, see https://github.com/hill-a/stable-baselines/issues/669. For the other algorithms, please take a look at the code of the policies.
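
For DQN specifically, the linked issue boils down to something like the following sketch. It assumes stable-baselines 2.x, where the instantiated policy is exposed as model.step_model and its step() returns (actions, q_values, states); please verify against your version:

    import gym
    from stable_baselines import DQN

    env = gym.make("CartPole-v1")
    model = DQN.load("dqn-CartPole-v1")  # hypothetical saved agent

    obs = env.reset()
    # step() expects a batch of observations, hence obs[None]
    actions, q_values, _ = model.step_model.step(obs[None], deterministic=True)
    print(q_values[0])  # one Q-value per action for this observation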

yotamitai commented 4 years ago

Great, thanks a lot.

younader commented 3 years ago

Hello @araffin,

Apologies for this stupid question, but after training a DQN model I can't invoke the policy as model.policy.step(obs=env.observation_space.sample()) in order to get the Q-values for a given observation. I keep getting this error: TypeError: step() missing 1 required positional argument: 'self'.

Thank you for your time!
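
A likely cause, assuming stable-baselines 2.x internals: model.policy stores the policy class rather than an instance, so calling step() on it fails with the missing 'self' error. The instantiated policy is model.step_model, and step() also expects a batched observation, e.g.:

    import numpy as np

    obs = env.observation_space.sample()
    # model.policy is the class; model.step_model is the live policy instance
    _, q_values, _ = model.step_model.step(np.array([obs]))
    print(q_values[0])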