Kuldr closed this 5 years ago
Hey,
This seems odd. Are you using a custom Policy class?
The step function in the base policy for A2C, `ActorCriticPolicy(BasePolicy)`, does have the `deterministic` keyword argument.
EDIT: I just noticed that the documentation on custom policies was missing the `deterministic` keyword argument and code. The documentation is updated now.
Hello, as @hill-a pointed out, some of the information needed to help you is missing. Please fill in the issue template completely (especially minimal code to reproduce).
I tried to reproduce your problem, and the following code runs:
from stable_baselines import A2C
model = A2C("MlpPolicy", "Pendulum-v0")
env = model.get_env()
obs = env.reset()
model.predict(obs)
model.predict(obs, deterministic=True)
In fact, this feature is tested several times by the CI (for instance here), so it seems you are doing something custom, right?
I am using a custom policy class, which is missing the `deterministic` keyword argument, as @hill-a mentioned.
Following the updated docs, adding the argument to the step function in my custom policy solved the issue.
Thanks for your help on this.
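For anyone who lands here with the same error, here is a minimal sketch of the mismatch in plain Python, with no stable_baselines import. The classes `OldStylePolicy` and `FixedPolicy` and the standalone `predict` function are hypothetical stand-ins, not the library's code; the only detail taken from this thread is that the predict path forwards `deterministic` to `step` as a keyword and unpacks four return values.

```python
import random


class OldStylePolicy:
    """Hypothetical custom policy whose step() lacks the `deterministic` keyword."""

    def step(self, obs, state=None, mask=None):
        # Four return values, matching the unpacking quoted in this thread;
        # the second and fourth are placeholders that the caller discards.
        return [0.0], None, state, None


class FixedPolicy:
    """Same hypothetical policy with `deterministic` added to step()."""

    def step(self, obs, state=None, mask=None, deterministic=False):
        mean_action = 0.0
        # Deterministic: return the mean action. Otherwise sample around it.
        if deterministic:
            action = mean_action
        else:
            action = random.gauss(mean_action, 1.0)
        return [action], None, state, None


def predict(policy, obs, state=None, mask=None, deterministic=False):
    """Stand-in for the predict call path: forwards `deterministic` to step()."""
    actions, _, states, _ = policy.step(obs, state, mask, deterministic=deterministic)
    return actions, states
```

Calling `predict(OldStylePolicy(), [0.0])` raises a `TypeError` about an unexpected keyword argument, while `predict(FixedPolicy(), [0.0], deterministic=True)` goes through. In a real custom policy the change is the same: add `deterministic=False` to the `step` signature and use it to choose between the deterministic and the sampled action, as described in the updated docs.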
Describe the bug
Code example
This happens when calling model.predict (only tried with an A2C model).
Temporary Fix
I have managed to fix this by changing line 367 in common/base_class.py from
`actions, _, states, _ = self.step(observation, state, mask, deterministic=deterministic)`
to
`actions, _, states, _ = self.step(observation, state, mask)`
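Note that this workaround edits library code, and it means the `deterministic` value passed to predict is no longer forwarded to step. As an alternative, the sketch below checks whether a policy class's `step` method accepts the keyword, using the standard-library `inspect` module. The helper `accepts_deterministic` and the two policy classes are hypothetical names for illustration, not part of stable_baselines.

```python
import inspect


def accepts_deterministic(policy_cls):
    """Return True if policy_cls.step can be called with deterministic=..."""
    params = inspect.signature(policy_cls.step).parameters
    if "deterministic" in params:
        return True
    # A **kwargs catch-all would also accept the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class PolicyWithoutFlag:
    """Hypothetical custom policy missing the keyword."""

    def step(self, obs, state=None, mask=None):
        return [0.0], None, state, None


class PolicyWithFlag:
    """Hypothetical custom policy that declares the keyword."""

    def step(self, obs, state=None, mask=None, deterministic=False):
        return [0.0], None, state, None
```

Running `accepts_deterministic` on a custom policy class before training shows whether the signature needs updating, without touching the library source.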