Closed hokhay closed 2 years ago
I tried to use SAC and hit the same problem. I am not an expert, so I don't know if there is a workaround, but on the surface it seems that it cannot be used.
This is a useful table: https://stable-baselines3.readthedocs.io/en/master/guide/algos.html https://stable-baselines3.readthedocs.io/en/master/modules/sac.html
It seems the Stable-Baselines3 implementation of SAC only supports Box (continuous) action spaces. Unfortunately, I don't think converting the discrete action space to a continuous one would work very well, since a continuous action space outputs a weight for every action option. Hope this helps! My earlier comment that all of those models are supported out of the box was incorrect, sorry!
To build on the previous comments: in `agent_policy.py` there is `self.action_space = spaces.Discrete(...)`, whereas SAC requires `spaces.Box(...)` for the action space.
A continuous action space is meant for commands such as `steering_angle` or `gas_pedal`, where a value in [0, 1] can be mapped directly to a command to apply.
In this game the actions are discrete (up, down, left, right, build city/spawn worker, etc.). To use SAC one would have to 1) change the output space to a `Box` and 2) create a wrapper that transforms a value in [0, 1] into a discrete action. This seems like a hassle but not impossible: for example, 'do nothing' and the 4 directions could be inferred from the value, such as a value in [0, 0.2[ meaning 'go up', a value in [0.2, 0.4[ meaning 'go right', etc.
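The binning scheme above can be sketched in plain Python (the action names and function name are illustrative; in practice this logic would live in a `gym.ActionWrapper` whose `action_space` is a `spaces.Box(low=0.0, high=1.0, shape=(1,))`):

```python
# Illustrative action list; the real game has more actions (build city, spawn worker, ...)
ACTIONS = ["go up", "go right", "go down", "go left", "do nothing"]

def continuous_to_discrete(value, n_actions):
    """Map a value in [0, 1] to a bin index:
    [0, 1/n[ -> 0, [1/n, 2/n[ -> 1, and so on."""
    idx = int(value * n_actions)
    # Clamp so value == 1.0 still maps to the last action
    return min(idx, n_actions - 1)
```

With 5 actions, a value of 0.1 falls in [0, 0.2[ and selects 'go up', while 0.3 falls in [0.2, 0.4[ and selects 'go right', matching the scheme above.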
I am trying to use the SAC algorithm for training. When I implement the SAC model, I get an error and realized that it requires a "box" action space instead of a discrete action space.
I saw a comment on Kaggle saying that it is supposed to run any of A2C, DDPG, DQN, HER, PPO, SAC, or TD3 right out of the box, so am I missing something important here?
Thanks, Jason