pranz24 / pytorch-soft-actor-critic

PyTorch implementation of soft actor critic
MIT License
810 stars, 180 forks

Action scale and action bias #24

Open shakenov-chinga opened 4 years ago

shakenov-chinga commented 4 years ago

Hi guys, you did a great job here! I'm trying to adapt the algorithm to my needs, and I can't quite understand two variables in the neural network classes. What are the action_scale and action_bias variables, and why do you use them? Could you please point me to where they are described in the paper?

Thanks

gouxiangchen commented 4 years ago

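Hi, in the GaussianPolicy model, x_t sampled from the normal distribution is passed through tanh to bound the action to [-1, 1]. In practice, though, the environment's action space may not be [-1, 1]. If it is [-2, 1], for example, then action_scale = 1.5 and action_bias = -0.5, and tanh(x_t) * action_scale + action_bias rescales [-1, 1] to [-2, 1]. In general, action_scale is half the width of the action range, (high - low) / 2, and action_bias is its midpoint, (high + low) / 2.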
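
To make the arithmetic concrete, here is a minimal sketch in plain Python of the rescaling described above. The function names `scale_and_bias` and `squash_and_rescale` and the `low`/`high` parameters are made up for illustration and are not taken from the repo; only `action_scale`, `action_bias`, and `x_t` follow the names used in this thread.

```python
import math


def scale_and_bias(low, high):
    """Return (action_scale, action_bias) that map [-1, 1] onto [low, high].

    Hypothetical helper for illustration; `low` and `high` are the bounds
    of a one-dimensional action space.
    """
    action_scale = (high - low) / 2.0  # half the width of the range
    action_bias = (high + low) / 2.0   # midpoint of the range
    return action_scale, action_bias


def squash_and_rescale(x_t, action_scale, action_bias):
    """Bound an unbounded sample with tanh, then rescale it to the action range."""
    y_t = math.tanh(x_t)  # always lies in [-1, 1]
    return y_t * action_scale + action_bias


# The example from this thread: an action space of [-2, 1].
action_scale, action_bias = scale_and_bias(-2.0, 1.0)
print(action_scale, action_bias)  # 1.5 -0.5

# A sample of 0 lands on the midpoint; large samples approach the bounds.
for x_t in (-5.0, 0.0, 5.0):
    print(x_t, squash_and_rescale(x_t, action_scale, action_bias))
```

With an action space that is already [-1, 1], the same formulas give action_scale = 1 and action_bias = 0, so the rescaling step leaves the tanh output unchanged.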