Open shakenov-chinga opened 4 years ago
Hi, in the GaussianPolicy model, x_t is sampled from the normal distribution and passed through tanh, which bounds the action to [-1, 1]. In practice, though, the action space may not be [-1, 1] (it could be [-2, 1], for example). In that case action_scale = 1.5 and action_bias = -0.5 are used to rescale [-1, 1] to [-2, 1].
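To make the numbers above concrete, here is a minimal sketch in plain Python. It is not the repository's actual code: the helper names `action_rescale_params` and `squash_and_rescale` are made up for illustration, and it assumes the usual convention that the scale is half the width of the action range and the bias is its midpoint.

```python
import math


def action_rescale_params(low, high):
    """Return (action_scale, action_bias) that map [-1, 1] onto [low, high].

    Hypothetical helper for illustration; assumes scale = half-width of the
    range and bias = midpoint of the range.
    """
    action_scale = (high - low) / 2.0
    action_bias = (high + low) / 2.0
    return action_scale, action_bias


def squash_and_rescale(x_t, low, high):
    """Bound an unbounded sample x_t with tanh, then rescale to [low, high]."""
    action_scale, action_bias = action_rescale_params(low, high)
    y_t = math.tanh(x_t)  # y_t lies in [-1, 1]
    return y_t * action_scale + action_bias


# The example from this thread: action space [-2, 1]
scale, bias = action_rescale_params(-2.0, 1.0)
print(scale, bias)  # 1.5 -0.5
```

With these values, tanh outputs of -1, 0 and 1 map to -2, -0.5 and 1, so every squashed sample falls inside the environment's action range.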
Hi guys, you did a great job here! I'm trying to modify the algorithms to fit my needs, and I can't quite understand two variables in the neural network classes. What are the action_scale and action_bias variables, and why do you use them? Could you please point me to where they are referenced in the article?
Thanks