Closed AdityaGudimella closed 5 years ago
Yes. In the end I want a linear transformation that maps (-1, 1) to (LOG_STD_MIN, LOG_STD_MAX), and since that transformation is unique (once -1 is sent to LOG_STD_MIN and 1 to LOG_STD_MAX), any other form of it will turn out to be the same formula as I have.
The way I, and I suppose the original author, thought of it was to shift (-1, 1) by +1 to get (0, 2), multiply by 0.5 to get (0, 1), and then stretch that to (LOG_STD_MIN, LOG_STD_MAX). Hope that helps!
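The shift, halve, and stretch steps described above can be sketched in plain Python. This is only an illustration, not the code from the linked repository: the bound values and the function names `scale_log_std` and `scale_without_shift` are assumptions made up for this example, and `math.tanh` stands in for whatever tensor library the real policy uses.

```python
import math

# Assumed bounds for illustration only; the linked repository may use other values.
LOG_STD_MIN = -20.0
LOG_STD_MAX = 2.0


def scale_log_std(raw):
    """Squash raw with tanh, then map the result onto [LOG_STD_MIN, LOG_STD_MAX]."""
    squashed = math.tanh(raw)   # in [-1, 1]
    shifted = squashed + 1.0    # in [0, 2]
    unit = 0.5 * shifted        # in [0, 1]
    return LOG_STD_MIN + (LOG_STD_MAX - LOG_STD_MIN) * unit


def scale_without_shift(raw):
    """Hypothetical variant that drops the +1, kept only for comparison."""
    return LOG_STD_MIN + 0.5 * (LOG_STD_MAX - LOG_STD_MIN) * math.tanh(raw)


if __name__ == "__main__":
    for raw in (-100.0, 0.0, 100.0):
        print(raw, scale_log_std(raw), scale_without_shift(raw))
```

Under these assumed bounds, the variant without the +1 is still bounded, but its range is centred on LOG_STD_MIN rather than spanning LOG_STD_MIN to LOG_STD_MAX, so it can fall below LOG_STD_MIN and never gets near LOG_STD_MAX.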
When calculating the scaled log_std in the SAC policy, you scale log_std + 1 to the range [LOG_STD_MIN, LOG_STD_MAX]. Is this because the range of the tanh function is [-1, 1]? Is it really necessary? Wouldn't the scaling limit the output range to [LOG_STD_MIN, LOG_STD_MAX] even without that?
https://github.com/chutaklee/firedup/blob/ed3634525703f3169b190f6e7951d69c38a5372d/fireup/algos/sac/core.py#L92-L93