In line 64 of the code above, the squashing function (`tanh`) is applied to `pi_action`, producing an action that lives in the interval `[-1, 1]`. In line 60, `logp_pi` is set to the corresponding log probability of that action.

However, in line 65 `pi_action` is multiplied by `self.act_limit`, a scaling factor, so `pi_action` now lives in the interval `[-self.act_limit, self.act_limit]`. By the change of variables formula, the corresponding `logp_pi` should change as well. Assuming that `self.act_limit > 0`, I think the correct result would be obtained by adding, after line 60, the following:

```python
logp_pi -= np.log(self.act_limit)
```
https://github.com/openai/spinningup/blob/038665d62d569055401d91856abb287263096178/spinup/algos/pytorch/sac/core.py#L53-L67
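The change of variables formula behind this can be checked numerically on a simple case: if `Y = c * X` with `c > 0`, then `log p_Y(y) = log p_X(y / c) - log(c)`. The sketch below (plain-Python, with `c` standing in for `self.act_limit` and a Gaussian standing in for the pre-squash policy distribution; these names are illustrative, not from the repo) verifies that subtracting `log(c)` is exactly what makes the densities agree:

```python
import math

def normal_logpdf(x, mu=0.0, sigma=1.0):
    # Log density of N(mu, sigma^2) at x.
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

c = 2.0  # stand-in for self.act_limit (assumed > 0)
y = 0.7

# If X ~ N(0, 1) and Y = c * X, then Y ~ N(0, c^2),
# so log p_Y(y) can be computed directly...
lhs = normal_logpdf(y, sigma=c)

# ...or via the change of variables formula:
# log p_Y(y) = log p_X(y / c) - log(c)
rhs = normal_logpdf(y / c) - math.log(c)

assert abs(lhs - rhs) < 1e-12
```

Note that for a `d`-dimensional action with `logp_pi` summed over dimensions, the same argument would subtract `log(c)` once per dimension.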