rail-berkeley / rlkit

Collection of reinforcement learning algorithms
MIT License
2.5k stars 553 forks

The policy loss in SAC? #38

Closed yingnan-rl closed 5 years ago

yingnan-rl commented 5 years ago

Thanks for your code, it helps a lot. I wonder why you use a different policy loss in SAC. [Screenshot, 2019-03-28 19-54-25] In the original paper, the policy loss is simple. What do mean_reg_loss and the other loss terms mean? Hoping for your reply.

vitchyr commented 5 years ago

This was based on an old version of SAC. You can set the policy_pre_activation_weight and the reg_weights to zero to get the (updated) SAC loss. I'm planning on removing this in the future, and I'll mark this as closed. Feel free to follow up with other questions.
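
To illustrate the point about zeroing the weights, here is a minimal sketch of a SAC policy loss with optional regularization terms. It is not copied from rlkit. The function names, the parameter names `mean_reg_weight` and `std_reg_weight` (standing in for the "reg_weights" mentioned above), and the squared-penalty form of each regularizer are assumptions made for illustration. Plain Python lists are used so the example runs without any dependencies.

```python
from statistics import fmean


def sac_policy_loss(log_pi, q_values, alpha=1.0):
    """Updated SAC policy loss: batch mean of alpha * log_pi - Q."""
    return fmean(alpha * lp - q for lp, q in zip(log_pi, q_values))


def regularized_policy_loss(
    log_pi,
    q_values,
    policy_mean,       # per-sample list of action means
    policy_log_std,    # per-sample list of action log standard deviations
    pre_tanh,          # per-sample list of pre-tanh action values
    alpha=1.0,
    mean_reg_weight=0.0,         # hypothetical name
    std_reg_weight=0.0,          # hypothetical name
    pre_activation_weight=0.0,   # mirrors policy_pre_activation_weight
):
    """SAC policy loss plus assumed squared-penalty regularizers."""
    base = sac_policy_loss(log_pi, q_values, alpha)
    # Assumed form: mean squared value over all batch entries and dimensions.
    mean_reg = mean_reg_weight * fmean(m ** 2 for row in policy_mean for m in row)
    std_reg = std_reg_weight * fmean(s ** 2 for row in policy_log_std for s in row)
    # Assumed form: squared pre-tanh values summed per sample, then averaged.
    pre_act_reg = pre_activation_weight * fmean(
        sum(v ** 2 for v in row) for row in pre_tanh
    )
    return base + mean_reg + std_reg + pre_act_reg
```

With all three weights left at zero, `regularized_policy_loss` returns the same value as `sac_policy_loss`, which is the behavior described in the reply above.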