kengz / SLM-Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
https://slm-lab.gitbook.io/slm-lab/
MIT License

About target entropy in SAC #467

Open lidongke opened 3 years ago

lidongke commented 3 years ago

Hi @kengz, I have some questions about discrete-action SAC. I found this implementation: https://github.com/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch. It does not use Gumbel-Softmax, and it sets the discrete target entropy to a positive value, `-np.log(1.0 / action_space.size()) * 0.98`, so its `log_alpha` increases past 1.0 as the updates proceed. The continuous-action SAC in that same repo, however, uses a negative target entropy, `-np.prod(action_space.size())`.

In your code, you use Gumbel-Softmax and set the target entropy to the negative value `-np.prod(action_space.size())` for both discrete and continuous actions, so `log_alpha` decreases with the update steps. How should I set the target entropy? Why is the target entropy in @p-christ's code different from yours?

https://stackoverflow.com/questions/56226133/soft-actor-critic-with-discrete-action-space
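For context on why the sign matters, here is a minimal sketch of the two conventions and of the usual temperature update. The alpha loss below is the standard one from the SAC paper; the environment sizes, learning rate, and `log_pi` value are illustrative assumptions, not code from either repo:

```python
import numpy as np
import torch

# Discrete convention (p-christ's repo, following Christodoulou 2019):
# target 98% of the maximum entropy log|A| -- a POSITIVE number.
n_actions = 2  # illustrative, e.g. CartPole
target_entropy_discrete = -np.log(1.0 / n_actions) * 0.98  # ~0.679

# Continuous convention (Haarnoja et al. 2018): -dim(A) -- NEGATIVE.
action_dim = 1  # illustrative, e.g. Pendulum
target_entropy_continuous = -float(action_dim)  # == -np.prod(action_space.shape)

# Standard temperature update; this is what drives log_alpha up or down.
log_alpha = torch.zeros(1, requires_grad=True)
alpha_optim = torch.optim.Adam([log_alpha], lr=3e-4)

log_pi = torch.tensor([-0.1])  # log-prob of a sampled action (illustrative)
target_entropy = target_entropy_discrete

# alpha_loss = -log_alpha * (log_pi + H_target); gradient descent gives
# log_alpha <- log_alpha + lr * (log_pi + H_target), so log_alpha rises
# whenever the policy entropy (-log_pi) is below the target.
alpha_loss = -(log_alpha * (log_pi + target_entropy).detach()).mean()
alpha_optim.zero_grad()
alpha_loss.backward()
alpha_optim.step()

# With the positive discrete target, the policy entropy usually sits below
# the target, so log_alpha keeps rising; with -dim(A) it usually sits above,
# so log_alpha falls -- matching the behavior described above.
```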

@kengz