keiohta / tf2rl

TensorFlow2 Reinforcement Learning
MIT License
464 stars 104 forks

Automatic temperature parameter tuning on SAC #49

Open keiohta opened 4 years ago

keiohta commented 4 years ago

Implement automatic tuning of the entropy temperature parameter and reproduce the results from "Soft Actor-Critic Algorithms and Applications".
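For reference, here is a minimal sketch of what the temperature update could look like. It assumes the temperature objective J(alpha) = E[-alpha * (log pi(a|s) + target_entropy)] and optimizes log(alpha) so that alpha stays positive. The function names and the learning rate are hypothetical and are not part of the tf2rl API. Plain Python is used instead of TensorFlow to keep the sketch self-contained.

```python
import math


def alpha_loss_grad(log_alpha, log_pis, target_entropy):
    """Gradient of J(alpha) = mean(-alpha * (log_pi + target_entropy))
    with respect to log_alpha, where alpha = exp(log_alpha)."""
    alpha = math.exp(log_alpha)
    # Positive when the sampled entropy estimate (-mean(log_pi)) is below the target.
    mean_gap = sum(lp + target_entropy for lp in log_pis) / len(log_pis)
    return -alpha * mean_gap


def update_log_alpha(log_alpha, log_pis, target_entropy, lr=3e-4):
    """One plain gradient-descent step on log_alpha (lr is an assumed value)."""
    return log_alpha - lr * alpha_loss_grad(log_alpha, log_pis, target_entropy)
```

With this sign convention, alpha grows when the policy's entropy estimate falls below the target and shrinks when it exceeds it.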

keiohta commented 4 years ago

https://github.com/keiohta/tf2rl/commit/f102b2b9ac3caf5e925917abdab4a93eb6455771 tunes the temperature parameter on SAC and SAC-discrete, but it does not work well on SAC-discrete. We might need to search for an optimal target entropy.
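One possible way to set up that search: the entropy of a categorical policy over n actions lies between 0 and log(n), so a target outside that range cannot act as a meaningful constraint. A negative target, for example, is always exceeded and would keep pushing the temperature toward zero. The sketch below generates candidate targets as fractions of the maximum entropy. The function name and the fractions are hypothetical choices for illustration, not values taken from tf2rl.

```python
import math


def discrete_target_entropy_candidates(n_actions, fractions=(0.3, 0.5, 0.7, 0.98)):
    """Return candidate target entropies as fractions of log(n_actions),
    the maximum entropy of a categorical policy over n_actions actions."""
    max_entropy = math.log(n_actions)
    return [f * max_entropy for f in fractions]
```

Each candidate could then be used as the target entropy in a separate SAC-discrete run, and the resulting returns compared.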