
kengz / SLM-Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

https://slm-lab.gitbook.io/slm-lab/

MIT License


Soft Actor-Critic improvements #399

Closed kengz closed 5 years ago

kengz commented 5 years ago

SAC improvements

Implement the improvements from the follow-up SAC paper: https://arxiv.org/pdf/1812.05905.pdf
Extend SAC to work directly on discrete environments using a custom GumbelSoftmax distribution.
Add QMLPNet and QConvNet for Q(s,a) -> q in SAC.
Add a SAC Pong spec (not tuned yet).
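As an illustration of the Gumbel-Softmax idea mentioned in the list above, here is a minimal, framework-free sketch of drawing a relaxed one-hot sample from categorical logits. It is not SLM-Lab's GumbelSoftmax class; the function names, the example logits, and the temperature value are assumptions made for this example only.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Draw a relaxed one-hot sample from categorical logits.

    Hypothetical helper for illustration; not part of SLM-Lab.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise: g = -log(-log(u)), with u ~ Uniform(0, 1).
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def to_discrete_action(relaxed):
    """Pick the index of the largest entry as the action sent to the env."""
    return max(range(len(relaxed)), key=relaxed.__getitem__)


random.seed(0)
probs = gumbel_softmax_sample([2.0, 0.5, -1.0, 0.0], temperature=0.5)
action = to_discrete_action(probs)
```

A lower temperature pushes the sample closer to a one-hot vector, while a higher one makes it closer to uniform. In an autodiff framework the relaxed sample stays differentiable with respect to the logits, which is what lets a continuous-action algorithm such as SAC be reused on discrete action spaces.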

This improves performance over the original SAC benchmark in PR #398. Note, however, that the Polyak coefficient was set incorrectly in that PR.
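For context on the Polyak coefficient, here is a minimal sketch of a Polyak (soft) target update on plain lists of floats. The convention used here, where the coefficient is the weight kept on the old target parameters, is an assumption for this example; other codebases define it the opposite way, and the function and variable names are hypothetical.

```python
def polyak_update(target_params, online_params, polyak_coef):
    """Return new target params as a weighted average.

    new_target = polyak_coef * target + (1 - polyak_coef) * online

    Assumed convention: polyak_coef is the fraction of the OLD target kept.
    """
    if not 0.0 <= polyak_coef <= 1.0:
        raise ValueError("polyak_coef must be in [0, 1]")
    if len(target_params) != len(online_params):
        raise ValueError("parameter lists must have the same length")
    return [
        polyak_coef * t + (1.0 - polyak_coef) * o
        for t, o in zip(target_params, online_params)
    ]


# With a coefficient close to 1, the target moves only slightly per update.
target = [0.0, 1.0]
online = [1.0, 3.0]
new_target = polyak_update(target, online, polyak_coef=0.995)
```

Under this convention, a coefficient of 1.0 freezes the target network and 0.0 copies the online network outright, so swapping the two conventions by mistake changes training behaviour drastically.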

Roboschool (continuous control) Benchmark

Note that the Roboschool reward scales are different from MuJoCo's.

Environment	SAC
RoboschoolAnt	2451.55
RoboschoolHalfCheetah	2004.27
RoboschoolHopper	2090.52
RoboschoolWalker2d	1711.92

LunarLander (discrete control) Benchmark

[Figures: trial graph and moving average]