kengz / SLM-Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
https://slm-lab.gitbook.io/slm-lab/
MIT License

Soft Actor-Critic #398

Closed kengz closed 5 years ago

kengz commented 5 years ago

Feature / Fix

Roboschool (continuous control) Benchmark

Note that the Roboschool reward scales are different from MuJoCo's.

| Env. \ Alg. | A2C (GAE) | A2C (n-step) | PPO | SAC |
| --- | --- | --- | --- | --- |
| RoboschoolAnt | | | | 1153.87 |
| RoboschoolHalfCheetah | | | | 1204.68 |
| RoboschoolHopper | | | | 1161.24 |
| RoboschoolWalker2d | | | | 695.36 |

LunarLander (discrete control) Benchmark

[Figures: sac_lunar trial 0. Left: trial graph, mean returns vs. frames. Right: moving average of mean returns vs. frames.]

CarloLucibello commented 5 years ago

Hi, I just wanted to point out the follow-up paper by the authors of SAC: https://arxiv.org/abs/1812.05905. One of the main differences from the original paper is that they don't use a separate V network.
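
For readers unfamiliar with the two variants, the difference can be sketched as two ways of forming the bootstrapped Q-target. This is only a rough illustration on scalar values: the function and argument names are hypothetical and are not taken from SLM-Lab, and taking the minimum of two target Q estimates is assumed here as the usual implementation practice.

```python
# Illustrative sketch only; names are hypothetical, not SLM-Lab identifiers.

def q_target_v1(reward, done, gamma, v_target_next):
    """Original SAC: bootstrap from a separate target V network's V(s')."""
    return reward + gamma * (1.0 - done) * v_target_next


def q_target_v2(reward, done, gamma, alpha,
                q1_target_next, q2_target_next, log_prob_next):
    """Follow-up SAC: no V network. The soft value of s' is built from the
    target Q estimates and log pi(a'|s') for an action a' sampled at s'."""
    soft_v_next = min(q1_target_next, q2_target_next) - alpha * log_prob_next
    return reward + gamma * (1.0 - done) * soft_v_next


if __name__ == "__main__":
    # Same transition, two ways of bootstrapping.
    print(q_target_v1(1.0, 0.0, 0.99, 2.0))
    print(q_target_v2(1.0, 0.0, 0.99, 0.2, 3.0, 2.5, -1.0))
```

In the second form, the entropy term `alpha * log_prob_next` plays the role that the learned V network played in the first.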

kengz commented 5 years ago

@CarloLucibello Thanks for pointing that out. This PR implements the first version of SAC and adds a discrete control version. I'll implement the improved version in the next PR.