kengz / SLM-Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
https://slm-lab.gitbook.io/slm-lab/
MIT License

Asynchronous SAC #404

Closed kengz closed 5 years ago

kengz commented 5 years ago

Feature / Fix

SAC has a very low FPS (frames per second), which is expected since it trains 3 networks and syncs 2 networks at every step; the tradeoff is that it is sample-efficient. However, running SAC for millions of frames quickly becomes a problem, since it would take weeks if not months to finish. For example, running 100M frames at 10 FPS would take about 115 days.
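The runtime estimate above is a straightforward back-of-envelope calculation:

```python
# Back-of-envelope runtime estimate for serial SAC, using the numbers above.
frames = 100_000_000       # target environment frames
fps = 10                   # observed SAC frames-per-second
seconds_per_day = 86_400

days = frames / fps / seconds_per_day
print(f"{days:.1f} days")  # ~115.7 days
```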

Fortunately, in SLM Lab any algorithm can directly use Hogwild to parallelize asynchronously. There is of course a tradeoff between the number of workers (time savings) and performance. A more comprehensive study will be done later, but the benchmark below shows that asynchronous SAC works.
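The Hogwild idea — workers updating shared parameters lock-free, tolerating the occasional race — can be sketched with the standard library alone. This is a minimal illustration, not SLM Lab's implementation (which shares PyTorch model parameters across processes); the names `worker` and `run` and all numbers are hypothetical:

```python
# Minimal Hogwild-style sketch: several processes update shared, lock-free
# parameters concurrently. Illustrative only; real Hogwild training applies
# gradient updates to shared model weights instead of constant increments.
import multiprocessing as mp

def worker(shared_weights, steps):
    # Each worker reads and writes the shared parameters without locking;
    # Hogwild accepts the resulting benign races.
    for _ in range(steps):
        for i in range(len(shared_weights)):
            shared_weights[i] += 0.001  # stand-in for a gradient step

def run(num_workers=4, steps=100, dim=8):
    weights = mp.Array('d', [0.0] * dim, lock=False)  # lock-free shared memory
    procs = [mp.Process(target=worker, args=(weights, steps))
             for _ in range(num_workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return list(weights)

if __name__ == "__main__":
    print(run())
```

Because the updates are unsynchronized, a few increments may be lost to races; Hogwild's premise is that this barely affects convergence while removing all locking overhead.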

The non-humanoid environments are included only as a baseline comparison with the non-async version of SAC. The humanoid environments are the ones that would have taken weeks to run with serial SAC; parallelized, they each took only a day.

The frames in the graphs are per worker, and the graphs are averaged across workers. To get the total frames, simply multiply the x-axis by the number of sessions (workers).
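Concretely, reading total frames off a graph is just a multiplication (the values below are made-up examples, not benchmark numbers):

```python
# Converting per-worker frames (the graph x-axis) to total environment frames.
per_worker_frames = 1_000_000  # example x-axis reading
num_sessions = 16              # example number of workers (sessions)

total_frames = per_worker_frames * num_sessions
print(total_frames)  # 16000000
```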

| Env. \ Alg. | A3C (GAE) | A3C (n-step) | Async PPO | Async SAC |
| --- | --- | --- | --- | --- |
| RoboschoolAnt | | | | 2525.08 |
| RoboschoolAtlasForwardWalk | | | | 1849.50 |
| RoboschoolHalfCheetah | | | | 2278.03 |
| RoboschoolHopper | | | | 2376.96 |
| RoboschoolInvertedDoublePendulum | | | | 8030.81 |
| RoboschoolInvertedPendulum | | | | 966.41 |
| RoboschoolInvertedPendulumSwingup | | | | 847.06 |
| RoboschoolReacher | | | | 19.73 |
| RoboschoolWalker2d | | | | 1386.15 |
| RoboschoolHumanoid | | | | 2458.23 |
| RoboschoolHumanoidFlagrun | | | | 2056.06 |
| RoboschoolHumanoidFlagrunHarder | | | | 267.36 |

(Per-environment result graphs omitted.)