xbpeng / DeepMimic

Motion imitation with deep reinforcement learning.
https://xbpeng.github.io/projects/DeepMimic/index.html
MIT License
2.27k stars, 484 forks

Is DeepMimic trained using A3C or A2C? #81

Open Zju-George opened 5 years ago

Zju-George commented 5 years ago

A3C stands for Asynchronous Advantage Actor-Critic.

DeepMimic uses MPI, so I wonder whether it is trained using A3C.

xbpeng commented 5 years ago

Neither; we use PPO for training. The MPI implementation uses synchronous updates, so it is more akin to A2C than to A3C.
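
To illustrate what "synchronous updates" means here, the following is a minimal sketch and not DeepMimic's actual code. It simulates several workers in a single process rather than using MPI, and every name in it (`local_gradient`, `allreduce_mean`, `synchronous_step`) is hypothetical. It also uses a toy quadratic loss in place of the PPO objective. The assumption is only that each worker computes a gradient on its own samples, the gradients are averaged across workers, and every worker then applies the same averaged step.

```python
# Hypothetical sketch: simulates N workers in one process instead of using MPI.
# None of these names come from the DeepMimic code base, and the loss is a toy
# quadratic, not the PPO objective.

def local_gradient(theta, shard):
    """Gradient of the mean of 0.5 * (theta - x)^2 over one worker's samples."""
    return sum(theta - x for x in shard) / len(shard)

def allreduce_mean(values):
    """Stand-in for an MPI allreduce: every worker receives the same average."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

def synchronous_step(thetas, shards, lr):
    """One synchronous update: gather all gradients, average, then step together."""
    grads = [local_gradient(t, s) for t, s in zip(thetas, shards)]
    averaged = allreduce_mean(grads)
    return [t - lr * g for t, g in zip(thetas, averaged)]

shards = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # each worker's own samples
thetas = [0.0, 0.0, 0.0]                        # identical initial parameters

for _ in range(200):
    thetas = synchronous_step(thetas, shards, lr=0.1)
```

Because every worker applies the same averaged gradient, the parameter copies stay identical after each step. In an asynchronous scheme such as A3C, each worker would instead apply its own gradient to shared parameters without waiting for the others.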