eleurent / rl-agents

Implementations of Reinforcement Learning and Planning algorithms

rl-agents compatible with continuous action spaces #52

Open SHITIANYU-hue opened 4 years ago

SHITIANYU-hue commented 4 years ago

I am wondering: is the cross-entropy method the only one that is compatible with continuous action spaces?

I tried the CEM agent, but I found that it runs very slowly (the animation update is very slow). How could I increase the running speed? Thanks 😁

eleurent commented 4 years ago

Hi, yes, unfortunately CEM is the only implemented method that handles continuous actions, since my own work focuses on discrete actions. To increase the running speed, the first step would be to use parallel computing, since CEM is very easy to parallelize; that should help if you have many CPUs available. Another possibility is to use the GPU: see e.g. this tutorial, in which a forward dynamics model is trained in PyTorch. The advantage is that this model can then be used to roll out 100 trajectories in parallel, in a single GPU forward pass, which is much faster (see the CEM reimplementation at the bottom of the notebook).
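For illustration, a batched CEM planner along those lines might look like the minimal sketch below. The `dynamics` and `reward_fn` callables are hypothetical stand-ins for the learned forward model and the environment reward, not names from the repo or the notebook:

```python
import torch

def cem_plan(state, dynamics, reward_fn, action_dim,
             horizon=15, population=100, n_elites=10, iterations=5,
             device="cuda"):
    """Plan with the cross-entropy method, rolling out the whole
    population in parallel through batched model calls."""
    mean = torch.zeros(horizon, action_dim, device=device)
    std = torch.ones(horizon, action_dim, device=device)
    for _ in range(iterations):
        # Sample a population of action sequences: (population, horizon, action_dim).
        actions = mean + std * torch.randn(population, horizon, action_dim,
                                           device=device)
        # Roll out every candidate at once: each dynamics call is one batched
        # GPU forward pass instead of `population` sequential calls.
        states = state.to(device).expand(population, -1)
        returns = torch.zeros(population, device=device)
        for t in range(horizon):
            states = dynamics(states, actions[:, t])
            returns = returns + reward_fn(states, actions[:, t])
        # Refit the Gaussian sampling distribution to the elite sequences.
        elite = actions[returns.topk(n_elites).indices]
        mean, std = elite.mean(dim=0), elite.std(dim=0)
    # Execute only the first action (receding-horizon control).
    return mean[0]
```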

Another possibility is to use policy-gradient algorithms (DDPG, PPO, etc.) that handle continuous actions, from other libraries like stable baselines (example).
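A minimal sketch of that route, assuming stable-baselines v2 and a highway-env environment switched to continuous actions (the `"ContinuousAction"` config key is an assumption about highway-env's configuration, and the hyperparameters are illustrative):

```python
import gym
import numpy as np
import highway_env  # noqa: F401  (registers the highway environments)
from stable_baselines import DDPG
from stable_baselines.ddpg.policies import MlpPolicy
from stable_baselines.common.noise import OrnsteinUhlenbeckActionNoise

env = gym.make("highway-v0")
# Assumed highway-env config for a continuous action space.
env.configure({"action": {"type": "ContinuousAction"}})
env.reset()

n_actions = env.action_space.shape[-1]
action_noise = OrnsteinUhlenbeckActionNoise(mean=np.zeros(n_actions),
                                            sigma=0.1 * np.ones(n_actions))

model = DDPG(MlpPolicy, env, action_noise=action_noise, verbose=1)
model.learn(total_timesteps=100_000)
```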

Finally, I would like to try implementing tree-based planning algorithms with continuous actions (which are more sample-efficient than CEM), such as MCTS with progressive widening or the SOPC algorithm, but I won't have the time in the next few weeks.
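For context, the key ingredient of MCTS with progressive widening is a rule that only admits a new sampled action at a node while the child count stays below k·N(s)^α. A toy sketch of that rule (illustrative names, not the eventual rl-agents code):

```python
import math
import random

class PWNode:
    """Minimal MCTS node illustrating progressive widening."""
    def __init__(self):
        self.visits = 0
        self.children = {}  # action (float) -> [value_sum, visit_count]

    def select_action(self, k=1.0, alpha=0.5, c=1.0):
        # Progressive widening: admit a new sampled action only while the
        # number of children is below k * N(s)^alpha, so the discrete child
        # set grows slowly with the node's visit count.
        if len(self.children) < k * max(self.visits, 1) ** alpha:
            action = random.uniform(-1.0, 1.0)  # sample from a 1-D action space
            self.children[action] = [0.0, 0]
            return action
        # Otherwise pick among existing children with a standard UCB rule.
        def ucb(item):
            value_sum, count = item[1]
            return (value_sum / max(count, 1)
                    + c * math.sqrt(math.log(self.visits + 1) / max(count, 1)))
        return max(self.children.items(), key=ucb)[0]
```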

SHITIANYU-hue commented 4 years ago

Thanks for your reply and valuable suggestions!

SHITIANYU-hue commented 4 years ago

Hello, I tried stable baselines with the DDPG algorithm, but I found that the agent can't learn a reasonable policy: it just drives around in a circle. Here is the code (I just directly use their package):

[screenshot of the training code, not transcribed]