hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License

DDPG agent with Realtime environment #252

Closed · RmKuma closed this issue 5 years ago

RmKuma commented 5 years ago

[question] Hi, I'm trying to train a DDPG agent (stable-baselines) with the car simulator TORCS as the environment.

The observation is a 64 x 64 frame, and I create the model like this: `model = DDPG(LnCnnPolicy, env, verbose=1, action_noise=OUaction_noise)`, where `env` is a `TorcsEnv` instance.
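
For reference, a minimal sketch of that setup, assuming `TorcsEnv` is a gym-style wrapper around the TORCS simulator (the noise parameters below are illustrative placeholders, not values from the question):

```python
import numpy as np

from stable_baselines import DDPG
from stable_baselines.ddpg.policies import LnCnnPolicy
from stable_baselines.ddpg.noise import OrnsteinUhlenbeckActionNoise

env = TorcsEnv()  # hypothetical gym.Env wrapping the TORCS simulator
n_actions = env.action_space.shape[-1]

# Temporally correlated exploration noise, commonly used with DDPG
action_noise = OrnsteinUhlenbeckActionNoise(
    mean=np.zeros(n_actions), sigma=0.3 * np.ones(n_actions)
)

model = DDPG(LnCnnPolicy, env, verbose=1, action_noise=action_noise)
model.learn(total_timesteps=100000)
```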

The problem is that TORCS is a real-time simulator: the env needs an action at every frame. But after ~100 steps, the DDPG agent stops producing actions; because of this, my car goes off the track and the episode ends.

My questions are: is there any issue that prevents the DDPG agent from working properly in a REAL-TIME environment? And if I can't use DDPG in my case, is there any other solution?

araffin commented 5 years ago

Hello, from what I remember, DDPG (and other algorithms like SAC) is slow to train on images (even with a GPU). I would recommend taking a look at that approach, which first learns an autoencoder to compress the information, and also trains SAC (or DDPG) after each episode (and not every n steps).
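
A hedged sketch of the compression idea (not the exact code from that approach): train an autoencoder offline, then wrap the env so SAC sees a small latent vector instead of raw pixels. `TorcsEnv`, `pretrained_vae`, and the `encode` method are assumptions for illustration, not a stable-baselines API.

```python
import gym
import numpy as np

from stable_baselines import SAC
from stable_baselines.sac.policies import MlpPolicy


class EncodedObsWrapper(gym.ObservationWrapper):
    """Replaces raw 64x64 frames with a compact latent vector."""

    def __init__(self, env, encoder, latent_dim):
        super(EncodedObsWrapper, self).__init__(env)
        self.encoder = encoder
        self.observation_space = gym.spaces.Box(
            low=-np.inf, high=np.inf, shape=(latent_dim,), dtype=np.float32
        )

    def observation(self, frame):
        # `encoder.encode` is a hypothetical API for a pretrained VAE
        return self.encoder.encode(frame)


# pretrained_vae: an encoder trained offline on collected frames (assumed)
env = EncodedObsWrapper(TorcsEnv(), encoder=pretrained_vae, latent_dim=32)
model = SAC(MlpPolicy, env, verbose=1)
model.learn(total_timesteps=100000)
```

With the observation reduced to a low-dimensional vector, an MLP policy suffices, which trains much faster than a CNN on raw frames.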

Note: by default, nb_rollout_steps=100; that's why it freezes a bit after 100 steps: the agent is training during that pause (cf. the docs).
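
To illustrate the note, a sketch of the relevant DDPG parameters (default values shown; `env` and `action_noise` as defined in the question above):

```python
from stable_baselines import DDPG
from stable_baselines.ddpg.policies import LnCnnPolicy

# DDPG alternates nb_rollout_steps of environment interaction with
# nb_train_steps of gradient updates; the env receives no actions
# while the updates run, which looks like a freeze in real time.
model = DDPG(
    LnCnnPolicy,
    env,                    # the TorcsEnv instance from the question
    action_noise=action_noise,
    nb_rollout_steps=100,   # default: collect 100 env steps per cycle...
    nb_train_steps=50,      # default: ...then run 50 updates (env idle)
    verbose=1,
)
```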

RmKuma commented 5 years ago

Thank you for your kind and helpful reply. I solved the problem by using SAC, and I'm now studying VAEs; I think they will help improve my agent's performance.

Your reply was very helpful to me.

Thanks.