madras-simulator / MADRaS

Multi-Agent DRiving Simulator
GNU Affero General Public License v3.0

DDPG is not training #18

Closed: bardas closed this issue 5 years ago

bardas commented 5 years ago

I have run about 1000 episodes and the car still has not learned to drive. I am running the algorithm with the default parameters and have set the train indicator to 1. Do I need to change anything else? Am I missing something?

rudrasohan commented 5 years ago

Hi @bardas, can you specify which branch you are using, and which DDPG variant: pid or behavior_reflex?

bardas commented 5 years ago

@rudrasohan I am using behavior_reflex

rudrasohan commented 5 years ago

And which branch are you using: master or Version 1?

bardas commented 5 years ago

@rudrasohan Version 1

mateuszkupper commented 5 years ago

Hi, it's not working for me either. The agent does not learn anything and, oddly, the first episode is usually the best one. I have not changed anything in the code either. I am using the Version 1 branch.

Thanks

buridiaditya commented 5 years ago

You could use master, which works with OpenAI Baselines. It trains well for me; I ran DDPG for 1e7 time steps. Set the reward function to just sp * np.cos(obs['angle']) at this location: https://github.com/madras-simulator/MADRaS/blob/25571bc072b070b9bf41b37d7bdaf54fb6e6f6ac/MADRaS/utils/gym_torcs.py#L141

bardas commented 5 years ago

You mean just using progress = sp * np.cos(obs['angle']) instead of progress = sp * np.cos(obs['angle']) - np.abs(sp * np.sin(obs['angle'])) - sp * np.abs(obs['trackPos'])?

buridiaditya commented 5 years ago

Yes
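For anyone comparing the two reward variants discussed above, here is a minimal sketch of both side by side. It is not the code from gym_torcs.py: the function names progress_full and progress_simplified are made up for illustration, and sp is assumed to be the car's forward speed, with angle and track_pos standing in for obs['angle'] and obs['trackPos'].

```python
import numpy as np


def progress_full(sp, angle, track_pos):
    """Original-style reward from the thread (hypothetical helper name).

    Rewards speed along the track axis and penalises both sideways
    speed and distance from the track centre.
    """
    return (
        sp * np.cos(angle)
        - np.abs(sp * np.sin(angle))
        - sp * np.abs(track_pos)
    )


def progress_simplified(sp, angle):
    """Simplified reward suggested in the thread (hypothetical helper name).

    Keeps only the speed component along the track axis.
    """
    return sp * np.cos(angle)


# Example: car aligned with the track but halfway off-centre.
# sp is assumed to be forward speed; the values are illustrative only.
sp, angle, track_pos = 10.0, 0.0, 0.5
full = progress_full(sp, angle, track_pos)
simple = progress_simplified(sp, angle)
```

With the car aligned to the track, the simplified form is not affected by the track-position term, while the full form shrinks as the car drifts from the centre. Whether that difference explains the training behaviour reported here is not established in this thread.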

bardas commented 5 years ago

After how many episodes did it converge? Also, are you running it with the parameters in your code unchanged?

buridiaditya commented 5 years ago

I don't remember the number of episodes it took to converge. I used exactly the parameters given in the code.