LevineYang closed this issue 5 years ago
Hi @LevineYang, sorry, I've been quite busy lately. I think you've understood everything properly. My first remark is that this gif may be misleading: the DQN I trained at that time (long ago) converged to a policy with a high variance in expected return. I showed a successful episode, but the agent actually crashed in about 20% of the episodes. It was still more successful than what you describe, though. I ran the code again today and was able to reproduce your results. After changing a few hyperparameters, I obtained behaviour similar to what I used to get, so I guess this was mainly due to bad refactoring of the configuration files.
Here's an example of a successful episode I obtained with the current version: example_episode.zip
And here is the corresponding TensorBoard graph of the policy return over 2000 episodes.
Thank you, Edouard! The result can be reproduced by following your suggestion!
Hi Eleurent,
I was also trying to replicate the gif and found this issue. However, I am unable to find the "baseline.json" configuration for the DQN agent, which is mentioned in these issues. Maybe the repository has evolved. Could you tell me which configuration now corresponds to baseline.json or to the example video?
Hi @skynox03,
The repository has indeed evolved, and I think that the previous baseline.json is now dqn.json. But you will probably get better results with dueling_ddqn.json or even ego_attention.json.
Hi Edouard Leurent, this is a great project for RLers. However, I have encountered a problem: when I run "python scripts/experiments.py evaluate scripts/configs/HighwayEnv/env.json scripts/configs/HighwayEnv/agents/DQNAgent/baseline.json --train --episodes=2000", the score slowly converges to about 30, like this:
[INFO] Episode 595 score: 30.1
[INFO] Episode 596 score: 29.8
[INFO] Episode 597 score: 30.7
[INFO] Episode 598 score: 29.1
[INFO] Episode 599 score: 30.3
[INFO] Episode 600 score: 29.9
[INFO] Episode 601 score: 30.7
[INFO] Episode 602 score: 30.5
[INFO] Episode 603 score: 30.0
[INFO] Episode 604 score: 29.0
But every time the video starts recording the vehicle, it either runs into another car or fails to accelerate and overtake. So is this baseline.json the configuration used for the GIF you added to the highway-env repo, or have I misunderstood something?
I would appreciate any suggestions. Thank you!
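For anyone comparing runs, the score lines quoted above can be summarized with a few lines of Python. This is only a rough sketch: it assumes the log format is exactly "[INFO] Episode N score: X" as in the excerpt, and the names parse_scores and summarize are made up for illustration, not part of the repository. Note that a stable mean score says nothing by itself about how often the agent crashes, so it is worth looking at the spread and the minimum as well.

```python
import re
import statistics

# Log excerpt copied from the comment above.
LOG = (
    "[INFO] Episode 595 score: 30.1 [INFO] Episode 596 score: 29.8 "
    "[INFO] Episode 597 score: 30.7 [INFO] Episode 598 score: 29.1 "
    "[INFO] Episode 599 score: 30.3 [INFO] Episode 600 score: 29.9 "
    "[INFO] Episode 601 score: 30.7 [INFO] Episode 602 score: 30.5 "
    "[INFO] Episode 603 score: 30.0 [INFO] Episode 604 score: 29.0"
)

# Assumed line format: "[INFO] Episode <int> score: <float>".
PATTERN = re.compile(r"\[INFO\] Episode (\d+) score: (-?\d+(?:\.\d+)?)")


def parse_scores(log_text):
    """Return a list of (episode, score) pairs found in the log text."""
    return [(int(ep), float(score)) for ep, score in PATTERN.findall(log_text)]


def summarize(scores):
    """Compute basic statistics over the parsed episode scores."""
    values = [score for _, score in scores]
    return {
        "episodes": len(values),
        "mean": statistics.mean(values),
        # Sample standard deviation needs at least two values.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


scores = parse_scores(LOG)
summary = summarize(scores)
print(summary)
```

The same functions work on a full log file read with open(...).read(), as long as the lines follow the assumed format.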