devsisters / DQN-tensorflow

Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning
MIT License

Breakout-v0 with initial parameters gets terrible performance #33

Open hiahiahu opened 7 years ago

hiahiahu commented 7 years ago

Hi, I ran this code with the initial parameters (model='m1') on the game Breakout-v0. After about 8 days of GPU training, when the program finished, I evaluated the model and got an average reward of 22.0, which is far below your screenshot (score = 300+).

In another experiment with only the switches duel=True and double_q=True turned on, the model's average reward was 5.4, even worse than the original DQN.

Is there any trick I missed? Thanks for your reply!
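
For context, a minimal numpy sketch of what the double_q switch changes, i.e. the Double DQN target. This illustrates the general algorithm only, not this repository's exact code; the function name and shapes are assumptions for illustration:

```python
import numpy as np

def double_q_target(q_online_next, q_target_next, rewards, dones, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it. Shapes: q_*_next is (batch, n_actions),
    rewards and dones are (batch,). Hypothetical helper, not from this repo."""
    best_actions = np.argmax(q_online_next, axis=1)           # action selection
    q_eval = q_target_next[np.arange(q_target_next.shape[0]), best_actions]
    return rewards + (1.0 - dones) * gamma * q_eval           # Bellman target
```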

JUZI1 commented 4 years ago

> Hi, I ran this code with the initial parameters (model='m1') on the game Breakout-v0. After about 8 days of GPU training, when the program finished, I evaluated the model and got an average reward of 22.0, which is far below your screenshot (score = 300+).
>
> In another experiment with only the switches duel=True and double_q=True turned on, the model's average reward was 5.4, even worse than the original DQN.
>
> Is there any trick I missed? Thanks for your reply!

I also want to know this.

JUZI1 commented 4 years ago

I wrote a DQN myself, but I also got a bad score. I suspect it's a problem with the gym environment.
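
One way to check this suspicion is to compare the settings of the two env variants. A minimal sketch, assuming gym ~0.17 where the frame-skip setting is exposed on the unwrapped Atari env:

```python
import gym

for name in ['Breakout-v0', 'BreakoutNoFrameskip-v0']:
    env = gym.make(name)
    # frameskip is (2, 5) for Breakout-v0 (each action is repeated a random
    # 2-4 frames) and 1 for the NoFrameskip variant; all *-v0 Atari envs are
    # also registered with repeat_action_probability=0.25 (sticky actions).
    print(name, '-> frameskip:', env.unwrapped.frameskip)
```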

JUZI1 commented 4 years ago

You can use 'BreakoutNoFrameskip-v0' instead of 'Breakout-v0', because the environment in gym 0.17 is different.
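
A minimal sketch of switching to the NoFrameskip variant and doing the frame skip by hand, as in the DQN paper setup (the skip_step helper and the skip count of 4 are assumptions for illustration, not this repo's code):

```python
import gym

env = gym.make('BreakoutNoFrameskip-v0')  # frameskip=1: we control the skipping

def skip_step(env, action, skip=4):
    """Repeat `action` for `skip` frames and sum the rewards, mimicking the
    deterministic 4-frame skip from the DQN paper. Hypothetical helper."""
    total_reward, done, info, obs = 0.0, False, {}, None
    for _ in range(skip):
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return obs, total_reward, done, info

obs = env.reset()
obs, reward, done, info = skip_step(env, env.action_space.sample())
```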