Can't reproduce. Are the reward and penalty rules right?

yenchenlin / DeepLearningFlappyBird

Flappy Bird hack using Deep Reinforcement Learning (Deep Q-learning).

MIT License

Closed by wenlisong 5 years ago

wenlisong commented 5 years ago

If I don't change the penalty value, I can't reproduce the results even after about 6 million steps. Is the reward rule right?