coreylynch / async-rl

Tensorflow + Keras + OpenAI Gym implementation of 1-step Q Learning from "Asynchronous Methods for Deep Reinforcement Learning"
MIT License

Do results differ only because of the seed? #9

Open danijar opened 8 years ago

danijar commented 8 years ago

You write that one should run experiments with multiple seeds. Did you find that results differ substantially when only the seed changes?

I'm asking because in the paper, Mnih et al. take the best 5 out of 50 runs with different learning rates. However, it's not clear to me from the paper whether the methods are sensitive to the choice of learning rate or unstable in general.
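
To make the distinction concrete, here is a rough sketch of how one could separate the two effects, assuming final scores are logged per run. The names `summarize_runs` and `top_k_mean` and the `(learning_rate, seed) -> score` layout are my own assumptions for illustration; they do not come from this repo or from the paper.

```python
import statistics
from collections import defaultdict


def summarize_runs(scores):
    """Split run-to-run variation into a seed part and a learning-rate part.

    `scores` is a hypothetical mapping (learning_rate, seed) -> final score.
    """
    by_lr = defaultdict(list)
    for (lr, _seed), score in scores.items():
        by_lr[lr].append(score)

    # Spread across seeds at a fixed learning rate: instability in general.
    seed_std = {lr: statistics.pstdev(vals) for lr, vals in by_lr.items()}

    # Spread of the per-learning-rate means: sensitivity to the learning rate.
    lr_means = {lr: statistics.mean(vals) for lr, vals in by_lr.items()}
    lr_std = statistics.pstdev(list(lr_means.values()))

    return seed_std, lr_means, lr_std


def top_k_mean(scores, k=5):
    """Mean of the k best runs, in the spirit of a best-5-of-50 selection."""
    best = sorted(scores.values(), reverse=True)[:k]
    return statistics.mean(best)
```

If `seed_std` is large compared to `lr_std`, the seed alone explains most of the variation. If it is small, the learning rate matters more. Comparing `top_k_mean` with the mean over all runs also shows how much a best-of-k selection flatters the reported number.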