Middlefun opened this issue 5 years ago
double_dqn and seq_eps were both False when I got scores of around 20000. The performance becomes worse with either of them set to True. The model 1545678840.187 in the repo took me about four or five hours to train (18700 episodes); it depends on luck to some extent, but with no modification it should easily reach at least 13000 in testing. I've never tried training with a random initial state with the code in the repo, but I think that should work almost the same. Updating TensorFlow improved performance in my tests; I'm using TensorFlow 1.12 from pip. I'm working on a C++ remake of this, by the way. I'll push the code later.
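For reference, here is a minimal sketch of what a `double_dqn` flag typically toggles in the TD-target computation. The function and argument names below are illustrative, not taken from the repo:

```python
import numpy as np

def td_targets(rewards, next_q_online, next_q_target,
               gamma=0.99, double_dqn=False, done=None):
    """Compute TD targets for a batch.

    next_q_online / next_q_target: arrays of shape (batch, n_actions)
    holding Q-values for the next states from the online and target nets.
    """
    if double_dqn:
        # Double DQN: the online net picks the action, the target net evaluates it.
        best_actions = np.argmax(next_q_online, axis=1)
        next_values = next_q_target[np.arange(len(best_actions)), best_actions]
    else:
        # Vanilla DQN: the target net both picks and evaluates the action.
        next_values = np.max(next_q_target, axis=1)
    if done is not None:
        next_values = next_values * (1.0 - done)  # no bootstrap on terminal states
    return rewards + gamma * next_values
```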
Thank you very much for your quick answer. But one more question: besides the random initial state, I changed the addNumber rate to 0.5 instead of 0. Could that be the reason for the low scores I get? Will the game be more difficult with 0.5?
Oh, sorry. The rate of addNumber is 0.1 initially.
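For context, a minimal sketch of what the addNumber rate presumably controls, assuming the environment spawns a 4 with probability `rate` and a 2 otherwise (as in standard 2048, where the default is 0.1). The helper name and exact mechanics are my guess, not from the repo:

```python
import random

def add_number(board, rate=0.1):
    """Drop one new tile into a random empty cell.

    Assumption: `rate` is the probability that the new tile is a 4
    rather than a 2. With rate = 0.5, half of the spawned tiles are 4s,
    which the thread reports makes the game noticeably harder to score in.
    """
    empty = [(r, c) for r, row in enumerate(board)
             for c, v in enumerate(row) if v == 0]
    if not empty:
        return board  # no room to place a tile
    r, c = random.choice(empty)
    board[r][c] = 4 if random.random() < rate else 2
    return board
```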
I just tried that and it's really a lot harder: ~4000 on average after more than 170000 episodes.
Thank you very much for your kind help.
Oops, a typo: it's 17000 episodes, not 170000.
okay
I edited your code to play in a new environment which has a random initial state instead of a given one. What's more, the addNumber rate is different. When I began to train, I found the model got about 2100 scores in about half an hour. After that, I trained for 4 hours; however, the score is still about 2100. Is that OK? Or has it already overfit? And I have some more questions: