To experiment with the test, change x in "for episode in range(x)", and adjust
replace_target_iter=2000,
memory_size=100000,
in the DEEPQNetwork initialization.
I would like to better understand how replace_target_iter affects the outcome. So far I have found a win rate of ~40%, which is right around what you would expect if you were playing the "correct" way. I am going to let this run overnight and see if my win rate improves.
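
For intuition about replace_target_iter, here is a minimal sketch. It assumes the common DQN pattern in which the target network is overwritten with a copy of the evaluation network once every replace_target_iter learning steps. The class and attribute names are hypothetical stand-ins, not this project's DEEPQNetwork, and the "weights" are a single number so the lag is easy to see.

```python
class TargetSyncSketch:
    """Toy stand-in for a DQN learner (hypothetical, not the project's class)."""

    def __init__(self, replace_target_iter):
        self.replace_target_iter = replace_target_iter
        self.learn_step_counter = 0
        self.eval_weights = [0.0]    # updated on every learning step
        self.target_weights = [0.0]  # frozen between syncs
        self.sync_count = 0

    def learn(self):
        # Assumed behavior: copy eval -> target every replace_target_iter steps.
        if self.learn_step_counter % self.replace_target_iter == 0:
            self.target_weights = list(self.eval_weights)
            self.sync_count += 1
        # Stand-in for one gradient update of the evaluation network.
        self.eval_weights[0] += 1.0
        self.learn_step_counter += 1


agent = TargetSyncSketch(replace_target_iter=2000)
for _ in range(5000):
    agent.learn()

# How far the frozen target network lags behind the evaluation network.
lag = agent.eval_weights[0] - agent.target_weights[0]
print(agent.sync_count, lag)
```

Under this assumption, a larger replace_target_iter keeps the learning targets fixed for longer, which tends to be more stable but slower to reflect new experience, while a smaller value tracks the evaluation network more closely at the cost of noisier targets.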