TyJK opened this issue 6 years ago
It is probably because, as epsilon anneals, the epsilon-greedy policy becomes more likely to act greedily, and greedy actions require a forward pass through the network, so inference takes more time. I have also noticed that different games behave very differently: some games' speeds do not degrade, while others' do.
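To illustrate the point above: a random action is essentially free, while a greedy action needs a forward pass, so as epsilon anneals toward its floor, a growing fraction of steps pay the inference cost. A minimal sketch of annealed epsilon-greedy action selection (function and parameter names are hypothetical, not from this repo):

```python
import random

def linear_epsilon(step, start=1.0, end=0.1, anneal_steps=1_000_000):
    """Linearly anneal epsilon from `start` to `end` over `anneal_steps`."""
    frac = min(step / anneal_steps, 1.0)
    return start + frac * (end - start)

def epsilon_greedy_action(q_function, state, n_actions, step):
    """Random actions are cheap; greedy actions need a forward pass."""
    if random.random() < linear_epsilon(step):
        return random.randrange(n_actions)   # exploration: no inference
    q_values = q_function(state)             # exploitation: forward pass (the slow part)
    return max(range(n_actions), key=lambda a: q_values[a])
```

Early in training (epsilon near 1.0) almost every step takes the cheap branch; late in training (epsilon near 0.1) almost every step runs the network, which is consistent with the throughput dropping over time.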
Thanks for the quick response. This isn't an issue, so I apologize, but I was wondering how you gathered the average score per episode, since the only outputs are the three CSV files. Is it just an average of the episode reward over the number of episodes?
I think I ran a few random seeds and computed an average.
Alright, thanks for the confirmation 👍
Sorry (I'm doing this for a reproducibility project based partially on your code), but I just wanted to confirm that the --save-pkl parameter saves the neural network's weights. And finally, how would I get the raw scores for the 30 no-op evaluation method once the model is trained?
Thank you once again. I really appreciate it.
I remember I wrote separate code specifically for the 30 no-op test. This code base is quite old, so I do not remember the details. My newer implementation used TensorFlow to save and load the model.
Alright, thank you very much :)
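For context, the 30 no-op evaluation from the DQN papers starts each evaluation episode with a random number of up to 30 no-op actions before the agent takes over, so the agent is scored from varied start states. A minimal sketch under a classic Gym-style API (`reset`/`step` returning `(obs, reward, done, info)`); this is not the author's evaluation code, and the action/API names are assumptions:

```python
import random

NOOP_ACTION = 0  # in ALE, action 0 is conventionally the no-op

def evaluate_episode(env, policy, max_noops=30):
    """Run one evaluation episode with a random no-op start.

    `env` is assumed to follow the classic Gym API; `policy` maps an
    observation to an action.
    """
    obs = env.reset()
    for _ in range(random.randint(1, max_noops)):  # random no-op start
        obs, _, done, _ = env.step(NOOP_ACTION)
        if done:
            obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total_reward += reward
    return total_reward
```

Running this over, say, 30 episodes and collecting the returns would give the raw scores asked about above.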
Apologies if there is an obvious answer, but from the readme I gathered that when running properly, the steps per second should remain constant throughout training. Running on a GTX 970, I started out with ~90 steps per second and 25% GPU utilization. After leaving it to run overnight, I found it had only run for 6 epochs and had slowed to about 46 steps per second, with about 15% GPU utilization. Everything else runs fine; it takes several hours for the issue to appear, and restarting brings the rate back to normal. Is there a known cause/solution for this?
Thank you
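To track the kind of degradation described above, it helps to log steps per second over fixed intervals rather than eyeballing it. A small sketch of such instrumentation (the training-loop hook `step_fn` is hypothetical):

```python
import time

def run_with_throughput_logging(step_fn, total_steps, log_every=10_000):
    """Call `step_fn` repeatedly and record steps/sec for each interval."""
    rates = []
    last_time = time.perf_counter()
    for step in range(1, total_steps + 1):
        step_fn(step)
        if step % log_every == 0:
            now = time.perf_counter()
            rates.append(log_every / (now - last_time))  # steps per second
            last_time = now
    return rates
```

Plotting the returned rates against the training step makes it easy to see whether the slowdown is gradual (consistent with the epsilon-annealing explanation earlier in the thread) or abrupt (which would point at something else, e.g. the replay buffer filling up).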