ShibiHe / Q-Optimality-Tightening

This is my implementation of the Optimality Tightening
MIT License

Fewer steps per second as training progresses #3

Open TyJK opened 6 years ago

TyJK commented 6 years ago

Apologies if there is an obvious answer, but from the readme I gathered that when running properly, the steps per second should remain constant throughout training. Running on a GTX 970, I started out with ~90 steps per second and 25% GPU utilization. After leaving it to run overnight, I found it had only completed 6 epochs and had slowed to about 46 steps per second, with about 15% GPU utilization. Everything else runs perfectly. It takes several hours for the issue to appear, and restarting brings it back up to a normal rate. Is there a known cause/solution for this?

Thank you

ShibiHe commented 6 years ago

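It is probably because, as epsilon decays, the epsilon-greedy policy picks random actions less often and greedy actions more often. Each greedy action needs a forward pass through the network, so inference takes more time. I have also noticed that different games behave very differently. Some games' speeds do not degrade and others do.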
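A minimal sketch of that effect, not taken from this repository: the annealing schedule (1.0 down to 0.1 over one million steps), the function names, and the dummy Q-values are all assumptions chosen for illustration. It only shows that the fraction of steps requiring a network call grows as epsilon shrinks.

```python
import random


def linear_epsilon(step, start=1.0, end=0.1, decay_steps=1_000_000):
    """Linearly anneal epsilon from `start` to `end` (assumed schedule)."""
    fraction = min(step / decay_steps, 1.0)
    return start + fraction * (end - start)


def select_action(q_values_fn, num_actions, epsilon, rng):
    """Return (action, used_network).

    The network (here `q_values_fn`) is only evaluated on greedy steps,
    so random steps are cheap and greedy steps are expensive.
    """
    if rng.random() < epsilon:
        return rng.randrange(num_actions), False
    q_values = q_values_fn()  # stands in for the costly forward pass
    best = max(range(num_actions), key=lambda a: q_values[a])
    return best, True


def greedy_fraction(epsilon, steps=10_000, seed=0):
    """Fraction of steps that needed a (dummy) forward pass."""
    rng = random.Random(seed)
    dummy_q = lambda: [0.0, 1.0, 0.5, 0.2]  # hypothetical Q-values
    used = sum(select_action(dummy_q, 4, epsilon, rng)[1] for _ in range(steps))
    return used / steps
```

Comparing `greedy_fraction(linear_epsilon(0))` with `greedy_fraction(linear_epsilon(1_000_000))` gives a rough idea of how much more often the network is called late in training. This would not explain why only some games slow down.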

TyJK commented 6 years ago

Thanks for the quick response. This isn't really an issue, so I apologize, but I was wondering how you computed the average score per episode, since the only outputs are the 3 CSV files. Is it just the total episode reward divided by the number of episodes?

ShibiHe commented 6 years ago

I think I ran a few random seeds and computed the average.
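A minimal sketch of that kind of averaging, assuming the per-episode scores for each seed have already been read out of the CSV files into plain lists. The function name and the input layout are hypothetical, and the exact procedure used for the published numbers is not confirmed here.

```python
from statistics import mean


def average_score_per_episode(scores_by_seed):
    """Average episode scores within each seed, then across seeds.

    `scores_by_seed` maps a seed (any hashable label) to a list of
    per-episode scores from that run. This layout is an assumption.
    """
    per_seed = {seed: mean(scores) for seed, scores in scores_by_seed.items()}
    overall = mean(per_seed.values())
    return overall, per_seed
```

Averaging within each seed first gives every run equal weight, even if the runs finished different numbers of episodes.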

TyJK commented 6 years ago

Alright, thanks for the confirmation 👍

TyJK commented 6 years ago

Sorry (I'm doing this for a reproducibility project based partially on your code), but I just wanted to confirm that the --save-pkl parameter allows you to save the weights of the NN. And finally, how would I get the raw scores for the 30 no-op evaluation method once it's trained?

Thank you once again. I really appreciate it.

ShibiHe commented 6 years ago

I remember writing separate code specifically for the 30 no-op test. This code base is quite old, so I do not remember the details. My newer implementation used TensorFlow to save and load the model.
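For reference, a rough sketch of how a no-op start evaluation is commonly structured. This is not the separate code mentioned above. The environment interface (`reset()` and `step(action)` returning `(obs, reward, done)`), the `ToyEnv` class, the choice of 1 to 30 no-ops, and action index 0 as the no-op are all assumptions for illustration.

```python
import random

NOOP_ACTION = 0  # assumption: index 0 is the no-op action


class ToyEnv:
    """Hypothetical stand-in environment for trying out the loop.

    The episode ends after `length` steps; every non-no-op step pays 1.
    """

    def __init__(self, length=50):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = 0.0 if action == NOOP_ACTION else 1.0
        return self.t, reward, self.t >= self.length


def evaluate_with_noop_starts(env, policy, episodes, max_noops=30, seed=0):
    """Return the raw score of each evaluation episode.

    Each episode begins with a random number of no-op actions
    (between 1 and `max_noops`) before the policy takes over.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(episodes):
        obs = env.reset()
        done = False
        total = 0.0
        for _ in range(rng.randint(1, max_noops)):
            obs, reward, done = env.step(NOOP_ACTION)
            total += reward
            if done:
                break
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
        scores.append(total)
    return scores
```

With a trained agent, `policy` would wrap the loaded network's greedy (or low-epsilon) action choice, and the returned list holds the raw per-episode scores.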

TyJK commented 6 years ago

Alright, thank you very much :)