For games where evaluation takes a long time (particularly those that simply run until they time out), a lot of time is wasted waiting for the evaluation to finish before the agent can resume training. It would be better to have a queue that takes a copy of the model at each evaluation point and then evaluates that copy in parallel with training. Care needs to be taken that the queue doesn't fill up with model copies (causing memory to run out) and that many models are not evaluated at once (it is probably best to limit this to one evaluation at a time).
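
One way the queue described above might look is sketched below. This is only an illustration under assumptions, not an existing implementation: `AsyncEvaluator`, `evaluate_fn`, `max_pending` and `submit` are hypothetical names, the model is assumed to be copyable with `copy.deepcopy`, and a single background thread is used to keep to one evaluation at a time. If evaluation is CPU-bound Python code, a separate process would probably be needed instead of a thread to get real parallelism.

```python
import copy
import queue
import threading


class AsyncEvaluator:
    """Evaluates model snapshots one at a time on a background thread.

    Hypothetical sketch: `evaluate_fn(model)` stands in for whatever
    evaluation routine the project already has.
    """

    def __init__(self, evaluate_fn, max_pending=2):
        self._evaluate_fn = evaluate_fn
        # Bounded queue: caps how many model copies are held in memory.
        self._pending = queue.Queue(maxsize=max_pending)
        self.results = []  # (step, score) pairs, appended by the worker
        # A single worker thread means at most one evaluation runs at a time.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, model, step):
        """Queue a snapshot of `model`; return False if the queue is full."""
        if self._pending.full():
            return False  # skip this evaluation rather than copy the model
        snapshot = copy.deepcopy(model)  # training may keep mutating `model`
        try:
            self._pending.put_nowait((step, snapshot))
        except queue.Full:
            return False
        return True

    def _run(self):
        while True:
            item = self._pending.get()
            if item is None:  # sentinel sent by close()
                break
            step, snapshot = item
            try:
                score = self._evaluate_fn(snapshot)
            except Exception as exc:  # keep the worker alive on failure
                score = exc
            self.results.append((step, score))

    def close(self):
        """Finish the queued evaluations, then stop the worker."""
        self._pending.put(None)
        self._worker.join()
```

In the training loop, the blocking evaluation call would be replaced by something like `evaluator.submit(model, step)`, with `evaluator.close()` called once training ends. Dropping an evaluation when the queue is full is a design choice made here to keep training from stalling; blocking until a slot frees up would be the alternative if every evaluation point must be scored.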