Open kevinu3d opened 2 years ago
Good idea.
This here is probably the closest existing code to it: https://github.com/glmcdona/LuxPythonEnvGym/blob/c5d4c161b8aa0c4d5ca7ef08ab1d597be81a7883/luxai2021/env/lux_env.py#L14
And example usage: https://github.com/glmcdona/LuxPythonEnvGym/blob/c5d4c161b8aa0c4d5ca7ef08ab1d597be81a7883/examples/train.py#L114
You can also look at the built-in eval callback: https://github.com/glmcdona/LuxPythonEnvGym/blob/c5d4c161b8aa0c4d5ca7ef08ab1d597be81a7883/examples/train.py#L136
One last thing to look at is this old example code that logged custom game information to TensorBoard. It was a bit of a hack job because the agent had been reset before TensorBoard grabbed the data: PR #86.
When training with different reward functions it's hard to compare two bots. A callback capable of running games between the current agent and another one would be useful for measuring progress. I will look into it, but if someone knows how to do that, help is welcome.