glmcdona / LuxPythonEnvGym

Matching python environment code for Lux AI 2021 Kaggle competition, and a gym interface for RL models.
MIT License
73 stars, 38 forks

Add game replay saving, add checkpoint callback to save some replays along with the models #99

Closed glmcdona closed 2 years ago

glmcdona commented 2 years ago

By default, this saves 5 replay matches (the model playing against itself) every 100K steps during training, alongside the model. Replays are saved to .\models\.
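As a rough illustration of the schedule described above, here is a minimal sketch of the "N replays every K steps" logic. The class name, parameter names, and filename pattern are hypothetical and are not taken from this PR; only the defaults (5 replays, every 100K steps, saved under the models folder) come from the description.

```python
import os


class ReplayCheckpointSchedule:
    """Hypothetical helper: decides when replays are due and how to name them."""

    def __init__(self, save_freq=100_000, replay_num_episodes=5,
                 save_path="models", name_prefix="rl_model"):
        self.save_freq = save_freq                      # steps between checkpoints
        self.replay_num_episodes = replay_num_episodes  # replays per checkpoint
        self.save_path = save_path                      # same folder as the model
        self.name_prefix = name_prefix

    def replay_paths(self, num_timesteps):
        """Return replay file paths to write at this step, or [] if none are due."""
        if num_timesteps <= 0 or num_timesteps % self.save_freq != 0:
            return []
        return [
            os.path.join(
                self.save_path,
                f"{self.name_prefix}_step{num_timesteps}_replay{i}.json",
            )
            for i in range(self.replay_num_episodes)
        ]


schedule = ReplayCheckpointSchedule()
for step in (50_000, 100_000):
    print(step, len(schedule.replay_paths(step)))
```

A training callback could call `replay_paths` on every step and, whenever the list is non-empty, play one self-play match per path and write the replay there.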

Adds:

glmcdona commented 2 years ago

Note: I've done more testing and this isn't ready to merge yet. Examining more replays turned up some bugs, and I'm working on a fix now.

nosound2 commented 2 years ago

Great idea with the replays. It would be nice to have the seed in the replay filename. Will it be possible to reproduce a replay with a manual run, similar to how it is reproduced in the tests?

glmcdona commented 2 years ago

@nosound2, the replays are now fixed to specific seeds. I also made several other fixes to the replays in general.

Found a mismatch between the Python game engine and the real engine, and fixed it. The local engine allowed cities to build units beyond the unit cap if the builds happened on the same turn. E.g., if you have 1 unit and 2 cities, both cities could build a unit on the same turn, resulting in 3 units and 2 cities. This is fixed now. It wouldn't have repro'd in the previous replay validation, since those replays didn't try to exploit it. This is the only bug I've turned up from manually examining a bunch of produced replays.
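To make the unit-cap mismatch concrete, here is a minimal sketch of the two behaviours. It assumes the cap equals the number of cities, as the example above implies; the function names are hypothetical and this is not the engine's actual code.

```python
def approve_builds_buggy(unit_count, city_count, build_requests):
    """Check every request against the unit count from the start of the turn."""
    # Bug: builds approved earlier in the same turn are not counted.
    return [unit_count < city_count for _ in range(build_requests)]


def approve_builds_fixed(unit_count, city_count, build_requests):
    """Count builds already approved this turn toward the cap."""
    approved = []
    pending = 0  # units approved so far this turn
    for _ in range(build_requests):
        ok = unit_count + pending < city_count
        approved.append(ok)
        if ok:
            pending += 1
    return approved


# 1 unit, 2 cities, both cities request a build on the same turn.
buggy = approve_builds_buggy(1, 2, 2)
fixed = approve_builds_fixed(1, 2, 2)
print(1 + sum(buggy))  # 3 units: exceeds the cap of 2
print(1 + sum(fixed))  # 2 units: stays at the cap
```

The only difference is the running `pending` counter, which is why replays that never issue simultaneous builds near the cap would not expose the bug.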

I tested the Kaggle notebook, and it still runs as expected when pulling from this branch, so everything looks to be backwards compatible.

nosound2 commented 2 years ago

Wow, nice find with the bug.