Update example training agent and script

glmcdona / LuxPythonEnvGym

Matching python environment code for Lux AI 2021 Kaggle competition, and a gym interface for RL models.

MIT License

73 stars 38 forks

Update example training agent and script #86

Closed glmcdona closed 2 years ago

glmcdona commented 2 years ago

Updates:

Split unit and city actions.
Update the reward function to a better example that returns a delta of the reward, scaled reasonably. Fixes issue #83.
Set the default example agent to run inference in non-deterministic mode. Agents often get stuck when inference is set to deterministic.
Fix a bug in unit maps where the nearest unit was not tracked correctly.
Add a command-line argument for multi-environment training.
Add multi-environment evaluation metrics logging.
Add single-environment TensorBoard logging of internal game metrics.
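
As a rough illustration of the reward change listed above, here is a minimal sketch of a delta-style reward. It is not the code from this pull request: it assumes that "delta" means the change in a running game score between consecutive steps, and `DeltaReward`, `scale`, and the sample scores are hypothetical names and values chosen for the example.

```python
class DeltaReward:
    """Turns a running score into a per-step reward (hypothetical helper)."""

    def __init__(self, scale=0.01):
        # Assumed scaling factor; the real value would depend on the score range.
        self.scale = scale
        self.last_score = 0.0

    def reset(self):
        # Call at the start of every episode so deltas do not leak across games.
        self.last_score = 0.0

    def step(self, score):
        # Reward is the scaled change in score since the previous step.
        reward = (score - self.last_score) * self.scale
        self.last_score = score
        return reward


if __name__ == "__main__":
    tracker = DeltaReward(scale=0.01)
    tracker.reset()
    for score in [0.0, 10.0, 10.0, 25.0]:  # made-up running scores
        print(score, tracker.step(score))
```

Because the per-step rewards telescope, their sum over an episode equals the scaled final score, so the agent is still optimizing the same total while getting feedback at every step.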

royerk commented 2 years ago

Please let me know if you would like me (us?) to also run this update. Looks great, looking forward to feedback on multi-env training :heart:

nosound2 commented 2 years ago

Hi @glmcdona, just a small remark from me. Many of the changes in this pull request seem to me to go too deep into implementation details, the kind of thing everyone should decide for themselves. I would have kept the repo cleaner than that. However, I am not familiar with the repository's philosophy, so this is just a first thought.

[UPD] the new tensorboard logging stats are cool

glmcdona commented 2 years ago

> Hi @glmcdona, just a small remark from me. Many of the changes in this pull request seem to me to go too deep into implementation details, the kind of thing everyone should decide for themselves. I would have kept the repo cleaner than that. However, I am not familiar with the repository's philosophy, so this is just a first thought.
>
> [UPD] the new tensorboard logging stats are cool

Totally agree. I think keeping the example clean and simple makes sense, leaving more advanced implementation like this for something like a separate public notebook that uses the framework. TODO: Refactor the example agent and training script to be more minimal.