I think the bottleneck is still largely on the NN side, so one thing worth trying is reducing the network size.
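As a rough sketch of what "reduce the NN size" could mean, here is an illustrative slimmed-down encoder, assuming an Atari-style 84x84, 4-frame-stack input and a PyTorch agent. The channel counts and hidden size are hypothetical (roughly half of the usual Nature-CNN's 32/64/64 filters and 512-unit hidden layer) and would need tuning against the reward curve:

```python
import torch.nn as nn

# Hypothetical smaller encoder; sizes are illustrative, not tuned.
small_network = nn.Sequential(
    nn.Conv2d(4, 16, kernel_size=8, stride=4),   # was 32 filters
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=4, stride=2),  # was 64 filters
    nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, stride=1),  # was 64 filters
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 256),                  # was 512 hidden units
    nn.ReLU(),
)
```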
Alternatively, I noticed that learning rate annealing really seems to help the algorithm converge toward the end of training, so we could also try a smaller learning rate with annealing turned off.
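A minimal sketch of that comparison, assuming a training loop with a per-update linear anneal (the value 2.5e-4 is a common PPO default; 1e-4 is just an illustrative smaller constant):

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(4, 2)  # stand-in for the agent network
base_lr = 1e-4           # illustrative smaller fixed LR (default is often 2.5e-4)
anneal_lr = False        # turn the linear anneal off
num_updates = 1000

optimizer = optim.Adam(model.parameters(), lr=base_lr)

for update in range(1, num_updates + 1):
    if anneal_lr:
        # linear decay from base_lr down to ~0 over training
        frac = 1.0 - (update - 1.0) / num_updates
        optimizer.param_groups[0]["lr"] = frac * base_lr
    else:
        optimizer.param_groups[0]["lr"] = base_lr
    # ... rollout collection and policy update would go here ...
```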
Maybe we could also tune the discount factor (and we should visualize the discounted returns, which are what the agent actually optimizes for).
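For the visualization part, something like the following would compute the discounted return per episode; `episode_rewards` is hypothetical data, and logging (TensorBoard, W&B, etc.) is left out:

```python
import numpy as np

def discounted_return(rewards, gamma):
    """Compute G = sum_t gamma^t * r_t for one episode."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Example: how the quantity the agent optimizes changes with gamma.
episode_rewards = np.ones(100)  # hypothetical per-step rewards
for gamma in (0.99, 0.995, 0.999):
    print(gamma, discounted_return(episode_rewards, gamma))
```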
Training an agent still takes a long time; the experiment in #36 took 4d 9h 11m 14s to finish.
Looking at the reward chart, the agent appears to reach about 70% of its final performance within just 50M steps (roughly 10 hours into training).
We should therefore try to optimize for a 10-hour computational budget.