kaesve / muzero

A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and pit both algorithms against each other, and investigate the reliability of learned MuZero MDP models.
MIT License
154 stars 24 forks

AlphaZero memory leak on Gym environments #2

Open joeryjoery opened 3 years ago

joeryjoery commented 3 years ago

Possibly related to deepcopy of Gym.Env or GymState objects, since this issue does not occur with the board games.
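
One way to test this hypothesis is to measure allocations around repeated deep copies. The snippet below is only a minimal sketch, not code from this repository: `FakeEnv` and `measure_growth` are hypothetical names, and `FakeEnv` is a stand-in for a Gym environment so the example runs without Gym installed. It uses the standard-library `tracemalloc` module to compare memory held when copied states are retained (for example, stored in a search tree) against when they are discarded.

```python
import copy
import tracemalloc


class FakeEnv:
    """Hypothetical stand-in for a Gym environment with internal buffers."""

    def __init__(self, size=10_000):
        self.buffer = [0.0] * size
        self.steps = 0


def measure_growth(n_copies, keep_references):
    """Return the net bytes still allocated after deep-copying an env n_copies times."""
    env = FakeEnv()
    retained = []  # mimics a structure (e.g. a search tree) that stores copied states
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(n_copies):
        clone = copy.deepcopy(env)
        clone.steps += 1
        if keep_references:
            retained.append(clone)
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


leaky = measure_growth(50, keep_references=True)
clean = measure_growth(50, keep_references=False)
print(f"retained copies: {leaky} bytes, discarded copies: {clean} bytes")
```

If the retained case grows with the number of copies while the discarded case stays roughly flat, the growth comes from references to the copies being kept alive rather than from `deepcopy` itself. Swapping `FakeEnv` for the real environment or state object would show whether the same pattern holds there.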

joeryjoery commented 3 years ago

The issue is also observed with MuZero agents, although memory grows much more slowly than with AlphaZero.