Closed DoxakisCh closed 3 years ago
In the current version, you can already choose to display, in real time, the performance of MuZero against a random agent, against itself, or against a hard-coded expert agent via the `opponent` parameter. The results are displayed in TensorBoard under the names 'MuZero reward' and 'opponent reward'.
If you want more help with adding custom games, you can join the Discord.
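For readers who want to roll their own periodic evaluation on top of this, here is a minimal, self-contained sketch of the idea: every so often, play a batch of full episodes with the current policy against a fixed baseline and log the mean reward to TensorBoard. The `ToyEnv`, `random_policy`, and `evaluate` names below are hypothetical stand-ins, not part of the repository's API; the only real library call referenced is `torch.utils.tensorboard.SummaryWriter.add_scalar`, mentioned in a comment.

```python
import random

class ToyEnv:
    """Hypothetical stand-in for a real game environment (e.g. your poker env)."""
    def __init__(self, max_steps=10):
        self.max_steps = max_steps

    def reset(self):
        self.t = 0
        return 0  # dummy observation

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0  # toy reward: 1 for action 1
        done = self.t >= self.max_steps
        return 0, reward, done

def random_policy(obs):
    """Baseline opponent: picks an action uniformly at random."""
    return random.choice([0, 1])

def evaluate(policy, env, n_episodes=20):
    """Play n_episodes with `policy` and return the mean total reward."""
    totals = []
    for _ in range(n_episodes):
        obs = env.reset()
        done, total = False, 0.0
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
        totals.append(total)
    return sum(totals) / n_episodes

# In a real training loop you would call evaluate() every K training steps
# and log the result with torch.utils.tensorboard.SummaryWriter, e.g.:
#   writer.add_scalar("eval/mean_reward", mean_reward, training_step)
if __name__ == "__main__":
    env = ToyEnv()
    print(evaluate(random_policy, env, n_episodes=500))
```

The key design point is that evaluation uses frozen, deterministic conditions (a fixed opponent and no exploration noise), so the logged curve reflects actual playing strength rather than self-play dynamics.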
Hi,
I have created a poker environment and I want to train an agent with this implementation. In such cases, training can take a very long time, and training via self-play does not give clear signals about the agent's performance. For these reasons, I was considering testing the agent at regular intervals against a random agent or a different trained agent, and showing the results in TensorBoard for better monitoring. Since in your implementation the training process is continuous, is there a way to apply this kind of evaluation?
Thank you in advance.