Closed (keithgw closed 10 months ago)
When a BotPlayer is initialized, it can follow either a uniform random policy, which is the only policy currently implemented, or the new MCTS policy. Later, the MCTS policy can be parameterized by its reward type and explore/exploit strategy.
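A minimal sketch of how the policy choice could be wired in, assuming the project is in Python and that a policy is simply a callable taking the legal actions and returning one of them. Only `BotPlayer` comes from this issue. Every other name here (`uniform_random_policy`, `MCTSPolicy`, `choose_action`, `reward_type`, `exploration_constant`, `ucb1`) is hypothetical. UCB1 is shown as one common explore/exploit rule, not necessarily the one this project will use, and the search itself is left unimplemented.

```python
import math
import random
from dataclasses import dataclass
from typing import Callable, Sequence

# Assumption: a policy maps the current legal actions to one chosen action.
Policy = Callable[[Sequence], object]


def uniform_random_policy(legal_actions: Sequence):
    """Pick one legal action with equal probability."""
    return random.choice(legal_actions)


@dataclass
class MCTSPolicy:
    """Hypothetical placeholder for the MCTS policy and its two parameters."""

    reward_type: str = "win_loss"  # assumed label, not from the issue
    exploration_constant: float = math.sqrt(2)  # explore/exploit trade-off

    def ucb1(self, total_reward: float, visits: int, parent_visits: int) -> float:
        """UCB1 score: mean reward plus an exploration bonus."""
        if visits == 0:
            return math.inf  # unvisited children are tried first
        mean = total_reward / visits
        bonus = self.exploration_constant * math.sqrt(
            math.log(parent_visits) / visits
        )
        return mean + bonus

    def __call__(self, legal_actions: Sequence):
        # The tree search itself is out of scope for this sketch.
        raise NotImplementedError("MCTS search is not sketched here")


class BotPlayer:
    def __init__(self, policy: Policy = uniform_random_policy):
        # Default to the only policy that is currently implemented.
        self.policy = policy

    def choose_action(self, legal_actions: Sequence):
        if not legal_actions:
            raise ValueError("no legal actions to choose from")
        return self.policy(legal_actions)
```

With this shape, `BotPlayer()` keeps the current uniform random behavior, while `BotPlayer(policy=MCTSPolicy(exploration_constant=1.0))` would opt in to MCTS once the search is written. Passing the policy as a callable keeps `BotPlayer` unchanged as new policies or parameters are added.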