Open SteveDraper opened 9 years ago
In TicTacToe large we show approximately 44% as the expected value, even though the game is a trivial draw. How can random playouts and initial MCTS expansion converge to a role-asymmetric result like this?
Confirming what you saw...
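For reference, uniformly random playouts need not be role-symmetric even in a game that is a draw under perfect play. Below is a minimal sketch that computes the exact random-playout outcome distribution for ordinary 3x3 tic-tac-toe. This is an assumption-laden stand-in, not TicTacToe large itself and not our playout code; `playout_distribution` and the string board encoding are hypothetical names chosen for illustration, and a draw is scored as 0.5. It does not explain the 44% figure; it only shows that a value away from 50% is possible from random play alone.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board, indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that role has completed a line, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def playout_distribution(board, player):
    """Exact (P(X wins), P(O wins), P(draw)) when both roles move uniformly at random."""
    w = winner(board)
    if w == "X":
        return (1.0, 0.0, 0.0)
    if w == "O":
        return (0.0, 1.0, 0.0)
    empty = [i for i, cell in enumerate(board) if cell == "."]
    if not empty:
        return (0.0, 0.0, 1.0)
    next_player = "O" if player == "X" else "X"
    px = po = pd = 0.0
    # Average the child distributions: each legal move is equally likely.
    for i in empty:
        child = board[:i] + player + board[i + 1:]
        cx, co, cd = playout_distribution(child, next_player)
        px += cx
        po += co
        pd += cd
    n = len(empty)
    return (px / n, po / n, pd / n)


p_x, p_o, p_draw = playout_distribution("." * 9, "X")
x_value = p_x + 0.5 * p_draw  # expected value for X with a draw scored as 0.5
o_value = p_o + 0.5 * p_draw  # expected value for O under the same scoring
print(f"X wins {p_x:.3f}, O wins {p_o:.3f}, draw {p_draw:.3f}")
print(f"random-playout value: X {x_value:.3f}, O {o_value:.3f}")
```

The first mover comes out well ahead here, so a pure playout average starts off biased toward one role until the tree is expanded deeply enough for the minimax structure to dominate.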