"I'm on the fence about just removing fold from the tree... If we do this, we'd have to implement some kind of minimum utility threshold for folding: if, after all simulations have been run, every available option has a utility score below the threshold, we just fold instead. The immediate problem with that is that MCTS apparently chooses the action with the most simulations, NOT the one with the highest utility??? We could try choosing by utility instead and go about it that way... but I have a feeling that just leaving fold in and using a net chip gain approach to utility would be better in the long run, and more intuitive to code. This way folding will consistently generate a small net loss, while other options that could be bad for us in a given spot would generate larger net losses. Ideally, if we have a bad hand, all options would generate net chip losses, and fold would generate the smallest chip loss."
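To make the selection question concrete, here is a minimal sketch of the three final-move rules discussed above: most simulations, highest mean utility, and highest mean utility with a fold threshold when fold is not in the tree. The `ActionStats` structure, the function names, and the numbers are all hypothetical and not taken from the existing code; the sketch assumes the root has at least one action.

```python
from dataclasses import dataclass


@dataclass
class ActionStats:
    """Per-action root statistics after all simulations (hypothetical structure)."""
    visits: int = 0
    total_utility: float = 0.0

    @property
    def mean_utility(self) -> float:
        # An unvisited action is treated as the worst possible option.
        return self.total_utility / self.visits if self.visits else float("-inf")


def pick_most_visited(stats):
    """Common MCTS final-move rule: the action with the most simulations."""
    return max(stats, key=lambda action: stats[action].visits)


def pick_highest_utility(stats):
    """Alternative rule: the action with the highest mean utility."""
    return max(stats, key=lambda action: stats[action].mean_utility)


def pick_with_fold_threshold(stats, threshold):
    """Fold is NOT in the tree here: fold if every option scores below threshold."""
    best = pick_highest_utility(stats)
    if stats[best].mean_utility < threshold:
        return "fold"
    return best


# Made-up numbers, chosen so that the two rules disagree.
root = {
    "call": ActionStats(visits=600, total_utility=-1200.0),  # mean -2.0
    "raise": ActionStats(visits=400, total_utility=-400.0),  # mean -1.0
}
```

With these numbers, `pick_most_visited(root)` returns "call" while `pick_highest_utility(root)` returns "raise", and a threshold of -0.5 makes `pick_with_fold_threshold` fold. Leaving fold in the tree avoids having to tune that threshold at all.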
The current wins/ties/losses approach was easy to implement at first, but is probably not a good utility indicator in the long run, since it ignores how many chips each outcome wins or loses. Something like net chip gain (a loss would be negative), or the sum of sqrt(chip gain) (with -sqrt(chip loss) for negatives), could be a good utility measure. I like the sqrt idea because, in theory, it reduces the impact of outliers on the utility score.
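A minimal sketch of the two candidate utility measures, assuming each simulation reports a single net chip change (positive for a gain, negative for a loss). The function names are hypothetical and not part of the existing code.

```python
import math


def net_chip_utility(chip_delta: float) -> float:
    """Plain net chip gain; a loss is simply a negative number."""
    return chip_delta


def signed_sqrt_utility(chip_delta: float) -> float:
    """sqrt(gain) for gains, -sqrt(loss) for losses, 0 for break-even."""
    return math.copysign(math.sqrt(abs(chip_delta)), chip_delta)


def total_utility(chip_deltas, utility=signed_sqrt_utility):
    """Sum the per-simulation utilities for one action."""
    return sum(utility(delta) for delta in chip_deltas)
```

For example, one 100-chip win and two 25-chip losses sum to +50 under net chip gain, but to 10 - 5 - 5 = 0 under the signed sqrt, which shows how the sqrt shrinks the weight of a single large pot relative to several small ones.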
Anyway, this change needs to happen at some point, and I wanted to make that concrete.
As mentioned in commit 4e20672,