Before I begin, let me note that this may well be a misunderstanding on my side, since I haven't looked through the full codebase! I just want to make sure that no bug has crept into either your implementation or mine.
I implemented something similar to your /Tree/ module in Python, but my implementation doesn't use the matrix ops that yours does. For instance, to compute the cfv for p when 1-p folds, I run (equation simplified and ignoring blockers to keep the focus): cfv[p] = node.pot * np.sum(node.reach_probs[1-p]) - node.reach_probs[1-p]. I believe this might produce wrong results if the reach probabilities at the root are initialized as you do in line 185 of Source/Tree/tree_values.lua, where each holding gets a reach probability of 1/num_holdings. With my equation for the cfv of folding, this causes all values in the tree to be scaled down by a factor of (num_holdings-1)/num_holdings. As a thought experiment, take the player who acts first in the first round of Leduc and have him always fold: his opponent still has to hold some card, which reduces the sum of the first player's reach probabilities to 5/6. Do your equations correct for that somewhere? I'd very much appreciate it if you could clarify this for me.
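To make the concern concrete, here is a minimal sketch of the fold computation under a uniform root range. It assumes six Leduc holdings and a pot of 1.0; the name fold_cfv and the pot value are illustrative and not taken from either codebase. It also groups the subtraction inside the pot multiplication, which is an assumption about the intended precedence of the simplified equation; with a pot of 1.0 both readings give the same numbers.

```python
import numpy as np

NUM_HOLDINGS = 6  # assumed Leduc deck: 3 ranks x 2 suits


def fold_cfv(pot, opp_reach):
    """Per-holding value for the player whose opponent folded.

    Hypothetical helper: for holding i, the opponent's reach for the
    same card i is subtracted, because both players cannot hold it.
    """
    opp_reach = np.asarray(opp_reach, dtype=float)
    return pot * (np.sum(opp_reach) - opp_reach)


# Root range initialized uniformly, i.e. 1/num_holdings per holding.
uniform_reach = np.full(NUM_HOLDINGS, 1.0 / NUM_HOLDINGS)

values = fold_cfv(1.0, uniform_reach)
scale = values[0]  # (num_holdings - 1) / num_holdings = 5/6

print(values)
print(scale)
```

Every entry comes out as 5/6 instead of 1, which is the scaling factor in question.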
Cheers