lifrordi / DeepStack-Leduc

Example implementation of the DeepStack algorithm for no-limit Leduc poker
https://www.deepstack.ai/

Question about reach probability initialization (line 185 of Source/Tree/tree_values.lua) #36

Open Florencu opened 5 years ago

Florencu commented 5 years ago

Before I begin, let me note that this might very well be a misunderstanding on my side, since I haven't looked through the full codebase. I just want to make sure that no bug has crept into either your implementation or mine.

I implemented something similar to your /Tree/ module in Python, but without the matrix operations yours uses. For instance, to compute the cfv for p when 1-p folds, I run the following (simplified, and ignoring blockers to keep the focus):

cfv[p] = node.pot * (np.sum(node.reach_probs[1-p]) - node.reach_probs[1-p])

I believe this might produce wrong results if the reach probabilities at the root are initialized as you do in line 185 of Source/Tree/tree_values.lua, where each holding gets a reach probability of 1/num_holdings. With my equation for the cfv of folding, this causes all values in the tree to be scaled by a factor of (num_holdings-1)/num_holdings. As a thought experiment, take the player who acts first in the first round of Leduc and let him always fold: his opponent still has to hold some card, which reduces the sum of the first player's reach probabilities to 5/6.

Do your equations correct for that somewhere? I'd very much appreciate it if you could clarify this for me.
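To make the thought experiment concrete, here is a minimal sketch in plain Python. It is not code from the repository or from my actual implementation. It assumes the fold value for holding i is the pot times the folding player's total reach minus that player's reach for the same card, and it sets the pot to 1. The names uniform_reach, fold_cfv and NUM_HOLDINGS are made up for illustration.

```python
from fractions import Fraction

# Assumption: 6 possible private holdings in Leduc (3 ranks x 2 suits).
NUM_HOLDINGS = 6


def uniform_reach(num_holdings):
    """Hypothetical root initialization: every holding gets 1/num_holdings."""
    return [Fraction(1, num_holdings)] * num_holdings


def fold_cfv(pot, folder_reach):
    """Per-holding value for the player who wins the pot when the other folds.

    Assumption: the folding player cannot hold the same physical card as
    the winner, so the folder's reach for holding i is excluded from the sum.
    """
    total = sum(folder_reach)
    return [pot * (total - folder_reach[i]) for i in range(len(folder_reach))]


pot = 1
cfv = fold_cfv(pot, uniform_reach(NUM_HOLDINGS))

# Ratio between the computed value and the full pot for any one holding.
scale = cfv[0] / pot
print(scale)  # 5/6
```

Under these assumptions every holding's fold value comes out as 5/6 of the pot, which is the (num_holdings-1)/num_holdings factor I am asking about.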

Cheers