lifrordi / DeepStack-Leduc

Example implementation of the DeepStack algorithm for no-limit Leduc poker
https://www.deepstack.ai/
878 stars 211 forks source link

Special treatment of all-in actions in lookahead? #20

Closed Kiv closed 6 years ago

Kiv commented 6 years ago

I'm trying to understand the lookahead code in order to apply it to a non-poker game, and I was hoping you could clarify why all-in actions are treated specially in the code compared to regular bets?

For example there are variables like nonallinbets_count, allin_nodes_count, but how would those be used in the general case? Is it nodes that have children but no grandchildren?

bobbiesbob commented 6 years ago

I believe it is for terminal equity calculation. Since each situation will be for a specific public board, the terminal equity needs to be recalculated for all-in situations. The all-in actions are conveniently placed as the last action choice at all nodes so they can be batch-accessed with the "-1" index, maximizing parallelism when using GPU to calculate terminal equities.

Kiv commented 6 years ago

Thanks. I understand why the all-ins are placed at the last action choice, but don't both all-ins and regular bets use the same call matrix for terminal equity?

In other words, I don't see why the algorithm needs to separately count all-ins.

bobbiesbob commented 6 years ago

Because they don't need to calculate terminal equity for non-allin nodes. They will only calculate terminal equity for the nodes where they are necessary. If the node is the last action of the street but not terminal, they will instead use the next_round_value to calculate the "fake" terminal equity. If the node is not terminal and also not the last action of a street, they will not need terminal equity or fake terminal equity.

Kiv commented 6 years ago

Got it. I was running test_lookahead.lua that only deals with the last street and doesn't use the neural network, so I got confused.

So in general, we'd have to divide up all the nodes into separate categories for (a) not terminal (b) not terminal but depth limited, and then as many categories as necessary for terminal nodes (only fold and allin for poker, but potentially many more in more complicated games)

lifrordi commented 6 years ago

Yes, that's just our implementation though, you can implement it however you like