lifrordi / DeepStack-Leduc

Example implementation of the DeepStack algorithm for no-limit Leduc poker
https://www.deepstack.ai/

bug in lookahead? #12

Closed: bobbiesbob closed this issue 6 years ago

bobbiesbob commented 6 years ago

https://github.com/lifrordi/DeepStack-Leduc/blob/da416f9646725def43e668851593de13ead8b607/Source/Lookahead/lookahead.lua#L209

Why are you swapping nn outputs if current player is 1? The inputs are only swapped if current player is 2.

lifrordi commented 6 years ago

Not a bug, but I agree this is a bit confusing. Here's why:

1) As you state, the outputs are always swapped - that's because the solver needs the values (counterfactual values) in a different order than the ranges. This is super confusing, but the way it's written makes some things run faster. There's no other reason.

2) The inputs for the players are swapped based on which player is to act first in the 'subgame'/'neural net'. This is because the order matters - the input ranges for the net are (range1, range2), and the player with range1 gets to act first in the subgame. If the player who is to act first is player 1, there is no need to swap anything. If it is player 2, we need to swap the inputs.
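The two points above can be sketched in a few lines of Python. This is only an illustration of the control flow as described in this comment (a conditional input swap, then an unconditional output swap), not the code in `lookahead.lua`. The names `query_value_net`, `echo_net`, and `first_to_act` are hypothetical, and the toy net simply echoes its inputs so the ordering is easy to follow.

```python
from typing import Callable, List, Tuple

Vector = List[float]
# Hypothetical net signature: takes (range1, range2), where the owner of
# range1 acts first, and returns one value vector per input slot.
Net = Callable[[Vector, Vector], Tuple[Vector, Vector]]


def query_value_net(
    net: Net, ranges: Tuple[Vector, Vector], first_to_act: int
) -> Tuple[Vector, Vector]:
    """Sketch of the swapping logic described above (assumed, not the repo code).

    ranges       -- (player 1 range, player 2 range)
    first_to_act -- 1 or 2, the player who acts first in the subgame
    """
    if first_to_act not in (1, 2):
        raise ValueError("first_to_act must be 1 or 2")

    range_a, range_b = ranges

    # Point 2: the net expects the first-acting player's range in slot 1,
    # so the inputs are swapped only when player 2 acts first.
    if first_to_act == 2:
        range_a, range_b = range_b, range_a

    values_a, values_b = net(range_a, range_b)

    # Point 1: the outputs are always swapped, because the solver keeps
    # values in the opposite order to ranges.
    return values_b, values_a


def echo_net(first_range: Vector, second_range: Vector) -> Tuple[Vector, Vector]:
    """Toy stand-in for the neural net: returns its inputs unchanged."""
    return list(first_range), list(second_range)


if __name__ == "__main__":
    p1_range, p2_range = [1.0], [2.0]
    print(query_value_net(echo_net, (p1_range, p2_range), first_to_act=1))
    print(query_value_net(echo_net, (p1_range, p2_range), first_to_act=2))
```

With the echo net, the two swaps cancel when player 2 acts first, while only the output swap applies when player 1 acts first. That asymmetry is what makes the logic look like a bug at first glance.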

bobbiesbob commented 6 years ago

I understand now. Thank you for the clarification!