lifrordi / DeepStack-Leduc

Example implementation of the DeepStack algorithm for no-limit Leduc poker
https://www.deepstack.ai/

sorted range generation, and dynamic bucketing #24

Closed JaysenStark closed 6 years ago

JaysenStark commented 6 years ago

@lifrordi I am implementing DeepStack for two-player no-limit Texas hold'em. Since DeepStack-Leduc is a simple demonstration, it simplifies the bucketing process and the sorted-range generation process. These are the two major aspects I am confused about, and the paper gives few details on either. My questions are:

Q1: To make the randomly generated ranges look more like the distributions seen in real game situations, before we assign probabilities (generated by the code shown in the paper) to the possible hands, we need to sort the possible two-card hole-card combinations (at most 1326) by hand strength given the board cards. I suppose EHS is used for the sorting. Is that reasonable? What measure did you use in the sorting process?

Q2: After random poker situations are generated, we use public CFR+ to solve them. After that we need to convert the card range to a bucket range (which can be regarded as card abstraction). What measure is the abstraction based on: hand strength weighted by the opponent range, or EHS (assuming the opponent's hole cards are uniformly random)?
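For readers following Q2, the card-range-to-bucket-range conversion itself can be sketched as a simple aggregation: each bucket receives the total probability of the hands assigned to it. This is a minimal sketch, not code from DeepStack; the function name, the `hand_to_bucket` mapping, and the three example hands are hypothetical, and how the mapping is built is exactly what Q2 asks about.

```python
def card_range_to_bucket_range(card_range, hand_to_bucket, num_buckets):
    """Sum the probability of every hand into the bucket it is assigned to.

    card_range:     dict mapping hand -> probability (hypothetical format)
    hand_to_bucket: dict mapping hand -> bucket index (assumed to be given)
    """
    bucket_range = [0.0] * num_buckets
    for hand, prob in card_range.items():
        bucket_range[hand_to_bucket[hand]] += prob
    return bucket_range


# Illustrative values only: three hands, two buckets.
card_range = {"AsAh": 0.5, "KsKh": 0.25, "7c2d": 0.25}
hand_to_bucket = {"AsAh": 1, "KsKh": 1, "7c2d": 0}
bucket_range = card_range_to_bucket_range(card_range, hand_to_bucket, 2)
```

Whatever measure defines the buckets, the bucket range keeps the same total probability mass as the card range.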

Kiv commented 6 years ago

I'm not @lifrordi, but I found a supplemental paper with some more details that might help:

https://static1.squarespace.com/static/58a75073e6f2e1c1d5b36630/t/58bed28de3df287015e43277/1488900766618/DeepStackSupplement.pdf

On page 11 there is a section "Neural Network Range Representation" that says earth mover's distance is used, so as I understand it, EHS is not used at all.

JaysenStark commented 6 years ago

@Kiv Thanks for your reply. To compute a distance with EMD, you need a vector of hand strengths as input. The difference between the two methods is that the EHS of a hand is a scalar, whereas the input to the EMD computation is the distribution of a hand's HS over equity intervals. (It looks like a histogram; taking its weighted sum gives EHS, but EMD is more accurate.) The problem is whether the opponent range is used when computing hand strength. EMD distance computation is slow, so I thought they might not have done it as described in the paper.
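The scalar-versus-histogram distinction above can be illustrated with a small sketch. It assumes equity is discretized into five equal-width bins, and the two histograms are made-up toy values rather than real poker equities. For one-dimensional histograms with equal total mass, EMD reduces to the sum of absolute differences between the cumulative distributions, measured here in units of bin widths.

```python
from itertools import accumulate


def expected_value(hist, centers):
    """Weighted sum of bin centers: collapses the histogram into a scalar (EHS-like)."""
    return sum(p * c for p, c in zip(hist, centers))


def emd_1d(hist_a, hist_b):
    """Earth mover's distance between two histograms over the same equal-width bins.

    Assumes both histograms have the same total mass. The result is in bin widths.
    """
    cdf_a = accumulate(hist_a)
    cdf_b = accumulate(hist_b)
    return sum(abs(a - b) for a, b in zip(cdf_a, cdf_b))


# Hypothetical bin centers for equity intervals [0, 0.2), [0.2, 0.4), ...
centers = [0.1, 0.3, 0.5, 0.7, 0.9]
made_hand = [0.0, 0.0, 1.0, 0.0, 0.0]  # toy hand: always medium strength
draw = [0.5, 0.0, 0.0, 0.0, 0.5]       # toy hand: either very weak or very strong

same_scalar = abs(expected_value(made_hand, centers) - expected_value(draw, centers)) < 1e-9
distance = emd_1d(made_hand, draw)
```

Both toy hands collapse to the same scalar of 0.5, yet their EMD is 2 bin widths, which is why a histogram-based distance can separate hands that a scalar cannot.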

lifrordi commented 6 years ago

@JaysenStark

Q1: Yes, we used EHS in the sorting process.

Q2: We used buckets based on EMD, as described for example in http://poker.cs.ualberta.ca/publications/AAMAS13-abstraction.pdf
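As a rough illustration of the Q1 answer, the sketch below sorts hands by EHS and then splits probability mass recursively over the sorted list. The recursive split follows my reading of the range-generation pseudocode in the paper and is an assumption, not the authors' code. The function names and the EHS numbers are hypothetical placeholders, not computed equities.

```python
import random


def assign_probabilities(sorted_hands, mass, out, rng):
    """Recursively split `mass` between the two halves of the sorted hand list."""
    if len(sorted_hands) == 1:
        out[sorted_hands[0]] = mass
        return
    mid = len(sorted_hands) // 2
    left_mass = rng.uniform(0.0, mass)  # random share for the weaker half
    assign_probabilities(sorted_hands[:mid], left_mass, out, rng)
    assign_probabilities(sorted_hands[mid:], mass - left_mass, out, rng)


def random_range(ehs_by_hand, rng):
    """Sort hands by EHS (ascending), then distribute a total mass of 1.0."""
    if not ehs_by_hand:
        return {}
    sorted_hands = sorted(ehs_by_hand, key=ehs_by_hand.get)
    out = {}
    assign_probabilities(sorted_hands, 1.0, out, rng)
    return out


# Placeholder EHS values for illustration only.
ehs_by_hand = {"AsAh": 0.85, "KsQs": 0.60, "9h8h": 0.45, "7c2d": 0.30}
generated = random_range(ehs_by_hand, random.Random(0))
```

Because neighbouring hands in the sorted order share most of their splits, hands of similar strength tend to receive correlated probabilities, which is the point of sorting before assigning.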

JaysenStark commented 6 years ago

@lifrordi Thanks.