lifrordi / DeepStack-Leduc

Example implementation of the DeepStack algorithm for no-limit Leduc poker
https://www.deepstack.ai/

Comparison to CMU's new Modicum agent #27

Open Kiv opened 6 years ago

Kiv commented 6 years ago

I saw this new paper from CMU with a different way to do depth-limited solving that apparently needs only a small fraction of the resources needed by DeepStack:

https://arxiv.org/pdf/1805.08195.pdf

Has anyone else read this paper and have any thoughts? Unfortunately, as is typical for CMU, they don't release any source code, but it would be very interesting to see a comparison, since they discuss DeepStack a lot but don't test against it.

DWingHKL commented 6 years ago

I read the paper, but the CMU bot needs 10~30s to take an action, which is too slow, whereas DeepStack acts quickly thanks to its lookahead.

Kiv commented 6 years ago

This bot also has depth-limited lookahead; on page 7 it says:

"We conduct depth-limited solving on the first two rounds by solving to the end of that round using MCCFR."

They only solve to the end of the game on the third betting round.

Regarding the speed, on page 6 it says:

"can play in real time at the speed of human professionals (an average of 20 seconds for an entire hand of poker) using just a 4-core CPU"

Where did you read that it needs 10-30s per action? Or are you talking about Libratus?

DWingHKL commented 6 years ago

On page 11 it says:

"The number of CFR+ iterations and the amount of time we ran MCCFR varied depending on the size of the pot. For the preflop, we always ran MCCFR for 30 seconds to solve a subgame (though this was rarely done due to caching). On the flop, we ran MCCFR for 10 to 30 seconds depending on the pot size. On the turn, we ran between 150 and 1,000 iterations of our modified form of CFR+. On the river, we ran between 300 and 2,000 iterations of our modified form of CFR+."

It caches the preflop strategy, so preflop decisions are fast, much like a table lookup. Not every hand goes to the river, so I think 20 seconds for a hand of poker is right.
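For reference, the per-round budgets quoted from the paper can be written down as a small lookup table. This is only a minimal sketch in Python, not code from the paper or from this repository: the `Budget` class, the `BUDGETS` dict, and `describe` are hypothetical names. The quote does not say how pot size maps to a point inside each range, so the sketch takes that position as an explicit `fraction` argument and interpolates linearly, which is purely an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    """Solver budget for one betting round, as quoted from page 11."""
    solver: str   # "MCCFR" or "modified CFR+"
    unit: str     # "seconds" or "iterations"
    low: float    # smallest budget in the quoted range
    high: float   # largest budget in the quoted range

    def at(self, fraction: float) -> float:
        # ASSUMPTION: linear interpolation inside the quoted range.
        # The paper only says the budget depends on pot size.
        f = min(1.0, max(0.0, fraction))  # clamp to [0, 1]
        return self.low + f * (self.high - self.low)


# Ranges copied from the quote; the preflop is a fixed 30 seconds
# (and is rarely solved at all because of caching).
BUDGETS = {
    "preflop": Budget("MCCFR", "seconds", 30, 30),
    "flop": Budget("MCCFR", "seconds", 10, 30),
    "turn": Budget("modified CFR+", "iterations", 150, 1000),
    "river": Budget("modified CFR+", "iterations", 300, 2000),
}


def describe(round_name: str, fraction: float) -> str:
    """Return a one-line summary of the budget for a round."""
    b = BUDGETS[round_name]
    return f"{round_name}: {b.solver} for {b.at(fraction):g} {b.unit}"
```

Calling `describe("flop", 0.5)` would report the midpoint of the 10 to 30 second range. The table makes it easier to see that only the preflop and flop budgets are stated in seconds, while the turn and river budgets are stated in iterations.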

snarb commented 6 years ago

Cool work. I think the best approach lies somewhere between Modicum and DeepStack; there are advantages and drawbacks in both. It is curious that CMU doesn't test Modicum against Libratus, and that there was no DeepStack vs. Libratus match, possibly because of the risk of bad PR for the loser. From what I know, Libratus would be a little stronger.

Kiv commented 6 years ago

Libratus only ran on a specific supercomputer, so probably they can't test it against anything else without getting more grant money.

Does anyone know what the full results of the 2018 Annual Computer Poker Competition were? The paper says that Slumbot won, but I didn't see more information on their website.

snarb commented 6 years ago

@Kiv Slumbot won one category, and another, less famous project won the other. I don't remember the link.

whatsdis commented 6 years ago

any updates on this?