Closed yc90s closed 4 months ago
Hi @yc90s, thanks for creating an issue.
Unfortunately, PLO5 and PLO6 are not yet supported, as I just knew these rules. I will take time to think about it.
@HenryRLee I'm thinking about adding PLO5 and PLO6 support. Do you have some direction or hint to let me start? Thanks
Hi @JoHuang, glad to see your interests. Do you intent to add PLO5/PLO6 in your own project? Or do you want to create a PR and add the new feature in this repository?
There is a super easy way to implement it, using brute-force. For example, for PLO5, choose 4 cards from the 5 player holes and run our Omaha evaluator, and get the best one. There will be 5 iterations, for PLO5, and 15 iterations for PLO6. I've seen someone implemented PLO5 already, based on our Omaha evaluator. https://github.com/andreihes/phe/blob/main/src/omaha10/eval_OT.c
If it is your private project, and you're happy with this brute-force solution, that would be a very good example to follow.
Well, brute-force is definitely not the fastest solution. We can still use a perfect hash function, so that the time cost of PLO5/PLO6 would be no significantly greater than normal Omaha, while there is a time-space exchange, and the memory cost would be around 5x in PLO5, and 15x in PLO6. And this solution is not easy to implement. We need to write a new evaluator algorithm, which would be similar to the normal Omaha evaluator (https://github.com/HenryRLee/PokerHandEvaluator/blob/master/cpp/src/evaluator_omaha.c). The algorithm is basically running the perfect hash function twice, one for the board cards, one for the hole cards, then concatenate these two hashes, and finally fetch the value from the hash table. After the evaluator algorithm is done, we need to generate the hash values to fill in the hash tables. This part is very time consuming, as there are way too many combinations to cover. (For PLO5, it'd be 52 choose 5 multiply by 47 choose 5: 2598960 * 1533939 = 3,986,646,103,440.) Also we need to write a one-off program to do this job (can be multi-threaded, to speed things up). The final task would be the test code, still very similar to the existing Omaha one (https://github.com/HenryRLee/PokerHandEvaluator/blob/master/cpp/test/evaluation_omaha.cc).
If you are confident with implementing the perfect hash solution, I am happy to give you more guidance, step by step. After that, we can merge your code in this repository if you like. In the same time, I'll add more comments in the Omaha evaluator algorithm, or maybe document it properly in the documentation page.
Hi @HenryRLee
Thanks for detail explain.
My case is like I have a private project with our own evaluator (node.js implement).
I made a node.js binding (https://github.com/JoHuang/PokerHandEvaluator) for our node.js project.
My rough evaluation is that our evaluator PLO4 is slower than yours about 2x~3x. However, our PLO5,PLO6 only have slight overhead.
So 4 iterations of evaluator_omaha.c
might be slower than our evaluator (not experimentally proven, though).
There might be a lot of things we can optimize like binding interface, or caches of card ID without converting from card string every time.
I would like to ask a question: For PLO5, combinations are 8.6x than PLO4, does that mean table generation time also takes 8.6x? (48 hours you said here) Does that mean the memory consumptions 8.6x? (404K * 8.6 ??)
If so, PLO6 would be 60.2x?
Or we can use https://github.com/andreihes/phe/blob/main/src/omaha10/eval_OT.c to speed up table generation?
Hi @JoHuang, I have to clarify that, the memory for PLO4 is roughly 25Mb. I don't know why it was 404k during the measurement. I should probably update the page and make some clarification.
The main memory used in PLO4 is in the noflush_omaha
table declared here (https://github.com/HenryRLee/PokerHandEvaluator/blob/master/cpp/src/tables.h). It has 11238500 elements and each one costs 2 bytes. The number 11238500 is the product of 6175 and 1820. They are the largest possible hash value from the 5-card board and the 4-card hole.
When it goes to PLO5, we have two 5-card hands, makes the table being 6175 * 6175 = 38130625 (76Mb), and plus the other tables it would be between 80Mb to 90Mb.
In PLO6, the largest possible hash value for a 6-card hand is 18394, so the table size would be 18394 * 6175 = 113582950 (227Mb).
It's hard to estimate the time generating the tables. By that time generate PLO4, I had to run Kev's evaluator with 60 iterations. Now that I have my own Omaha evaluator, which is approximately 2x to Kev's 5-card evaluator, and it should speed things up (similar to eval_OT.c
). Each hand may be 5x faster than what I used 3 years ago, but we have 8.6x hands in PLO5.
Of course, everything depends on the computing resource. Actually, how urgent do you need it in your node.js
project? I am considering doing the C/C++ implementation by myself, and using a good AWS instance to compute the hash tables. After that, you can port it to JavaScript.
I'm not that urgent. However, a rough timeline would be very helpful for me to decide my next action. Thanks!
@JoHuang I'll try to do PLO5 in a week, and then use another week for PLO6.
Cool. Then I guess I'll wait for your result.
Hi @JoHuang, just want to share some updates of my progress.
The PLO5 code is done. Also done a lot of testing for it.
However, I made a mistake when running the program calculating the PLO6 table, and cause the instance shutdown unexpectedly. Now I have to do it again. It would take 7 to 8 days to complete, using an AWS c5.9xlarge
instance.
I'll share the benchmark result for PLO5:
2023-06-13T11:18:03+10:00
Running ./benchmark_phevaluator
Run on (12 X 2600 MHz CPU s)
CPU Caches:
L1 Data 32 KiB
L1 Instruction 32 KiB
L2 Unified 256 KiB (x6)
L3 Unified 12288 KiB
Load Average: 1.52, 1.41, 1.36
-------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------
EvaluateAllFiveCards 32854853 ns 32830571 ns 21
EvaluateAllSixCards 298036817 ns 297910500 ns 2
EvaluateAllSevenCards 2182182712 ns 2180793000 ns 1
EvaluateRandomFiveCards 1375 ns 1375 ns 488936
EvaluateRandomSixCards 1607 ns 1606 ns 447608
EvaluateRandomSevenCards 1804 ns 1802 ns 389176
EvaluateRandomOmahaCards 3024 ns 3022 ns 229699
EvaluateRandomPlo5Cards 3290 ns 3288 ns 213043
As you can see, evaluating 100 Omaha (PLO4) hands costs 3024 ns, while the costs for 100 PLO5 hands is 3290 ns. PLO5 is not significantly slower. The memory cost for Omaha is 30Mb, while the memory cost for PLO5 is 112Mb.
I'll also share table generation code. For PLO5: https://gist.github.com/HenryRLee/c1b882a6b5ba5af132e4af66d32eb768 For PLO6: https://gist.github.com/HenryRLee/dbd98e48e13f33fc66c30ac25ba20ac9
@HenryRLee Thanks for the update!
Hi @JoHuang, PLO6 evaluator is ready in the branch feature/plo6
(PR #79). I haven't merged it yet, because Github action has 500Mb memory limit, but compiling the PLO6 table requires more than that. The build and test are good in my local environment. It should be good for you to migrate it to your Node project now.
I'll share the benchmark result with you:
2023-06-23T18:18:51+10:00
Running ./benchmark_phevaluator
Run on (12 X 2600 MHz CPU s)
CPU Caches:
L1 Data 32 KiB
L1 Instruction 32 KiB
L2 Unified 256 KiB (x6)
L3 Unified 12288 KiB
Load Average: 1.20, 1.38, 1.36
-------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------
EvaluateAllFiveCards 33466419 ns 33431524 ns 21
EvaluateAllSixCards 277013289 ns 276743000 ns 2
EvaluateAllSevenCards 2197875932 ns 2196361000 ns 1
EvaluateRandomFiveCards 1368 ns 1367 ns 506450
EvaluateRandomSixCards 1532 ns 1531 ns 435635
EvaluateRandomSevenCards 1823 ns 1821 ns 371615
EvaluateRandomOmahaCards 2950 ns 2949 ns 233322
EvaluateRandomPlo5Cards 3177 ns 3174 ns 218003
EvaluateRandomPlo6Cards 3236 ns 3233 ns 215227
The memory cost for PLO6 is around 352Mb.
Hi @JoHuang, I merged the PLO6 PR.
I excluded PLO6 in Github action, because compiling the table hits the 500Mb memory limit in Github.
However, everyone can still checkout the source code and build it in their local machine, and run the unit test and benchmark. Additional cmake
option is required in order to do so. For example:
cmake -DBUILD_PLO5=ON -DBUILD_PLO6=ON ..
Thanks! I'll check it.
how to evaluator PLO5 and PLO6