charleskawczynski / PokerHandEvaluator.jl

A Poker hand evaluator, with Texas hold 'em in mind.
MIT License

Tuple causes too much slow down #27

Closed: Moelf closed this issue 3 years ago

Moelf commented 3 years ago

I'm playing with a toy Monte Carlo simulation and realized that the tuple interface makes it extremely slow. The best I can do for now is:

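using BenchmarkTools, PlayingCards, PokerHandEvaluator

const AC = full_deck()
function one_round(N=7)  # N is currently unused
    _deck = Deck(copy(AC))   # copy the cached deck instead of rebuilding it
    shuffle!(_deck)
    _table = pop!(_deck, 5)  # five community cards
    # two hole cards plus the table: the 7-card interface
    my_rank = hand_rank(CompactHandEval(pop!(_deck,2), _table))
end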

julia> @btime one_round()
  56.815 ms (85673 allocations: 5.09 MiB)

As expected, copy(AC) makes it 10x faster than invoking ordered_deck, but I can't figure out a way to make hand_rank(CHE) faster: Tuples are immutable, so the usual memory-reuse pattern doesn't work.

charleskawczynski commented 3 years ago

I think this slowdown comes from allocations for the deck (and maybe the cards, too). Here is an example that uses the tuple interface with fewer allocations and better performance:

julia> using BenchmarkTools, Combinatorics, PlayingCards, PokerHandEvaluator

julia> const all_possible_hands = Tuple.(collect(combinations(full_deck(), 5)));

julia> sample_hand() = all_possible_hands[rand(1:2598960)]
sample_hand (generic function with 1 method)

julia> function one_round(N=7)
           my_rank = hand_rank(CompactHandEval(sample_hand()))
       end
one_round (generic function with 2 methods)

julia> @btime one_round()
  1.989 μs (8 allocations: 149 bytes)
6462

That said, this example is perhaps a bit awkward, and I think the original post is a bit more logical. It may be that we just need a lazy way to deal cards / get a set of N unique cards from the deck. I'm inclined to leave this issue open until we have a nice interface to work with.

charleskawczynski commented 3 years ago

Actually, my example is not equivalent (the OP uses the 7-card interface). I'll try to fix it up a bit later.

Moelf commented 3 years ago

Ideally (without knowing exactly how evaluate_7 works), one should be able to do something like:

seq = sample(1:52, 7; replace=false)
AC = full_deck()
rank_7(@view AC[seq])

This should allocate very little, since we always view into the deck. Of course, it is entirely possible that spending 10x more memory I/O could actually be faster, but I personally feel that's unlikely.
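
For illustration, here is a minimal sketch of the "view into the deck" idea using only Julia's Random standard library. It is not part of PokerHandEvaluator.jl or PlayingCards.jl: deal! is a hypothetical helper, and the deck is a plain vector of integers 1:52 standing in for full_deck(), so nothing here depends on how the evaluator accepts its input. The helper runs a partial Fisher-Yates shuffle in place and returns a view of the first n entries, so no new card array is allocated per round.

```julia
using Random

# Hypothetical helper (not a package API): move `n` distinct random elements
# to the front of `deck` with a partial Fisher-Yates shuffle, then return a
# view of them. The deck is permuted in place; no new card array is created.
function deal!(rng::AbstractRNG, deck::AbstractVector, n::Integer)
    len = length(deck)
    n <= len || throw(ArgumentError("cannot deal $n cards from a deck of $len"))
    @inbounds for i in 1:n
        j = rand(rng, i:len)                  # pick from the not-yet-dealt tail
        deck[i], deck[j] = deck[j], deck[i]
    end
    return view(deck, 1:n)
end

deck = collect(1:52)            # assumption: integers stand in for Card values
rng = MersenneTwister(42)

cards = deal!(rng, deck, 7)     # 7 distinct cards, viewed in place
hole  = view(cards, 1:2)        # two hole cards
table = view(cards, 3:7)        # five community cards

# If a tuple is required, it can be built from the view without mutating anything.
hand = ntuple(i -> cards[i], Val(7))
```

Because each call only swaps the first n positions, the same deck vector can be reused across Monte Carlo rounds without copying or reshuffling all 52 cards. Whether the resulting view or tuple can be passed to the evaluator directly depends on the package's interface and is not shown here.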

charleskawczynski commented 3 years ago

See this comment