diku-dk / openbanko

:older_woman: :goberserk: :raising_hand: Free software tools for working with Big Banko :older_man: :baby: :family:
BSD 2-Clause "Simplified" License
1 stars 1 forks source link

Perfect compression of sets of banko cards #9

Open nqpz opened 6 years ago

nqpz commented 6 years ago

There are 3669688706217187500 banko cards. Say we have a set of 10000000 banko cards. How many sets of this kind exist? We can use the binomial coefficient to find this number. Then we can just compress the set of cards into an index into a virtual array of sets. However, this is a very large number, and I didn't actually succeed in just calculating it. Also, I don't think a practical implementation is doable.

But this also generalizes into our current perfect-for-one-card compression, as binom(n, 1) = n.

athas commented 6 years ago

This may actually be feasible. Let N = 3669688706217187500. Now we need to compute

binom(N, k) = (N!)/((N-k)!*k!)

But what we really want is:

log2(binom(N, k)) = log2((N!)/((N-k)!*k!))
                  = log2(N!) - log2((N-k)!*k!)
                  = log2(N!) - log2((N-k)!) - log2(k!)

Now we can exploit

log2(n!) = log2(n) + log2(n-1) + ... log2(1)

But that is still slow because N is huge.

So now I am reading this blog post, which seems to suggest a better way of approximating huge factorials: https://www.johndcook.com/blog/2010/08/16/how-to-compute-log-factorial/

So I think this is a feasible way of encoding the set, but I have not worked it out yet.

athas commented 6 years ago

Looks like this is the formula. For your example, you would need about 4500000 bits, or 548KiB. Our best algorithm (7) currently uses 766KiB to encode 100k sorted boards.