ram6ler / python-trotter

A Python library for working with structures that commonly come up in combinatorics, such as permutations, combinations and subsets.
MIT License
2 stars 0 forks source link

Overflows checking len() of large pseudo-lists #1

Closed salian closed 2 years ago

salian commented 2 years ago
import Trotter
import secrets
import sys

base58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
my_amalgams = trotter.Amalgams(11, base58)
print(my_amalgams)
A pseudo-list containing 24986644000165537792 11-amalgams of '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'.

But if I try to use the very long my_amalgams

random_pick = secrets.choice(my_amalgams)
OverflowError: cannot fit 'int' into an index-sized integer

Same thing directly using len()

list_length = len(my_amalgams)
OverflowError: cannot fit 'int' into an index-sized integer

I'm using Python 3.10 and sys.maxsize is 9223372036854775807

ram6ler commented 2 years ago

Hi salian,

Thanks for your interest in this library.

Actually, there is a Combinatoric method called random that generates a random element. So you could replace

random_pick = secrets.choice(my_amalgams)

with

random_pick = my_amalgams.random()

Admittedly, this uses the random package in the background. If you really want to use secrets, you could do something like the following, which is based on what happens in the background when the method above is called.

def random_using_secrets(combinatoric: trotter.Combinatoric) -> list | str:
    def random_index() -> int:
        # Uses Fast Dice Roller algorithm.
        # See Lumbroso, J. (2012)
        #   Optimal Discrete Uniform Generation from Coin Flips, and Applications.
        n = combinatoric._length
        v, c = 1, 0
        while True:
            v, c = 2 * v, 2 * c + secrets.randbelow(2)
            if v >= n:
                if c < n:
                    return c
                else:
                    v -= n
                    c -= n

    return combinatoric[random_index()]

random_pick = random_using_secrets(my_amalgams)

Unfortunately, Python complains with excessively large indices, and there's not much I can do about that. As a workaround, though, instead of using

len(my_amalgams)

use

my_amalgams._length

Hope that helps!