Open JohanMollevik opened 5 years ago
Good catch. That's a very poor way to do this. I'm replacing it with:
return np.random.randint(0, self.len, self.n).tolist()
This supports cases until 10^20 at which point int64 overflows. So I guess it's nice if we can say we support permutation spaces until 10^20.
It seems the implementation could be replaced by using pythons built in random generation rather than numpys as python has optimized its implementation for ranges
import random ... return random.sample(range(self.len), k=self.n)
Are there ay problems with making a change like that?
Does the randint allows duplicates or not?
It indeed does. So let's use what you have proposed.
EDIT: correction to before >> here the constrain with overflow is at 10^18 so that's the supported limit.
That will have to do
If 10^18 proves limiting it would probably work well to add a fallback where if the requested number is bigger than that it selects random numbers in a loop until it has enough without duplicates.
The neat thing is that in this case with huge numbers the risk of selecting the same number several times in a row is vanishingly small as the range of the numbers are huge compared to the number of numbers requested.
This will make the fallback take linear time in the common case instead of running the risk of not terminating which that approach can do with large fill factors.
This limit ended up being a problem for me, pull request https://github.com/autonomio/chances/pull/3
The CI failed, it seems to be complaining about unit test coverage, I will try to resolve that.
CI still fails but the error seems unrelated to this and looks more like a dependency problem, can you look and see if you see a cause?
uniform_mersenne is using memory proportional to the range of numbers it generates
If asked to generate 3 numbers between 0 and 1000000000 the memmory allocated will be for an array 1000000000 big.
the culprint is the line
out = list(range(self.len))
which happily allocates gigabytes of memmory even when only a few numbers are requested