Open sobolevn opened 3 months ago
For that there's the prng package https://hex.pm/packages/prng
Maybe prng could have random_range and choice added?
No plans for 1 and 2 at present. 3 sounds good 👍
hm the gleam-@alias idea reappears for me where we could document and help people find functions when they come from other stdlibs such as pythons, phps and javascripts.
Isn't this basically list.shuffle |> list.take(n)
?
I suppose this can be made more efficient than shuffling the full list, if we have a large list and only want to pick a few random elements, but it could be a good first version.
shuffle
runs a fold
generating random numbers for each element of a list, then sort
it and iterates over all elements to remove the generated random numbers. take
runs in linear time.
A possible improvement would be to iterate over only the first n
elements after the sort, which would avoid one full iteration of the list.
A possible version without the sort would generate a new random int every time, that is at most the length of the list, then pop the element at that position into an accumulator and repeat until it has n elements (or the list is empty).
There are algorithms for random sampling from a linked list. I haven't done the research to say how each approach compares.
Reservoir sampling would be the obvious choice but the implementations I'm familiar with use indexing into arrays which wouldn't work here since it would have to work with lists instead. I guess you could use a dictionary instead of an array as the reservoir but there's probably a better approach out there.
Oops, misclick.
How about we copy whatever Elixir or Elm or some other similar language does?
Good point. It's worth taking a look at what other languages do in this area.
The Elixir standard library has the Enum.take_random
method which now uses a modified reservoir sampling algorithm for performance reasons (relevant commit). Internally it uses a tuple as the reservoir in place of the traditional fixed length array.
I looked at the Elm and couldn't find any relevant methods in the standard library or packages.
A few Haskell libraries had implemented versions of it which used things like IntMap
as the reservoir data structure.
None of this really helps us but it's still interesting.
Are y'all wanting a multiple sample like python's random.sample(List, Int)
or just a single element sample like random.choice(List)
?
We want a function which takes a specified number of values from a list.
I was looking at impls, it seems like algo L seems optimal (and i could be wrong but seems similar to what elixer does). However, it requires taking the natural log which I couldnt find impl in the stdlib, whats the ideal solution? A slower impl without needing natural log or adding natural log to gleam/float?
Sounds like a good reason to add that function to me! Unless anyone else has any other suggestions.
It looks like that algorithm wants array mutation at a random index. How would you do that in Gleam given we don't have constant time indexing or array mutation.
I started looking at the state of random in Gleam and I have several ideas.
int.random
andfloat.random
do not have seeds. It might be a problem for libraries. For example: if you want a fake-data library for tests: you need to have the same data based on some test seed value. Otherwise, you won't have reproducable failures. https://docs.python.org/3/library/random.html#notes-on-reproducibilityint.random_range(x, y)
function, which would generate an int from range>=x, <y
. I think that this function is essentialchoice(List)
function, which in my experience is the second most frequently used random function in Python.Maybe we should add
random
module with the needed functions and a nice api forSeed(Int)
?