JuliaCollections / DataStructures.jl

Julia implementation of Data structures
https://juliacollections.github.io/DataStructures.jl/latest/
MIT License
692 stars 246 forks source link

rand() for multisets #706

Open hwjsnc opened 3 years ago

hwjsnc commented 3 years ago

This package does not currently define Random.rand(::Accumulator) so the default implementation for an AbstractDictionary is used, returning a random entry x=>count of the underlying dictionary. For example, rand(DataStructures.Accumulator(Dict(2=>4))) returns 2 => 4.

In my opinion, rand should return one of the values/keys with a probability proportional to its corresponding count.

If the maintainers agree, I can implement this and prepare a pull request.

Counterpoints: This only makes sense if all counts are nonnegative and finite. The implementation would either have to store the total count and update it for every modification, or recompute it every time rand is called. Either choice will be inappropriate for some applications.

oxinabox commented 3 years ago

That seems like the correct behavour for a generalized multiset, yes. And accumulator does currently act as a generalized multiset (i.e a multiset where some items may have fractional count).

There is currently a open PR looking at splitting the multiset behavour into a seperate type.

691

hwjsnc commented 3 years ago

Accumulators hold negative values as well (intentionally, as far as I can tell), so this change does seem better suited for a dedicated multiset type.

oxinabox commented 3 years ago

We have a negative multiplicity error that we throw when you try a multiset operation on a Accumulator with negative values

hwjsnc commented 3 years ago

This is only true for the multiset operations, right? The basic functions like inc! and dec! seem to allow negative values, and I imagine this to be useful sometimes. So I suppose random sampling for accumulators makes no sense in general, and should be reserved for the proper multiset type added with the previously mentioned pull request. I'll change the title accordingly.