JuliaStats / Distributions.jl

A Julia package for probability distributions and associated functions.

Misuse of `Base.minimum` and `Base.maximum`? #1824

Open timholy opened 5 months ago

timholy commented 5 months ago

I'd argue that using `minimum` and `maximum` to refer to the domain of a distribution is a misnomer. One can come close to justifying it: you can think of a distribution as a collection of pairs `x => p`, where `x` is a value in the domain and `p` is the corresponding probability. We do support dictionary (lexicographic) ordering of pairs, so this works:

```julia
julia> dist = Set([0 => 0.5, 1 => 0.25, 2 => 0.25])   # a representation of a distribution
Set{Pair{Int64, Float64}} with 3 elements:
  2 => 0.25
  1 => 0.25
  0 => 0.5

julia> minimum(dist)
0 => 0.5

julia> maximum(dist)
2 => 0.25
```

However, note that under this interpretation the returned value should be a pair, not a value in the domain. For it to return a value in the domain, it would have to be reasonable to think of a distribution as nothing more than the collection of all the points in its domain, and that obviously misses half of what distributions are all about.
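To make the contrast concrete, here is a small sketch using only Base Julia. It stores the same toy distribution as a `Dict` (my choice for illustration, not something Distributions.jl does) and compares the two views: the collection of pairs, and the collection of domain points alone.

```julia
# The same toy distribution, stored as a Dict from value to probability.
dist = Dict(0 => 0.5, 1 => 0.25, 2 => 0.25)

# Iterating a Dict yields pairs, so the extremum is a pair.
minimum(dist)          # 0 => 0.5

# Keeping only the keys yields domain values, but the probabilities are gone.
minimum(keys(dist))    # 0
maximum(keys(dist))    # 2
```

Only the second view returns a plain domain value, and it gets there by discarding the probabilities.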

I'd argue instead that we should define `Domain(d)`, so that `minimum(Domain(d))` would make sense. We could also define `Range(d)` analogously. I'm not sure these names are best, though: they are pretty generic, so maybe they would have to be qualified. Alternatively, `minimum(first, d)` might be justifiable, but it is obscure.
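A minimal sketch of what this could look like follows. Everything in it is hypothetical: `Domain` is just the name floated above and is not an existing Distributions.jl API, and `ToyUniform` is a stand-in type so the example runs without the package. The last lines show the `minimum(first, d)` alternative on the pair representation, which uses only Base.

```julia
# Hypothetical sketch: none of these types exist in Distributions.jl.
struct ToyUniform          # stand-in for a distribution on the interval [a, b]
    a::Float64
    b::Float64
end

struct Domain{D}           # hypothetical wrapper: "the set of values d can take"
    dist::D
end

# Extrema and membership are defined on the domain, not on the distribution.
Base.minimum(dom::Domain{ToyUniform}) = dom.dist.a
Base.maximum(dom::Domain{ToyUniform}) = dom.dist.b
Base.in(x::Real, dom::Domain{ToyUniform}) = dom.dist.a <= x <= dom.dist.b

d = ToyUniform(0.0, 2.0)
minimum(Domain(d))   # 0.0
maximum(Domain(d))   # 2.0
1.5 in Domain(d)     # true

# The alternative spelling, applied to the collection-of-pairs view:
pairs_view = Set([0 => 0.5, 1 => 0.25, 2 => 0.25])
minimum(first, pairs_view)   # 0
maximum(first, pairs_view)   # 2
```

With a wrapper like this, the call site says which collection the extremum is taken over, which is the point of the proposal. Whether the wrapper should be a type, a function returning an interval, or something else is left open here.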