Open wouterwln opened 1 month ago
This is problematic here
julia> Gamma(1f0, 1f0) |> rand
1.1458127367196822
julia> Gamma(2f0, 1f0) |> rand
1.662515f0
Please search the issue tracker for several similar issues. I think this issue might be redundant.
Unfortunately it is not. It was introduced in https://github.com/JuliaStats/Distributions.jl/commit/3946acc6901b5126022ae22564ab9aa7d913ed1e which was committed last month. We found this in our downstream test pipeline, so unfortunately this is a new issue.
Which version of Distributions are you using? On master I'm getting
julia> typeof(rand(Exponential(θ))) == typeof(θ)
true
and that is a consequence of the commit that you link to (which I had forgotten).
You are right, on master this seems to be fixed. Reading the discussion in https://github.com/JuliaStats/Distributions.jl/pull/1885 improved my understanding. The issue that is found by our tests is that eltype(Gamma(1.0f0, 1.0f0))
is not Float32
(for reference, for Normal
this returns Float32
). The story here implies that eltype
should return the type of the samples. Is that correct? If this is not the case, the problem is downstream and we have to make an exception downstream.
Edit: in the documentation it says the following:
eltype(::Type{Sampleable})
The default element type of a sample. This is the type of elements of the
samples generated by the rand method. However, one can provide an array of
different element types to store the samples using rand!.
Indeed for Gamma
eltype fallbacks to
Base.eltype(::Type{<:Sampleable{F,Continuous}}) where {F} = Float64
and for Normal
eltype is defined as
Base.eltype(::Type{Normal{T}}) where {T} = T
in our code base we basically implicitly tested that eltype(distribution) == typeof(rand(distribution))
and it failed. I would suggest to add those tests in Distributions.jl as well because now the implementation for Gamma
does not follow the documented behaviour of eltype(::Type{Sampleable})
. Specifically this line This is the type of elements of the
samples generated by the rand method. which is not for Gamma
(and maybe other distributions)
julia> eltype(Gamma(1f0, 1f0)) == typeof(rand(Gamma(1f0, 1f0)))
false
Also this behaviour leads to this inconsistency, where semantically similar code returns different container types
julia> rand(Gamma(1f0, 1f0), 1)
1-element Vector{Float64}:
1.2745195627212524
julia> [ rand(Gamma(1f0, 1f0)) ]
1-element Vector{Float32}:
3.234097
in our code base we basically implicitly tested that
eltype(distribution) == typeof(rand(distribution))
I think it's best to avoid eltype(::Distribution)
completely. Please refer to the many long discussion in the issue tracker. As an example of the issue here consider the case
julia> eltype(Normal(1//1, 1//1))
Rational{Int64}
julia> rand(Normal(1//1, 1//1))
0.3227598879915683
so eltype(::Normal)
is just lying to you.
You cannot expect users to refer to long discussion in the issue tracker. The issue tracker is for developers, not for users. Users are looking at the documentation of the package. And the documentation is misleading. If the documentation is lying it must be changed.
As a side note, obtaining the exact sample type without having to call eltype(rand(...))
is indeed a highly useful feature for preallocating containers, as the documentation suggests. However, it seems like this functionality is currently somewhat broken. For example, even with your code:
julia> distribution = Normal(1//1, 1//1)
Normal{Rational{Int64}}(μ=1//1, σ=1//1)
julia> container = zeros(eltype(distribution), 10);
julia> Distributions.rand!(distribution, container)
ERROR: MethodError: no method matching randn(::Random.TaskLocalRNG, ::Type{Rational{Int64}})
This issue makes it challenging to write generic code that preallocates containers for samples from arbitrary distributions, as eltype
doesn’t seem to be designed for this purpose and is "lying". Is there an alternative method to handle this?
My point is that this is an open issue and part of the discussion here is just repeating arguments from other issues.
obtaining the exact sample type without having to call
eltype(rand(...))
is indeed a highly useful feature for preallocating containers
This is the eternal challenge. There is no easy way to know this type. You can
When you try to write generic code then 3 almost always ends up being wrong at some point. Part of the issue here is that Distributions was original written just for Float64 and Int64 parameters and random variates and then later made generic which invalidated some of the original assumptions. But now I'm repeating explanations already available in other issues.
I understand your point, but the documentation is a primary source of information for most users (and not GitHub issues) and seems to be incorrect in its current form. If the previous original assumptions were invalidated it should be reflected in the documentation too. Now the documentation states that eltype
returns the type of the samples, which isn't accurate. Would it be possible to at the very least just update the documentation? This will certainly prevent confusion and the need for repeated explanations in other issues.
MWE:
returns
false
Because Exponential sampling is also used when sampling
Gamma
with shape 1, the problem also occurs there. ForGamma
this introduces a type instability, since the return type for sampling for aGamma
depends on whether or not the alpha parameter is equal to1.0
.