JuliaStats / Distributions.jl

A Julia package for probability distributions and associated functions.
Other
1.08k stars 410 forks source link

Inconsistent Type Behavior for Uniform Distribution #1783

Open arthur-bizzi opened 9 months ago

arthur-bizzi commented 9 months ago

Hey. Not only do Distributions still behave differently with regards to types when generating one or multiple samples ( #1252 ), but the behavior is also inconsistent between different types:

using Random, Distributions 

UBig = Uniform(BigFloat(0),BigFloat(1)) #Uniform{BigFloat}(a=0.0, b=1.0)
rand(UBig) #BigFloat
rand(UBig,2) #Vector{Float64}

U32 = Uniform(Float32(0),Float32(1)) #Uniform{Float32}(a=0.0f0, b=1.0f0)
rand(U32) #Float64
rand(U32,2) #Vector{Float64}

This follows from type promotion rules. An attempted solution would be something along the lines of:

import Base.rand
rand(rng::AbstractRNG, d::Uniform{T}) where T<:Real = d.a + (d.b - d.a) * rand(rng,T)

But the type of single numbers would still differ from that of arrays:

rand(U32) #Float32
rand(U32,2) #Vector{Float64}

On the other hand, rand! seems to ignore the type parameter completely:

v = rand(Float32,4)
rand!(UBig,v) #Float32

Is there a mental model for this behavior? If so, wouldn't it be a good idea to include it in the docs? They currently contain no references whatsoever to type precision.

devmotion commented 9 months ago

The main reason for the current behaviour is rooted in the historical development process. Initially, only Float64 was supported and was hardcoded in many places. Over time these restrictions were lifted in some places but thereby consistency was lost. There is no clear mental model, unfortunately, but I think currently you should always expect samples of contonuous distributions to be of type Float64 - when promotion happens, it's an unintended consequence of generalizing types.

There is a strong incentive though to fix these issues soonish. Therefore I think it's not worth documenting the current buggy behaviour.

arthur-bizzi commented 9 months ago

I see, thank you very much. Can you think of any relatively quick fixes I could implement on my end until these changes are merged? More specifically: could I get the fix I mentioned above working for vectors? At least for basic univariate distributions.

jondeuce commented 9 months ago

Just ran into this same issue. For the case of Uniform, one issue stems from this line which should use rand(rng, T) instead of rand(rng). Another issue stems from the fact that eltype seems to always return Float64, since eltype(::Type{Uniform{T}}) where {T} = T is not defined (like, for example, it is for Normal). This causes this line to preallocate an Array{Float64}.

If I should expect Float64 in general I can work around that for now, but it was unexpected.

stevengj commented 8 months ago

For the case of Uniform, one issue stems from this line which should use rand(rng, T) instead of rand(rng).

Probably should be rand(rng, float(real(T))) to ensure that it is a real floating-point type. (There's also the issue that, for dimensional types, it should strip off the dimensions. See also the discussion in https://github.com/JuliaMath/QuadGK.jl/pull/92)

This issue recently came up on discourse.