Open Red-Portal opened 1 month ago
Indeed, this seems to be a bug. randn(rng)
should probably be changed to randn(rng, typeof(z))
.
Okay let me pick this up.
Hi @devmotion I opened a PR, could you take a look when you have time?
Has the situation from #1071, #1666, etc. changed?
I mean, personally, I'm not at all against the idea that eltype(d) should equal partype(d) when {eltype(d), partype(d)} <: AbstractFloat
.
Although there should probably be a better interface that can generalize across more distributions.
I'm not at all against the idea that eltype(d) should equal partype(d)
That's absolutely not the idea here. In the case of TDist
, we get a sample from ChiSq (of some type, regardless of whether it follows any conventions or not). And IMO regardless of what type it is, it seems wrong to modify/promote its type by a randn
call (basically a "Float64 literal").
Hi @devmotion , unfortunately, it seems that calls to rand(q, d)
still result in the wrong type. I guess this is because eltype(q)
returns Float64
. Should I make a PR that defines eltype(q) = float(partype(q))
?
One option within the current interface, if you need more than one random value, is to call the rand!
method on a vector of the type you want, e.g. rand!(Exponential(), zeros(Float32, 20))
.
I think there'll still be a double conversion going on (Float32 -> Float64-> Float32) if you call this on the current public version, but I think a side-effect of merging #1879 will cause things to stay Float32
at all times. Not 100% sure on this though; hadn't given this much thought before.
Hi @devmotion , could you comment on how to proceed?
Hmmmm that's annoying, that leads directly to all the eltype
inconsistencies 😄 Initially - and for basically all distributions apart from Normal
- eltype
and partype
were two completely different things: eltype
was supposed to be the type of samples (Float64
for continuous and Int
for discrete distributions) whereas partype
was the common/promoted type of the parameters.
At some point someone wanted rand(Normal(0f0, 1f0))
to return samples of type Float32
, so the definition of eltype
for Normal
was changed to partype
. But this made it completely inconsistent with how it was supposed to be defined, how it's defined for other distributions - and generally it's not even the type of the samples (think of e.g. Normal{Int}
)!
Sure, we could just propagate this inconsistency a bit more but I'm a bit reluctant. Not because I think that it should return Float64
but because I think this eltype
concept is utterly broken. I think we should just stop defining eltype
- an "element type" of a distribution is not clearly defined: Is it supposed to be the element type of the struct? The return type of rand
? The return type of quantile
? The element type of support
?
Maybe at least for univariate distributions we could start removing it from the rand
pipeline by changing https://github.com/JuliaStats/Distributions.jl/blob/e1340f02b5604cedf02df93b9ae202f3679d9818/src/genericrand.jl#L34-L35 to
return map(_ -> rand(rng, sampler(s)), CartesianIndices(dims))
(probably with an additional function barrier, similar to rand!
)
In the default fallback for rand!
we call scalar rand
in a loop anyway. And defining array rand
for those univariate distributions for which it is beneficial (see #1879) doesn't seem too bad.
but because I think this eltype concept is utterly broken. I think we should just stop defining eltype - an "element type" of a distribution is not clearly defined: Is it supposed to be the element type of the struct? The return type of rand? The return type of quantile? The element type of support?
In a perfect world, I believe eltype
of a distribution should be the type of the value returned by rand
, quantile
, minimum
, maximum
, median
, mode
, etc.; and that mean
and std
should return a result of type float(eltype)
.
It is my view that a Distribution
in Distributions.jl is an infinite-length unordered container which defines rand
as a way to allow users to obtain values. There's only two discrepancies between this view and the status quo that I'm aware of; size
and length
in Distributions.jl return the size/length of a single sample, while I believe that those roles should go to another function while these functions should always return Inf
on a Distribution.
Hi, it seems that
returns a
Float64
instead of aFloat32
. Is this intended behavior? I find it unusual because therand
implementation tries to be type-stable half way there by usingone(T)
but ends up callingrandn(rng)
without specifying the type.