Crown421 opened 1 year ago
I'm not sure what you mean by type info being lost. `eltype` is used primarily for iteration, which isn't defined for stats (e.g. `for i in Mean() ... end` is an error).
To your second point, `FitNormal(Variance(Float32))` works, but I suppose the shorter `FitNormal(T)` would be nice to have.
In my specific use case I am using `EnsembleProblem` from SciML and reducing the results with OnlineStats, as I want to compute a lot of trajectories in a way that doesn't blow up my RAM.
My current implementation returns a `Vector{<:OnlineStat}` for the trajectory (which may or may not be the best option, but we will see).
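Concretely, the per-state reduction looks roughly like this (a minimal sketch; `m = 4` and the random data stand in for the actual `EnsembleProblem` wiring):

```julia
using OnlineStats

m = 4                                # number of states per trajectory
stats = [FitNormal() for _ in 1:m]   # one stat per state

for _ in 1:1000                      # stand-in for the ensemble loop
    state = randn(m)                 # one solution's states at a time step
    fit!.(stats, state)              # fold each state into its stat
end

value.(stats)                        # per-state (μ, σ) estimates
```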
However, when constructing the solution object, an `eltype(eltype(T))` call happens, which makes the solution parametrized with `Any`, which is not great.
Long story short, I had

```julia
julia> eltype(eltype(Float64))
Float64
```

as reference for the behaviour I had been expecting, and was hence surprised.
Hmm, okay.
Where is the `eltype(eltype(T))` happening, and why is that necessary? I'm trying to understand the use case, since OnlineStats aren't iterable to begin with.
I'm not sure what a "trajectory" is in this context, but maybe you want to use `value.(trajectory)` instead of the stats directly?
A trajectory in the ODE/dynamical-system sense, where one might have `m` states, each with dimension `d`.
This could be a scalar ODE, so each state would be a `Number`, or something higher-dimensional, in which case each state is a `Vector{<:Number}`. The whole trajectory is then a `Vector{<:Number}` or a `Vector{Vector{<:Number}}`.
Now, for something like a SDE, each solution might be slightly different, and one wants summary statistics for a (large) collection of trajectories for the distribution of states at each time step.
The way I went about this is to have a `Vector{<:OnlineStat}`, i.e. by doing `[FitNormal() for _ in 1:m]` and adding trajectories via broadcasting. Once the simulation is done, I can nicely get the values out by broadcasting `mean.(...)`, `cov.(...)` or similar.
I suppose I could do this via `Group`, but it does not seem like there is a great constructor for large groups (but I might have missed something).
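The closest I found is splatting a comprehension into the varargs constructor, which works but isn't really a dedicated constructor for large groups (sketch; building a length-`m` tuple this way may be slow to compile for large `m`):

```julia
using OnlineStats

m = 100
g = Group([FitNormal() for _ in 1:m]...)  # splat into the varargs constructor
fit!(g, rand(m))                          # one observation per group element
```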
Even then, if I do something like

```julia
julia> g = Group(FitNormal(), FitNormal())
julia> fit!(g, rand(2))
```

I can't get the means out as easily, since both `mean.(g)` and `mean(g)` don't work, so I have to go via `value`.
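For reference, the `value`-based workaround I mean is roughly this (assuming `value` on a `Group` returns the per-stat values, i.e. `(μ, σ)` pairs for `FitNormal`):

```julia
vals = value(g)       # e.g. a tuple of (μ, σ) pairs, one per stat
means = first.(vals)  # pull the means out by hand
```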
Further, even though `Group` is iterable, we again get

```julia
julia> eltype(g)
Any
```

This is sensible, since a group could contain anything, but in a case like this, where all stats in the group are the same, one might expect a more specific `eltype`.
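One untested sketch of what a narrower `eltype` could look like, assuming `Group` is parametrized by the tuple of its stats (the actual parametrization in OnlineStatsBase may differ):

```julia
# hypothetical: if all stats in the group share one concrete type S, report S
Base.eltype(::Type{<:Group{<:NTuple{N,S}}}) where {N,S} = S
```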
Also, comparing to Distributions:

```julia
julia> eltype(Distributions.Normal(2.0f0))
Float32

julia> eltype(Distributions.MvNormal([2.0f0, 3.0f0]))
Float32
```
Given that `FitNormal` and `Normal` otherwise function quite similarly, it is again surprising to see a difference here.
I think that `eltype`s are quite useful beyond iteration, to indicate what kind of data is wrapped in an object.
Thanks for the info!
I'll have to mull this over a bit since I'd rather not add methods to the OnlineStatsBase interface if I can avoid it.
I just took a stab at creating a convenience constructor (see #258), but stumbled over additional surprising behaviour.
First, the internal type of `FitMvNormal` is fixed to `CovMatrix{Float64}`, and second, the fallback does not incorporate type information even when it can be specified (i.e. for `FitNormal`).
```julia
julia> m = FitNormal(Variance(Float32))
FitNormal: n=0 | value=(0.0, 1.0)

julia> typeof(value(m))
Tuple{Float64, Float64}

julia> for _ in 1:3
           fit!(m, rand(Float32))
       end

julia> m
FitNormal: n=3 | value=(0.482926, 0.478244)

julia> typeof(value(m))
Tuple{Float32, Float32}
```
I also note that

```julia
julia> typeof(m.v)
Variance{Float32, Float32, EqualWeight}
```

which suggests that it is possible to have a `Float32` mean and a `Float64` variance?
I have made an attempt to fix the above, let me know what you think.
On that note, I am using `Float32`/`Float64` as placeholders; these could also be replaced with any new user-defined type `NewScalarNumberType <: Real`. This might be quite interesting.
I have found this repo recently, and as I am integrating it into my code, I noticed that a lot of type information is lost. E.g. `eltype` on a `Mean` does not return the element type, which is surprising, given that `Mean` has a `<:Number` type parameter; I personally would expect the parameter to come through. Surprisingly, other objects like `FitNormal` don't allow a type parameter at all, even though `FitNormal` is parametrized with `V<:Variance`, so one might expect something like `FitNormal(Float32)` to work.
I am not sure when I would have time to work on something like this, but I first wanted to open this issue, and see if the above would be a desired behaviour.