joshday / OnlineStats.jl

⚑ Single-pass algorithms for statistics
https://joshday.github.io/OnlineStats.jl/latest/
MIT License
836 stars 64 forks source link

Variance with Unitful values #206

Closed karlwessel closed 4 years ago

karlwessel commented 4 years ago

Is it possible to use OnlineStats to calculate the Variance of values with units using Unitful?

I tried to define the type for the Variance, but neither the type of the unitful value nor the square of it works:

julia> using OnlineStats

julia> using Unitful

julia> a = 4u"m"
4 m

julia> var = Variance(typeof(a))
Variance: n=0 | value=1.0

julia> fit!(var, a)
ERROR: DimensionError: 0 m and 4 m^2 are not dimensionally compatible.
Stacktrace:
 [1] +(::Quantity{Int64,𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}}, ::Quantity{Int64,𝐋^2,Unitful.FreeUnits{(m^2,),𝐋^2,nothing}}) at /home/karl/.julia/packages/Unitful/D0cfL/src/quantities.jl:86
 [2] smooth(::Quantity{Int64,𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}}, ::Quantity{Int64,𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}}, ::Quantity{Int64,𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}}) at /home/karl/.julia/packages/OnlineStatsBase/5Vchh/src/OnlineStatsBase.jl:132
 [3] _fit!(::Variance{Quantity{Int64,𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}},EqualWeight}, ::Quantity{Int64,𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}}) at /home/karl/.julia/packages/OnlineStatsBase/5Vchh/src/stats.jl:448
 [4] fit!(::Variance{Quantity{Int64,𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}},EqualWeight}, ::Quantity{Int64,𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}}) at /home/karl/.julia/packages/OnlineStatsBase/5Vchh/src/OnlineStatsBase.jl:116
 [5] top-level scope at REPL[5]:1

julia> var = Variance(typeof(a^2))
Variance: n=0 | value=1.0

julia> fit!(var, a)
ERROR: DimensionError: m^2 and 4 m are not dimensionally compatible.
Stacktrace:
 [1] convert(::Type{Quantity{Int64,𝐋^2,Unitful.FreeUnits{(m^2,),𝐋^2,nothing}}}, ::Quantity{Int64,𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}}) at /home/karl/.julia/packages/Unitful/D0cfL/src/conversion.jl:108
 [2] Quantity{Int64,𝐋^2,Unitful.FreeUnits{(m^2,),𝐋^2,nothing}}(::Quantity{Int64,𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}}) at /home/karl/.julia/packages/Unitful/D0cfL/src/types.jl:145
 [3] _fit!(::Variance{Quantity{Int64,𝐋^2,Unitful.FreeUnits{(m^2,),𝐋^2,nothing}},EqualWeight}, ::Quantity{Int64,𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}}) at /home/karl/.julia/packages/OnlineStatsBase/5Vchh/src/stats.jl:448
 [4] fit!(::Variance{Quantity{Int64,𝐋^2,Unitful.FreeUnits{(m^2,),𝐋^2,nothing}},EqualWeight}, ::Quantity{Int64,𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}}) at /home/karl/.julia/packages/OnlineStatsBase/5Vchh/src/OnlineStatsBase.jl:116
 [5] top-level scope at REPL[7]:1

It works for the Mean:

julia> avrg = Mean(typeof(a))
Mean: n=0 | value=0 m

julia> fit!(avrg, a)
Mean: n=1 | value=4 m

julia> avrg = Mean(typeof(a^2))
Mean: n=0 | value=0 m^2

julia> fit!(avrg, a^2)
Mean: n=1 | value=16 m^2

I would guess that the reason is that Unitful defines the neutral element for multiplication (one) of a unitful number as 1 without any unit. Which is correct. However Variance seems to initialize itself using this value and therefore does not get the right units? It seems to "forget" the units after creation:

julia> var = Variance(typeof(1u"m"))
Variance: n=0 | value=1.0

julia> value(var)
1.0
joshday commented 4 years ago

Hmm, Unitful is an interesting case. I hadn't considered that a type could change from multiplication.

Variance internally stores the mean and variance, both initialized with zero(T), so this was a relatively easy fix of initializing the variance with zero(T) ^ 2.

This is fixed on OnlineStatsBase master, along with a few other things like making value type stable.

julia> a = 4u"m"
4 m

julia> v = Variance(typeof(a))
Variance: n=0 | value=1 mΒ²

julia> fit!(v, a)
Variance: n=1 | value=1 mΒ²
karlwessel commented 4 years ago

That was quick! Thank you!

joshday commented 4 years ago

You're welcome!

Big caveat: you need to use 4.0u"m" instead of 4u"m"

joshday commented 4 years ago

Just kidding, I made it smarter. 4u"m" is fine.