Open pdeffebach opened 6 years ago
I encountered a similar issue when using this package with missing values in 0.6.4:
julia> versioninfo()
Julia Version 0.6.4
Commit 9d11f62bcb (2018-07-09 19:09 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin14.5.0)
CPU: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell MAX_THREADS=16)
LAPACK: libopenblas64_
LIBM: libopenlibm
LLVM: libLLVM-3.9.1 (ORCJIT, haswell)
julia> using NaNMath
julia> NaNMath.log(-100)
NaN
julia> NaNMath.log(missing)
ERROR: MethodError: no method matching log(::Missings.Missing)
You may have intended to import Base.log
Closest candidates are:
log(::Float32) at /Users/martink/.julia/v0.6/NaNMath/src/NaNMath.jl:12
log(::Float64) at /Users/martink/.julia/v0.6/NaNMath/src/NaNMath.jl:11
log(::Real) at /Users/martink/.julia/v0.6/NaNMath/src/NaNMath.jl:13
...
julia> Base.log(missing)
missing
Would it make sense to define functions in this package to return missing for missing input values? This would make sense to me, as missing and NaN here represent two different scenarios: NaN stands for invalid input (e.g., negative values for log), whereas missing stands for entirely absent input. log here could be defined for Union{Missing, Float64} input types and have a simple ismissing check within its definition. I'm happy to work on this if this is a desired feature of the package.
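A minimal sketch of that idea, assuming Julia 1.x (where Missing and ismissing live in Base rather than in Missings.jl). The name nanlog is hypothetical and is not part of NaNMath; the body only mirrors the behaviour shown in the session above.

```julia
# Hypothetical helper, not NaNMath's actual code.
# missing in -> missing out; negative input -> NaN; otherwise Base's log.
function nanlog(x::Union{Missing, Float64})
    ismissing(x) && return missing   # absent input stays absent
    return x < 0 ? NaN : log(x)      # invalid input becomes NaN instead of throwing
end

nanlog(-100.0)   # NaN
nanlog(missing)  # missing
```

The same effect could be had with a separate one-line method for ::Missing instead of the explicit ismissing check.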
I see how this is useful, but handling missing extends beyond the problem that this package was designed to address and that I have interest in maintaining. Maybe MissingNaNMath is in order? :)
Fair enough!
But, given that missing is part of Base Julia, is there any use for a numerical calculation package that does not support it? That sounds quite dangerous.
> is there any use for a numerical calculation package that does not support it? That sounds quite dangerous.
This comment is a bit hyperbolic. The package is already being used by upstream AD packages like ForwardDiff and JuMP (https://github.com/JuliaOpt/JuMP.jl/blob/82e998f72e849f7a5f8c264accb41d9eafbcc845/src/Derivatives/Derivatives.jl#L15) for well-defined purposes that currently don't require any support for missing.
I see, you're right: this package is still equally useful in that context. I guess I'm just disappointed, as I thought this could be a more broadly useful package, and most packages that would depend on this will need to be able to support all the numerical functionality defined in Base for these functions.
My suggestion that it might be "dangerous" came from the notion that users would expect Base.mean and NaNMath.mean to behave similarly except in their treatment of NaN.
One solution that is missing-agnostic is to have fallbacks for non-float types that revert to the functions defined in Base. But you might want to throw errors when non-floats are used given the purpose of NaNMath.
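A rough sketch of the fallback option, assuming Julia 1.x. The name nansqrt is hypothetical and only illustrates the dispatch pattern; it is not how NaNMath defines its methods.

```julia
# Hypothetical illustration of a missing-agnostic fallback.
# Floats get the NaN-returning behaviour; everything else defers to Base.
nansqrt(x::AbstractFloat) = x < 0 ? oftype(x, NaN) : sqrt(x)
nansqrt(x) = sqrt(x)   # fallback: whatever Base does for this type

nansqrt(-1.0)     # NaN
nansqrt(4)        # 2.0, via Base.sqrt
nansqrt(missing)  # missing, because Base.sqrt propagates missing
```

Leaving out the untyped fallback method gives the other option mentioned above: any non-float argument then raises a MethodError.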
I view a MethodError as pretty low on the dangerousness scale. Silently not handling NaNs as promised is more dangerous, e.g., https://github.com/mlubin/NaNMath.jl/pull/21#issuecomment-305064958.
Anyway I'm willing to share maintainership if there's interest in making the package more useful for additional applications.
You're right. After looking into it I'm also less certain that the right approach is to change this to treat missing like NaN. Sorry for the noise.
It would make sense to accept ~arrays~ iterables of any element type though, and only throw an error when actually encountering a value of an unsupported type. That's what Statistics.mean does, and it's useful if you want to pass skipmissing(x) to NaNMath.mean.
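As a sketch of what that could look like, assuming Julia 1.x: the function below takes any iterable, skips NaNs, and only complains when it actually meets a non-float value. The name nanmean_sketch and the choice of ArgumentError are assumptions for illustration, not NaNMath's API.

```julia
# Hypothetical NaN-skipping mean over an arbitrary iterable.
# No restriction on the container or its eltype; the check happens per element.
function nanmean_sketch(itr)
    total = 0.0
    n = 0
    for x in itr
        x isa AbstractFloat ||
            throw(ArgumentError("unsupported element of type $(typeof(x))"))
        isnan(x) && continue   # NaNs are ignored rather than propagated
        total += x
        n += 1
    end
    return n == 0 ? NaN : total / n
end

# Missing values are dropped by skipmissing, NaNs by the function itself.
nanmean_sketch(skipmissing([1.0, missing, NaN, 3.0]))  # 2.0
```

Passing the raw vector [1.0, missing] instead would throw as soon as the missing element is reached.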
I tested out this package on 0.7 with missings and got the following error:

I think this is because Statistics is not imported by NaNMath, so the function calls to NaNMath.mean can't fall back on Statistics.mean.

I think the solution here is to add import Statistics and add fallbacks for operations of type Any so those are called when there are missing values. Does that seem feasible?
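Roughly what that proposal might look like, assuming Julia 0.7 or later where Statistics is a standard library. The module name NaNMathSketch and both method bodies are made up for illustration; this is not the package's source.

```julia
# Hypothetical module illustrating an Any fallback to Statistics.mean.
module NaNMathSketch

import Statistics

# Float arrays: ignore NaNs when averaging.
mean(x::AbstractArray{<:AbstractFloat}) = Statistics.mean(filter(!isnan, x))

# Anything else (including arrays that may contain missing): defer to Statistics.
mean(x) = Statistics.mean(x)

end # module

NaNMathSketch.mean([1.0, NaN, 3.0])                     # 2.0
NaNMathSketch.mean([1.0, missing])                      # missing
NaNMathSketch.mean(skipmissing([1.0, missing, 3.0]))    # 2.0
```

One caveat with this design: inputs that take the fallback path get no NaN handling at all, so a NaN inside a skipmissing wrapper would still propagate.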