JuliaMath / NaNMath.jl

Julia math built-ins which return NaN and accumulator functions which ignore NaN
Other
52 stars 26 forks source link

Performance compared to other implementations & supported types #63

Open johanbluecreek opened 1 year ago

johanbluecreek commented 1 year ago

Hi,

I noticed that if I implement my own log function where I take care of the branch-cut by hand, it performs better than the NaNMath.log:

julia> using NaNMath

julia> using BenchmarkTools

julia> function log_nan(x::T)::T where {T<:Real}
           x <= T(0) && return T(NaN)
           return log(x)
       end
log_nan (generic function with 1 method)

julia> X = rand(Float64, 50_000_000) .* 2 .- 1;

julia> @btime NaNMath.log.(X);
  594.144 ms (5 allocations: 381.47 MiB)

julia> @btime log_nan.(X);
  436.285 ms (5 allocations: 381.47 MiB)

this shows that NaNMath.log performs some ~36% slower than log_nan. (This issue was discovered in the discussion here [1])

An implementation like log_nan above also provides support for Float16 and BigFloat, which both gives StackOverflowError error for NaNMath.log:

julia> NaNMath.log(Float16(0.123))
ERROR: StackOverflowError:
Stacktrace:
 [1] log(x::Float16) (repeats 79984 times)
   @ NaNMath ~/.julia/packages/NaNMath/fmhcd/src/NaNMath.jl:10

julia> NaNMath.log(BigFloat(0.123))
ERROR: StackOverflowError:
Stacktrace:
 [1] log(x::BigFloat) (repeats 79984 times)
   @ NaNMath ~/.julia/packages/NaNMath/fmhcd/src/NaNMath.jl:10

I have only tested log, other functions in NaNMath may have similar issues.

[1] https://github.com/MilesCranmer/SymbolicRegression.jl/issues/109

MilesCranmer commented 1 year ago

Seems like this is the same for other functions which make ccalls to libm: https://github.com/JuliaMath/NaNMath.jl/blob/65a59288ab07211517bab671253d76644b204564/src/NaNMath.jl#L6-L13

e.g., a trivial implementation of sin_nan is much faster than NaNMath.sin:

function sin_nan(x::T)::T where {T}
     isfinite(x) || return T(NaN)
     return sin(x)
 end
@btime NaNMath.sin(1.0)
# > 9.758 ns (0 allocations: 0 bytes)
@btime sin_nan(1.0)
# >  1.492 ns (0 allocations: 0 bytes)

However, some functions are the same performance, like NaNMath.sqrt, but, looking at the code, this is a function which is written with a domain check: https://github.com/JuliaMath/NaNMath.jl/blob/65a59288ab07211517bab671253d76644b204564/src/NaNMath.jl#L17

(side note - why doesn't that sqrt convert NaN to the same type as the input?)

hurricane007 commented 1 year ago

this method

       function log_nan(x::T)::T where {T<:Real}
           x <= T(0) && return T(NaN)
           return log(x)
       end

is indeed faster

julia> @benchmark log_nan(-0.1)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  0.800 ns … 132.400 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.000 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.142 ns ±   1.780 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

           █
  ▂▁▁▁▂▁▁▁▁█▁▁▁▃▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▂▁▁▁▃ ▂
  0.8 ns          Histogram: frequency by time         2.1 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark NaNMath.log(-0.1)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  2.800 ns … 123.000 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     3.400 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.425 ns ±   2.681 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁ █ ▅▂  ▇ ▇                                                 ▂
  █▃█▁██▁▇█▁█▆▄▁▅▁▁▇▆▁▄▄▁▃▄▁▁▁▁▁▃▁▁▃▁▁▁▁▁▁▃▃▁▁▁▁▁▁▆▁▆▃▁▄▁▁▃█▆ █
  2.8 ns       Histogram: log(frequency) by time       6.9 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

but there will be an error:

julia> log_nan(-1)
ERROR: InexactError: Int64(NaN)
MilesCranmer commented 1 year ago

@hurricane007 not sure I see your issue. Just define

log_nan(x::Integer) = log_nan(Float64(x))

log won't return integers anyways

hurricane007 commented 1 year ago

@MilesCranmer Hi Miles, it is about the method from @johanbluecreek