JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

sinc and cosc inaccuracy and type-instability #37227

Closed petvana closed 4 years ago

petvana commented 4 years ago

The main motivation for using the sinc function is its nice and smooth behavior around zero. However, the current implementation can produce results greater than 1 close to zero, and the results are not even monotonic.

julia> sinc(1e-20)
1.0
julia> sinc(1e-30)
1.0
julia> sinc(1e-40)
1.0000000000000002
julia> sinc(1e-50)
1.0
julia> sinc(1e-60)
1.0000000000000002
julia> sinc(1e-70)
1.0
julia> sinc(1e-80)
1.0000000000000002
julia> sinc(1e-90)
1.0

A possible solution is simple:

sinc(x::Real) = abs(x) < 1e-10 ? one(x) : isinf(x) ? zero(x) : sinpi(x)/(pi*x)

However, this is not general enough across data types, because the constant 1e-10 does not work for more precise types.
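
For example (an illustration, not part of the original report), a 256-bit BigFloat easily resolves values of sinc below the proposed cutoff, so returning one(x) there would discard real information:

x = BigFloat(1e-11)   # below the proposed 1e-10 cutoff
1 - sinpi(x)/(π*x)    # ≈ 1.645e-22 (≈ π²x²/6), representable with plenty
                      # of room to spare since eps(BigFloat) ≈ 1.7e-77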

Julia Version 1.5.0
Commit 96786e22cc (2020-08-01 23:44 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
stevengj commented 4 years ago

Technically, this is not a numerical "instability" (which typically refers to relative errors growing arbitrarily large for well-conditioned functions). But it is not giving an exactly rounded result for small arguments, and I agree that it should be possible to do better here.

stevengj commented 4 years ago

Another problem is that the current implementation is not always type-stable:

julia> sinc(3//4)
0.3001054387190354

julia> sinc(0//4)
1//1
stevengj commented 4 years ago

More generally, as long as we are computing a precision-dependent threshold, one could switch to the Taylor expansion for sufficiently small x, since this is probably slightly faster than sinpi(x)/(pi*x).

sinc(x) = abs(x) < sincthreshold(x) ? @evalpoly(x^2, 1, -pi^2/6, pi^4/120) : isinf(x) ? zero(x) : sinpi(x)/(pi*x)

with some modifications to make this type stable. As you say, determining the threshold is a bit annoying to do in a generic way. For floating-point x, it is determined by abs(x)^6 * pi^6/5040 < eps(typeof(x)). I suppose we could use a generated function for sincthreshold.
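
A hypothetical sketch of such a generated function (using the sincthreshold name from above): solving pi^6*abs(x)^6/5040 < eps(T) for abs(x) gives the cutoff (5040*eps(T)/pi^6)^(1/6), which can be computed once per concrete type at generation time:

# sketch only: precompute the precision-dependent cutoff per type T
@generated function sincthreshold(x::T) where {T<:AbstractFloat}
    thresh = (5040 * eps(T) / T(pi)^6)^(1/6)   # ≈ 0.0032 for Float64
    return :($thresh)
end

One caveat is BigFloat, whose precision can change at runtime while generated code is compiled once per type, so an arbitrary-precision fallback would still be needed.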

petvana commented 4 years ago

This is exactly what I was thinking of, but I don't know yet how to write a generated and generic-enough function for that. Note that the problem also seems to affect the cosc function (which is likewise defined as a division).

oscardssmith commented 4 years ago

Note that a minimax function will likely work for a larger range at acceptable accuracy than a Taylor expansion.

stevengj commented 4 years ago

You mean, something like sinc(x) = ifelse(iszero(x), one(x), min(one(x), sinpi(x)/(pi*x)))?

stevengj commented 4 years ago

@petvana, thanks for pointing that out; I didn't even realize that we had a cosc function. The current implementation of cosc is much worse than that of sinc — it really is numerically unstable, suffering from catastrophic cancellation near zero:

julia> cosc(1e-20)
0.0

julia> cosc(big(1e-20))
-3.289868133696452692506325852996527191782335372598840071672408487965980404206592e-20

i.e. it lost all of the significant digits.
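
The cancellation is visible directly in the numerator of the naive formula (pi*x)*cospi(x) - sinpi(x):

x = 1e-20
pi*x*cospi(x)              # 3.141592653589793e-20 (cospi(x) rounds to 1.0)
sinpi(x)                   # 3.141592653589793e-20 (identical in Float64)
pi*x*cospi(x) - sinpi(x)   # 0.0, hence cosc(1e-20) == 0.0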

stevengj commented 4 years ago

In @JeffBezanson's defense, this cosc implementation was added way back in 1917197353051c65929126b7afd2b0c38dbb93ea when Julia files still ended with .j; I think floating-point arithmetic hadn't been invented yet 😉 .

oscardssmith commented 4 years ago

@stevengj, sorry, that was unclear. I meant a minimax polynomial. Check out remez.jl for more details, but basically, given a range and degree, it gives the polynomial that has the least error over the entire range.

stevengj commented 4 years ago

@oscardssmith, ah yes, I see. Yes, I'm familiar with minimax polynomials — more generally one can use minimax rational functions — from other special functions that I've implemented. The main difficulty with a minimax polynomial, however, is that then the whole polynomial is precision-dependent, whereas with a Taylor series only the cutoff is precision dependent (and alternatively for arbitrary precision you can loop until the terms decay sufficiently). Also, in cases like cosc where you are approaching a zero of the function you almost always want a Taylor series in order to maintain a small relative error arbitrarily close to the zero.

In the case of sinc and cosc, therefore, where we only need to improve accuracy near x=0, I suspect that a Taylor-series approach is more important.

But it will be tricky to make the implementations generic to different precisions (while still being fast) in any case. I suppose that we could make the implementation fast if typeof(float(one(real(x)))) is FloatXX and then fall back to a slower generic implementation otherwise.

petvana commented 4 years ago

Btw, the function sinc(x)::Float64 is used (by a happy coincidence) as an example in the types.md documentation. I would select a better example after this discussion ...

stevengj commented 4 years ago

I think

sinc(x::Number) = iszero(x) ? one(float(x)) : sinpi(x)/(pi*x)
sinc(x::Integer) = iszero(x) ? one(x) : zero(x)
sinc(x::Real) = iszero(x) ? one(float(x)) : isinf(x) ? zero(float(x)) : min(one(float(x)), sinpi(x)/(pi*x))

should probably suffice for sinc if we just want to eliminate the > 1.0 roundoff errors, though it wouldn't hurt to add optimized FloatXX methods.
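
As a quick check, these definitions also fix the Rational type-instability reported above, since both branches of the Real method now return a float:

sinc(3//4)   # 0.3001054387190354 (Float64, as before)
sinc(0//4)   # 1.0 (Float64; previously 1//1)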

stevengj commented 4 years ago

Here's an implementation that speeds up the Float32 and Float64 cases by a factor of 2 or more for small arguments, while remaining type-stable (and ≤ 1) for other well-behaved types.

sinc(x::Number) = _sinc(float(x))
sinc(x::Integer) = iszero(x) ? one(x) : zero(x)
# generic fallback: exact at zero and ±Inf, clamped to ≤ 1 elsewhere
_sinc(x::Number) = iszero(x) ? one(x) : x isa Real && isinf(x) ? zero(x) : min(one(x), sinpi(x)/(pi*x))
# largest |x| for which the degree-4 Taylor polynomial below is accurate to eps(T)
_sinc_threshold(::Type{Float64}) = 0.001
_sinc_threshold(::Type{Float32}) = 0.05
@inline function _sinc(x::Union{T,Complex{T}}) where {T<:Union{Float32,Float64}}
    a = abs(real(x)) + abs(imag(x))
    return a < _sinc_threshold(T) ? @evalpoly(x^2, T(1), -T(pi)^2/6, T(pi)^4/120) : x isa Real && isinf(a) ? zero(x) : sinpi(x)/(pi*x)
end
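
Checked against the examples from the original report, the polynomial branch now rounds correctly, since the x^2 term underflows against 1:

sinc(1e-40)   # 1.0 (the old implementation returned 1.0000000000000002)
sinc(1e-80)   # 1.0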

The cosc case will take more work, and will be difficult to make fully generic.

petvana commented 4 years ago

I think

sinc(x::Number) = iszero(x) ? one(float(x)) : sinpi(x)/(pi*x)
sinc(x::Integer) = iszero(x) ? one(x) : zero(x)
sinc(x::Real) = iszero(x) ? one(float(x)) : isinf(x) ? zero(float(x)) : min(one(float(x)), sinpi(x)/(pi*x))

should probably suffice for sinc if we just want to eliminate the > 1.0 roundoff errors, though it wouldn't hurt to add optimized FloatXX methods.

This is still problematic because it returns an incorrect value for very small arguments. The Taylor version obviously solves this.

sinc2(x::Real) = iszero(x) ? one(float(x)) : isinf(x) ? zero(float(x)) : min(one(float(x)), sinpi(x)/(pi*x))
sinc2(5.07138898934e-313)

produces 0.9999999999968989
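
For what it's worth, the argument here is subnormal, so pi*x carries far fewer than the usual 53 significand bits and the quotient sinpi(x)/(pi*x) inherits that rounding error:

x = 5.07138898934e-313
issubnormal(x)        # true
issubnormal(pi * x)   # true: the product has reduced precision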

oscardssmith commented 4 years ago

for cosc, -x*pi^2/3+x^3*pi^4/30-x^5*pi^6/840 works for Float32 in (-.08,.08) and for Float64 in (-.003,.003). Dropping a degree from the Float32 version probably isn't worth it, as it significantly shrinks the interval over which the polynomial is acceptably accurate.

stevengj commented 4 years ago

This is still problematic because it returns incorrect value for very small values.

It's still "correct" in the sense of having a small relative error, it's just not exactly rounded. So it still seems fine to me as a fallback, but obviously it's better to use the Taylor-series version for the common cases of single and double precision.

stevengj commented 4 years ago

for cosc, -x*pi^2/3+x^3*pi^4/30-x^5*pi^6/840 works for Float32 in (-.08,.08) and for Float64 in (-.003,.003).

For 0.004 the current cosc implementation still has an error a bit larger than I would like:

julia> Float64((cosc(0.004) - cosc(big(0.004))) / cosc(big(0.004)))
-5.645431816105822e-13

so I suspect we'll need to go to one higher degree for Float64.

oscardssmith commented 4 years ago

Then the next term in the series is x^7*pi^8/45360. The full sequence of denominators is https://oeis.org/A174549 ((2n-1)! + (2n)!) if we want to do fancy dynamic stuff.
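
Tabulating the first few denominators confirms they match the terms quoted so far:

# OEIS A174549: a(n) = (2n-1)! + (2n)! = (2n+1)*(2n-1)!
[factorial(2n - 1) + factorial(2n) for n in 1:4]   # [3, 30, 840, 45360]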

stevengj commented 4 years ago

Here is a generic routine that should work (albeit relatively slowly) for any type, including BigFloat:

cosc(x::Number) = _cosc(float(x))
function _cosc(x::Number)
    if abs(x) < 0.5
        # generic Taylor series: π ∑ₙ (-1)^n (πx)^{2n-1}/a(n) where
        # a(n) = (1+2n)*(2n-1)! (= OEIS A174549)
        s = (term = -(π*x))/3
        π²x² = term^2
        ε = eps(abs(term)) # error threshold to stop sum
        n = 1
        while true
            n += 1
            term *= π²x²/((1-2n)*(2n-2))
            s += (δs = term/(1+2n))
            abs(δs) ≤ ε && break
        end
        return π*s
    else
        return x isa Real && isinf(x) ? zero(x) : ((pi*x)*cospi(x)-sinpi(x))/((pi*x)*x)
    end
end
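
A quick sanity check: for x = 1e-20 the loop stops after the first correction term, returning the leading -π²x/3 that the naive formula cancels away, in agreement with the BigFloat value quoted earlier:

_cosc(1e-20)   # ≈ -3.28986813369645e-20, matching cosc(big(1e-20)) above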

Now I just need to add optimized _cosc methods for Float32 and Float64.

stevengj commented 4 years ago

Here are the Float32 and Float64 optimized versions:

# hard-code Float64/Float32 Taylor series, with coefficients
#  Float64.([(-1)^n*big(pi)^(2n)/((2n+1)*factorial(2n-1)) for n = 1:6])
_cosc(x::Union{Float64,ComplexF64}) =
    abs(real(x))+abs(imag(x)) < 0.14 ? x*@evalpoly(x^2, -3.289868133696453, 3.2469697011334144, -1.1445109447325053, 0.2091827825412384, -0.023460810354558236, 0.001781145516372852) : 
    x isa Real && isinf(x) ? zero(x) : ((pi*x)*cospi(x)-sinpi(x))/((pi*x)*x)
_cosc(x::Union{Float32,ComplexF32}) =
    abs(real(x))+abs(imag(x)) < 0.26 ? x*@evalpoly(x^2, -3.289868f0, 3.2469697f0, -1.144511f0, 0.20918278f0) : 
    x isa Real && isinf(x) ? zero(x) : ((pi*x)*cospi(x)-sinpi(x))/((pi*x)*x)
stevengj commented 4 years ago

I'll try to put together a PR this weekend.

N3N5 commented 4 years ago

Then move the original implementation to FastMath?

petvana commented 4 years ago

Then move the original implementation to FastMath?

I wouldn't do so. The speed-up of sinc is not that significant, and cosc even seems to be faster than the original (Base) version. Note also that the new version is significantly faster only near zero.

julia> x = rand(10000);

julia> @btime sinc.(x);
  177.529 μs (5 allocations: 78.27 KiB)
julia> @btime Base.sinc.(x);
  171.738 μs (5 allocations: 78.27 KiB)

julia> @btime cosc.(x);
  252.226 μs (5 allocations: 78.27 KiB)
julia> @btime Base.cosc.(x);
  292.728 μs (5 allocations: 78.27 KiB)

julia> @btime sin.(x);
  69.210 μs (5 allocations: 78.27 KiB)
stevengj commented 4 years ago

the new version is significantly faster only near zero.

Sorry if I wasn't clear — near zero, the new implementation is faster for both sinc and cosc for FloatXX (and ComplexFXX):

julia> x = rand(Float64, 100)/1000; xf = Float32.(x);

julia> @btime sum(Base.sinc, $x); @btime sum(sinc, $x);
  1.278 μs (0 allocations: 0 bytes)
  403.865 ns (0 allocations: 0 bytes)

julia> @btime sum(Base.sinc, $xf); @btime sum(sinc, $xf);
  992.455 ns (0 allocations: 0 bytes)
  407.615 ns (0 allocations: 0 bytes)

julia> @btime sum(Base.cosc, $x); @btime sum(cosc, $x);
  2.655 μs (0 allocations: 0 bytes)
  425.477 ns (0 allocations: 0 bytes)

julia> @btime sum(Base.cosc, $xf); @btime sum(cosc, $xf);
  2.131 μs (0 allocations: 0 bytes)
  405.568 ns (0 allocations: 0 bytes)

So there should be no reason to use the old version.

stevengj commented 4 years ago

I have to admit that the "cosc function" is new to me, and I wonder how widespread this terminology is.

So far, the only references I can find for our definition of cosc trace to these 1999 seminar notes. Are there any other references? @JeffBezanson, where did you take this definition from when you added it in 2010?

Other definitions include:

That being said, d(sinc)/dx seems to be the most broadly useful definition, now that differentiable programming is all the rage, especially because the naive derivative implementation is numerically unstable and a correct implementation is nontrivial.

JeffBezanson commented 4 years ago

I don't remember, but it's possible I got it from those very notes — I probably learned about it in the context of windowing functions for image reconstruction. I'm pretty sure we used the derivative definition in the MRI lab.

petvana commented 4 years ago

Just for the record, I first came across sinc in a recent paper, A novel formalisation of the Markov-Dubins problem, published at a robotics conference this year. It makes it possible to model both straight and turning trajectory segments with a single equation.

stevengj commented 4 years ago

@petvana, the sinc function is totally standard (modulo two conventions about π factors) — it's only cosc that I was wondering about.