Technically, this is not a numerical "instability" (which typically refers to relative errors growing arbitrarily large for well-conditioned functions). But it is not giving an exactly rounded result for small arguments, and I agree that it should be possible to do better here.
Another problem is that the current implementation is not always type-stable:
julia> sinc(3//4)
0.3001054387190354
julia> sinc(0//4)
1//1
More generally, as long as we are computing a precision-dependent threshold, one could switch to the Taylor expansion for sufficiently small x, since this is probably slightly faster than sinpi(x)/(pi*x).
sinc(x) = abs(x) < sincthreshold(x) ? @evalpoly(x^2, 1, -pi^2/6, pi^4/120) : isinf(x) ? zero(x) : sinpi(x)/(pi*x)
with some modifications to make this type stable. As you say, determining the threshold is a bit annoying to do in a generic way. For floating-point x, it is determined by abs(x)^6 * pi^6/5040 < eps(typeof(x)). I suppose we could use a generated function for sincthreshold.
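For concreteness, a rough sketch of such a threshold helper (hypothetical, keyed on the type rather than the value, and derived directly from the bound above rather than taken from any actual implementation):
# Hypothetical sketch, not an actual implementation: the largest |x| for which the
# first omitted Taylor term, pi^6*x^6/5040, stays below eps(T), so the degree-2
# polynomial in x^2 is accurate to roughly machine precision.
sincthreshold(::Type{T}) where {T<:AbstractFloat} = T((5040 * eps(T) / pi^6)^(1/6))
sincthreshold(Float64)   # ≈ 0.003
sincthreshold(Float32)   # ≈ 0.09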
This is exactly what I was thinking of, but I don't know yet how to write a generated and generic-enough function for that. Note that the problem also seems to affect the cosc function (which is also defined as a division).
Note that a minimax function will likely work over a larger range at acceptable accuracy than a Taylor expansion.
You mean, something like sinc(x) = ifelse(iszero(x), one(x), min(one(x), sinpi(x)/(pi*x)))?
@petvana, thanks for pointing that out; I didn't even realize that we had a cosc function. The current implementation of cosc is much worse than that of sinc — it really is numerically unstable, suffering from catastrophic cancellation near zero:
julia> cosc(1e-20)
0.0
julia> cosc(big(1e-20))
-3.289868133696452692506325852996527191782335372598840071672408487965980404206592e-20
i.e. it lost all of the significant digits.
In @JeffBezanson's defense, this cosc implementation was added way back in 1917197353051c65929126b7afd2b0c38dbb93ea, when Julia files still ended with .j; I think floating-point arithmetic hadn't been invented yet 😉.
@stevengj sorry, that was unclear. I meant a minimax polynomial. Check out remez.jl for more details, but basically, given a range and a degree, it gives the polynomial that has the least error over the entire range.
@oscardssmith, ah yes, I see. Yes, I'm familiar with minimax polynomials — more generally one can use minimax rational functions — from other special functions that I've implemented. The main difficulty with a minimax polynomial, however, is that then the whole polynomial is precision-dependent, whereas with a Taylor series only the cutoff is precision-dependent (and alternatively, for arbitrary precision, you can loop until the terms decay sufficiently). Also, in cases like cosc where you are approaching a zero of the function, you almost always want a Taylor series in order to maintain a small relative error arbitrarily close to the zero.
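To spell that out with the numbers above (an illustrative aside, not from the thread): near x = 0 we have cosc(x) ≈ -pi^2*x/3, so even the leading Taylor term alone has a relative error of order x^2 no matter how close x is to zero, whereas the subtractive formula loses every digit:
x = 1e-20
-pi^2/3 * x   # ≈ -3.2898681e-20, agreeing with the BigFloat result above to Float64 precision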
In the case of sinc and cosc, therefore, where we only need to improve accuracy near x=0, I suspect that a Taylor-series approach is more important.
But it will be tricky to make the implementations generic to different precisions (while still being fast) in any case. I suppose that we could make the implementation fast if typeof(float(one(real(x)))) is FloatXX, and then fall back to a slower generic implementation otherwise.
Btw, function sinc(x)::Float64 is used (by a happy coincidence) as an example in the documentation, types.md. I would select a better example after this discussion ...
I think
sinc(x::Number) = iszero(x) ? one(float(x)) : sinpi(x)/(pi*x)
sinc(x::Integer) = iszero(x) ? one(x) : zero(x)
sinc(x::Real) = iszero(x) ? one(float(x)) : isinf(x) ? zero(float(x)) : min(one(float(x)), sinpi(x)/(pi*x))
should probably suffice for sinc if we just want to eliminate the > 1.0 roundoff errors, though it wouldn't hurt to add optimized FloatXX methods.
Here's an implementation that speeds up the Float32 and Float64 cases by a factor of 2 or more for small arguments, while remaining type-stable (and ≤ 1) for other well-behaved types.
sinc(x::Number) = _sinc(float(x))
sinc(x::Integer) = iszero(x) ? one(x) : zero(x)
_sinc(x::Number) = iszero(x) ? one(x) : x isa Real && isinf(x) ? zero(x) : min(one(x), sinpi(x)/(pi*x))
_sinc_threshold(::Type{Float64}) = 0.001
_sinc_threshold(::Type{Float32}) = 0.05
@inline function _sinc(x::Union{T,Complex{T}}) where {T<:Union{Float32,Float64}}
a = abs(real(x)) + abs(imag(x))
return a < _sinc_threshold(T) ? @evalpoly(x^2, T(1), -T(pi)^2/6, T(pi)^4/120) : x isa Real && isinf(a) ? zero(x) : sinpi(x)/(pi*x)
end
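As a quick sanity check of that cutoff (my own illustrative snippet, not part of the proposal), the two branches can be compared right at the Float64 threshold; if the bound is right, their relative difference should be on the order of eps(Float64):
x = 0.001   # at the proposed Float64 cutoff
taylor = @evalpoly(x^2, 1.0, -pi^2/6, pi^4/120)
direct = sinpi(x)/(pi*x)
abs(taylor - direct)/direct   # expected to be within a few times eps(Float64)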
The cosc case will take more work, and will be difficult to make fully generic.
I think
sinc(x::Number) = iszero(x) ? one(float(x)) : sinpi(x)/(pi*x)
sinc(x::Integer) = iszero(x) ? one(x) : zero(x)
sinc(x::Real) = iszero(x) ? one(float(x)) : isinf(x) ? zero(float(x)) : min(one(float(x)), sinpi(x)/(pi*x))
should probably suffice for sinc if we just want to eliminate the > 1.0 roundoff errors, though it wouldn't hurt to add optimized FloatXX methods.
This is still problematic because it returns an incorrect value for very small arguments. The Taylor version obviously solves this.
sinc2(x::Real) = iszero(x) ? one(float(x)) : isinf(x) ? zero(float(x)) : min(one(float(x)), sinpi(x)/(pi*x))
sinc2(5.07138898934e-313)
produces
0.9999999999968989
For cosc, -x*pi^2/3 + x^3*pi^4/30 - x^5*pi^6/840 works for Float32 in (-.08, .08) and for Float64 in (-.003, .003). Dropping a degree from the Float32 version probably isn't worth it, as it significantly decreases the convergence radius.
This is still problematic because it returns an incorrect value for very small arguments.
It's still "correct" in the sense of having a small relative error, it's just not exactly rounded. So it still seems fine to me as a fallback, but obviously it's better to use the Taylor-series version for the common cases of single and double precision.
For cosc, -x*pi^2/3 + x^3*pi^4/30 - x^5*pi^6/840 works for Float32 in (-.08, .08) and for Float64 in (-.003, .003).
For 0.004, the current cosc implementation still has an error a bit larger than I would like:
julia> Float64((cosc(0.004) - cosc(big(0.004))) / cosc(big(0.004)))
-5.645431816105822e-13
so I suspect we'll need to go to one higher degree for Float64.
Then the next term in the series is x^7*pi^8/45360. The full sequence of denominators is https://oeis.org/A174549 ((2*n-1)! + (2*n)!) if we want to do fancy dynamic stuff.
Here is a generic routine that should work (albeit relatively slowly) for any type, including BigFloat:
cosc(x::Number) = _cosc(float(x))

function _cosc(x::Number)
    if abs(x) < 0.5
        # generic Taylor series: π ∑ (-1)^n (πx)^{2n-1}/a(n) where
        # a(n) = (1+2n)*(2n-1)! (= OEIS A174549)
        s = (term = -(π*x))/3
        π²x² = term^2
        ε = eps(abs(term)) # error threshold to stop sum
        n = 1
        while true
            n += 1
            term *= π²x²/((1-2n)*(2n-2))
            s += (δs = term/(1+2n))
            abs(δs) ≤ ε && break
        end
        return π*s
    else
        return x isa Real && isinf(x) ? zero(x) : ((pi*x)*cospi(x)-sinpi(x))/((pi*x)*x)
    end
end
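A brief usage illustration (mine, not from the thread): because the stopping criterion is based on eps of the working type, the same loop adapts to whatever precision it is handed:
setprecision(BigFloat, 512) do
    cosc(BigFloat(1)/3)   # keeps summing terms until they drop below the 512-bit epsilon
end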
Now I just need to add optimized _cosc methods for Float32 and Float64.
Here are the Float32 and Float64 optimized versions:
# hard-code Float64/Float32 Taylor series, with coefficients
# Float64.([(-1)^n*big(pi)^(2n)/((2n+1)*factorial(2n-1)) for n = 1:6])
_cosc(x::Union{Float64,ComplexF64}) =
    abs(real(x))+abs(imag(x)) < 0.14 ? x*@evalpoly(x^2, -3.289868133696453, 3.2469697011334144, -1.1445109447325053, 0.2091827825412384, -0.023460810354558236, 0.001781145516372852) :
    x isa Real && isinf(x) ? zero(x) : ((pi*x)*cospi(x)-sinpi(x))/((pi*x)*x)

_cosc(x::Union{Float32,ComplexF32}) =
    abs(real(x))+abs(imag(x)) < 0.26 ? x*@evalpoly(x^2, -3.289868f0, 3.2469697f0, -1.144511f0, 0.20918278f0) :
    x isa Real && isinf(x) ? zero(x) : ((pi*x)*cospi(x)-sinpi(x))/((pi*x)*x)
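A quick consistency check one could run (illustrative, not from the thread): inside the polynomial window, the optimized method should agree with the generic Taylor loop evaluated in BigFloat to within a few ulps:
x = 0.1                                    # inside the 0.14 cutoff for Float64
abs(_cosc(x) - Float64(_cosc(big(x)))) / abs(_cosc(x))   # expected to be within a few times eps(Float64)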
I'll try to put together a PR this weekend.
Then move the original implementation to FastMath?
I wouldn't do so. The speed-up of sinc is not so significant, and cosc seems to be even faster than the original (Base) version. Note also that the original version is significantly faster only near zero.
julia> x = rand(10000);
julia> @btime sinc.(x);
177.529 μs (5 allocations: 78.27 KiB)
julia> @btime Base.sinc.(x);
171.738 μs (5 allocations: 78.27 KiB)
julia> @btime cosc.(x);
252.226 μs (5 allocations: 78.27 KiB)
julia> @btime Base.cosc.(x);
292.728 μs (5 allocations: 78.27 KiB)
julia> @btime sin.(x);
69.210 μs (5 allocations: 78.27 KiB)
the original version is significantly faster only near zero.
Sorry if I wasn't clear — near zero, the new implementation is faster for both sinc and cosc for FloatXX (and ComplexFXX):
julia> x = rand(Float64, 100)/1000; xf = Float32.(x);
julia> @btime sum(Base.sinc, $x); @btime sum(sinc, $x);
1.278 μs (0 allocations: 0 bytes)
403.865 ns (0 allocations: 0 bytes)
julia> @btime sum(Base.sinc, $xf); @btime sum(sinc, $xf);
992.455 ns (0 allocations: 0 bytes)
407.615 ns (0 allocations: 0 bytes)
julia> @btime sum(Base.cosc, $x); @btime sum(cosc, $x);
2.655 μs (0 allocations: 0 bytes)
425.477 ns (0 allocations: 0 bytes)
julia> @btime sum(Base.cosc, $xf); @btime sum(cosc, $xf);
2.131 μs (0 allocations: 0 bytes)
405.568 ns (0 allocations: 0 bytes)
So there should be no reason to use the old version.
I have to admit that the "cosc function" is new to me, and I wonder how widespread this terminology is? So far, the only references I can find for our definition of cosc trace to these 1999 seminar notes. Are there any other references? @JeffBezanson, where did you take this definition from when you added it in 2010?
Other definitions include:
This 2014 paper defines cosc(x) as the Hilbert transform of sinc(x), i.e. cosc(x) = (1-cospi(x))/(π*x) = 2sinpi(x/2)^2 / (π*x) (where the latter formulation is free from cancellation error), citing a 1989 paper (that also defines a "zinc" function).
This paper defines cosc(x) as the obvious but uninteresting cos(x)/x.
That being said, d(sinc)/dx seems to be the most broadly useful definition, now that differentiable programming is all the rage, especially because the naive derivative implementation is numerically unstable and a correct implementation is nontrivial.
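To make that relation concrete (my aside): Julia's cosc is documented as the derivative of sinc with the normalized convention sinc(x) = sin(pi*x)/(pi*x), which is easy to spot-check away from zero with a central difference:
x = 0.3
(sinc(x + 1e-6) - sinc(x - 1e-6)) / 2e-6   # ≈ cosc(x), up to ~1e-12 finite-difference truncation
cosc(x)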
I don't remember, but it's possible I got it from those very notes --- I probably learned about it in the context of windowing functions for image reconstruction. I'm pretty sure we used the derivative definition in the MRI lab.
Just for the record, I first encountered sinc in a recent paper, "A novel formalisation of the Markov-Dubins problem", published at a robotics conference this year. It makes it possible to model both straight and turning trajectory segments with a single equation.
@petvana, the sinc function is totally standard (modulo two conventions about π factors) — it's only cosc that I was wondering about.
The main motivation for using the sinc function is its nice, smooth behavior around zero. However, the current implementation can produce results greater than 1 close to zero, and the results are not even monotonic.
A possible solution is simple:
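(The snippet itself is not reproduced here; a sketch of the kind of threshold-based fix being described, with a hypothetical name and the 1e-10 constant referred to below, might look like this.)
sinc_fixed(x::Real) = abs(x) < 1e-10 ? one(float(x)) : sinpi(x)/(pi*x)   # hypothetical name, illustrative only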
However, this is not general enough for various data types, because the constant 1e-10 does not work for more precise data types.