There is a digamma for complex argument in the Naval Surface Warfare Center library, which a Julia implementation could be based on, but no trigamma or general polygamma.
Some references on computing the polygamma function for complex arguments can be found in Lozier and Olver, although when I look at the cited references, it seems that most of them are only for the gamma or digamma function of complex arguments. I'm not seeing any recent publications on this, but these continued-fraction expansions may be helpful.
There is also the CPSIPG program. And this technical report, if you can find it (or order it through your library).
K. S. Kölbig, "Programs for computing the logarithm of the gamma function, and the digamma function, for complex argument," Computer Phys. Commun. vol. 4, pp. 221–226 (1972), uses the following formula to compute the digamma function (based on the Stirling series of ln Γ):

$$\psi(z) = \ln z - \frac{1}{2z} - \sum_{n=1}^{K} \frac{B_{2n}}{2n\,z^{2n}} + R_K$$

where the $B_{2n}$ are the Bernoulli numbers, the $R_K$ term is a remainder that is dropped, and $z = x + iy$. This also seems to be essentially the algorithm employed by SLATEC's CPSI routine.
It seems like it would be straightforward to write down a recurrence for the n-th derivative of this expression in order to obtain the general polygamma function, especially since it looks like it will converge faster for higher derivatives (i.e. you will need smaller and smaller K).
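For reference, differentiating that expansion term by term (my quick derivation, so double-check it before relying on it) gives the m-th derivative as

$$\psi^{(m)}(z) \approx (-1)^{m-1}\left[\frac{(m-1)!}{z^m} + \frac{m!}{2z^{m+1}} + \sum_{n=1}^{K} \frac{(2n+m-1)!}{(2n)!}\,\frac{B_{2n}}{z^{2n+m}}\right].$$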
The following code computes the digamma function accurately (as far as I can tell from comparison with Wolfram Alpha) to double precision in the complex plane:
import Base: digamma, Math.@horner

function digamma(z::Complex{Float64})
    # Based on eq. (12), without looking at the accompanying source code, of:
    # K. S. Kölbig, "Programs for computing the logarithm of the gamma function,
    # and the digamma function, for complex argument," Computer Phys. Commun.
    # vol. 4, pp. 221–226 (1972).
    x, y = reim(z)
    if x < 0 # reflection formula
        ψ = -π * cot(π*z)
        x = 1 - x; y = -y
        z = Complex(x, y)
    else
        ψ = zero(z)
    end
    if x < 7 # shift using recurrence formula
        n = 7 - ifloor(x)
        for ν = 0:n-1
            ψ -= inv(z + ν)
        end
        z = Complex(x + n, y)
    end
    t = inv(z)
    ψ += log(z) - 0.5*t
    t *= t # 1/z^2
    ψ -= t * @horner(t,0.08333333333333333,-0.008333333333333333,0.003968253968253968,-0.004166666666666667,0.007575757575757576,-0.02109279609279609,0.08333333333333334)
end

digamma{T <: Integer}(z::Complex{T}) = digamma(float(z))
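A quick sanity check against known closed forms (hypothetical usage, assuming the definitions above have been loaded):

digamma(1.0 + 0im)  # ψ(1) = -γ ≈ -0.5772156649015329
digamma(0.5 + 0im)  # ψ(1/2) = -γ - 2 ln 2 ≈ -1.9635100260214235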
Note that I used a slightly different version of the reflection formula for real(z) < 0 than the one in Kölbig: it uses the recurrence formula to save a 1/z term.
The @horner coefficients are computed from:
const A000367 = map(BigInt, split("1,1,-1,1,-1,5,-691,7,-3617,43867,-174611,854513,-236364091,8553103,-23749461029,8615841276005,-7709321041217,2577687858367,-26315271553053477373,2929993913841559,-261082718496449122051", ","))
const A002445 = [1,6,30,42,30,66,2730,6,510,798,330,138,2730,6,870,14322,510,6,1919190,6,13530]
const bernoulli = map((a,b) -> a//b, A000367, A002445) # even-index Bernoulli numbers
const bernoulli64 = float64(bernoulli)
println(bernoulli64[2:8] ./ (2*(1:7)))
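The printed values should match the coefficient list passed to @horner in the digamma code above.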
Gotta love that Horner macro. If we're not exporting that thing, we should.
As described in Knuth TAOCP vol. 2, sec. 4.6.4, there is actually an algorithm even better than Horner's rule for evaluating polynomials p(z) at complex arguments (but with real coefficients): you can save almost a factor of two for high degrees. It is so complicated that it is basically only usable via code generation, so it would be especially nice to modify the @horner macro to switch to this for complex arguments.
(I should have a general polygamma function implemented soon if nothing unexpected happens.)
Is that the right link to the Horner alternative?
Yes. There is a description of the algorithm near the end of the thread. I couldn't find a better description online.
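For reference, my reading of the scheme (a sketch; check it against Knuth before trusting it): to evaluate $p(z) = p_0 + p_1 z + \cdots + p_d z^d$ with real $p_j$ at complex $z = x + iy$, precompute $r = 2x$ and $s = x^2 + y^2$, initialize $a_{d-1} = p_d$ and $b_{d-1} = p_{d-1}$, and iterate

$$a_j = b_{j+1} + r\,a_{j+1}, \qquad b_j = p_j - s\,a_{j+1}, \qquad j = d-2, \ldots, 0,$$

finishing with $p(z) = a_0 z + b_0 = (a_0 x + b_0) + i\,a_0 y$. Since $z$ satisfies $z^2 = rz - s$, this is just synthetic division by the real quadratic $z^2 - rz + s$, and all of the intermediate arithmetic is real.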
Here's a @horner-like macro that implements Knuth's algorithm. As you can see (try changing macro to function to see the resulting expression), this algorithm is extremely annoying to use efficiently (i.e. inlined) without code generation:
# evaluate p[1] + z * (p[2] + z * (....)) for complex z and real p.
# Instead of Horner's rule, we use the algorithm in Knuth TAOCP
# vol. 2, sec. 4.6.4, which has lower arithmetic costs. Assumes
# degree(p) > 0 and that p consists of real numeric literals.
macro chorner(z, p...)
    a = p[end]
    b = p[end-1]
    as = {}
    for i = length(p)-2:-1:1
        ai = symbol(string("a", i))
        push!(as, :($ai = $a))
        a = :($b + r*$ai)
        b = :($(p[i]) - s * $ai)
    end
    ai = :a0
    push!(as, :($ai = $a))
    Expr(:block,
         :(x = real($(esc(z)))),
         :(y = imag($(esc(z)))),
         :(r = x + x),
         :(s = x*x + y*y),
         as...,
         :(Complex($ai * x + $b, $ai * y)))
end
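A quick hand-checkable example (hypothetical usage): for p(z) = 1 + 2z + 3z² at z = 1 + 2im, we have z² = -3 + 4im, so p(z) = 1 + (2 + 4im) + (-9 + 12im) = -6 + 16im:

@chorner(1.0 + 2.0im, 1.0, 2.0, 3.0)  # expect -6.0 + 16.0im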
This is probably less efficient for non-complex arguments though. Or does the fact that the imaginary part is zero allow code generation to eliminate any extra code?
My digamma function above is almost 4x faster with the @chorner macro compared to @horner. I had no idea the speed difference was that great! (Because, without code generation, I've never bothered using Knuth's algorithm before.)
Although part of the speed difference could be due to the fact that @chorner avoids Complex arithmetic entirely, and probably more stuff can be kept in registers just because of type issues.
I don't think you'd want to use it for real arguments, though: as written, it always produces a Complex result, and I don't think the compiler could be smart enough to eliminate the extra work it involves (naïvely, it looks like about double the number of operations for real data compared to vanilla Horner).
Hmm, it would be nice to be able to support polygamma(n, z) for arbitrary complex orders n. I think I can do it, except for real(z) < 0: I'm not sure how to compute fractional derivatives of cot (without using the polygamma function).
That's a phenomenal speed up. Good code generation is pretty amazing. I wonder if there's some way to make the Horner macro work for complex or real arguments without the user needing to know. Maybe generate both versions with a conditional on the type? Type inference should typically eliminate the branch.
Type-inference apparently does not currently eliminate the branch, unfortunately; see #7060.
Hmm, looks like Julia's polygamma(m, x) throws a domain error for x < 0, which is a little bogus. I guess that way you avoid the annoyance of computing the m-th derivative of a cotangent, but it doesn't seem like correct behavior.
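For what it's worth, the reflection formula that would be needed (obtained by differentiating the digamma reflection formula m times) is

$$(-1)^m \psi^{(m)}(1-x) - \psi^{(m)}(x) = \pi \frac{d^m}{dx^m} \cot(\pi x),$$

so negative arguments really do require the m-th derivative of the cotangent.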
It is based on the SLATEC function dpsifn written by Amos. I browsed through the paper and he doesn't discuss negative arguments.
Even for polynomial evaluation for real arguments, Knuth sec. 4.6.4 describes ways to save a few operations (or in some cases just to trade off multiplies for adds) compared to Horner for polynomials of degree 4 or higher. If we suppose that multiplication and addition have approximately the same cost (which is not a bad assumption these days), you can save 1 flop for evaluating polynomials of degree 5 or 6. Asymptotically, you can save 1/4 of the flops, but I'm not sure we care about @horner for degrees much more than 6. Again, it involves fairly complicated preprocessing of the coefficients and would be a pain without code generation, but may be worthwhile to implement in @horner.
As a warmup for a general polygamma function, here is a trigamma function that handles complex (and negative) arguments, and is 10x faster for real arguments than our current trigamma(x). It uses a modified version of @chorner that switches to @horner for real arguments.
import Base: Math.@horner

macro chorner(z, p...)
    a = p[end]
    b = p[end-1]
    as = {}
    for i = length(p)-2:-1:1
        ai = symbol(string("a", i))
        push!(as, :($ai = $a))
        a = :($b + r*$ai)
        b = :($(p[i]) - s * $ai)
    end
    ai = :a0
    push!(as, :($ai = $a))
    C = Expr(:block,
             :(x = real($(esc(z)))),
             :(y = imag($(esc(z)))),
             :(r = x + x),
             :(s = x*x + y*y),
             as...,
             :(Complex($ai * x + $b, $ai * y)))
    R = Expr(:macrocall, symbol("@horner"), z, p...)
    :(isa($(esc(z)), Real) ? $R : $C)
end
function trigamma2(z::Union(Float64,Complex{Float64}))
    # via the derivative of the Kölbig digamma formulation
    x = real(z)
    if x < 0 # reflection formula
        return 9.869604401089358 * csc(π*z)^2 - # π² csc²(πz)
               trigamma2(1 - z)
    end
    ψ = zero(z)
    if x < 7 # shift using recurrence formula
        n = 7 - ifloor(x)
        ψ += inv(z)^2
        for ν = 1:n-1
            ψ += inv(z + ν)^2
        end
        z += n
    end
    t = inv(z)
    w = t * t # 1/z^2
    ψ += t + 0.5*w
    ψ += t*w * @chorner(w,0.16666666666666666,-0.03333333333333333,0.023809523809523808,-0.03333333333333333,0.07575757575757576,-0.2531135531135531,1.1666666666666667,-7.092156862745098)
end
trigamma2{T <: Integer}(z::Union(T,Complex{T})) = trigamma2(float(z))
@vectorize_1arg Number trigamma2
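A couple of hypothetical spot checks, using ψ₁(1) = π²/6 and the recurrence ψ₁(z) = ψ₁(z+1) + 1/z²:

trigamma2(1.0)   # expect ≈ 1.6449340668482264 (π²/6)
trigamma2(-0.5)  # expect ≈ 8.934802200544679 (π²/2 + 4)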
Also, it looks like our current polygamma(m, x) function may be extremely inaccurate for larger m. E.g. polygamma(20, 7.0) returns -7.84694453213623e-37, but Wolfram Alpha says the answer is about -4.64461602724054326256119881499858715254724853920972329364143....
The SciPy polygamma function returns -4.644616027240542, which is correct to machine precision.
Okay, I have a working polygamma replacement for both real and complex arguments that seems to be about 3x faster than our current polygamma for real arguments and appears to be much more accurate (e.g. it gives accurate results for order 20). Still working on support for arguments with negative real parts, and for non-integer order.
Hmm, it's somewhat disturbing that Mathematica and Maple don't agree on the value of polygamma(3.3, 12.1 + 1im). And my answer disagrees with both of them, so I'm not sure what the correct one is.
Oops. I can see that I have divided by the scaling factor instead of multiplying by it in the code for polygamma. The scale is one for digamma and trigamma, and hence it hasn't been noticed. See line 311. If you write

julia> (Base.Math.psifn(7.0,20,1,1)*gamma(21))[1]
4.644616027240543

you get the same value as SciPy.
Thanks @andreasnoack, that's good to know! I'll patch it now, although hopefully shortly I'll have a complete replacement.
Oddly, both Maple and Mathematica are giving real (and different!) results for polygamma(3.3, 12.1), even though the definition in terms of the Hurwitz zeta function in the literature seems to imply a complex result (which agrees with my result, hooray!). It is looking almost like both Mathematica and Maple are wrong (in different ways!) for non-integer order. Weird.
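(For concreteness: the standard integer-order identity is $\psi^{(m)}(z) = (-1)^{m+1}\, m!\, \zeta(m+1, z)$, with ζ the Hurwitz zeta function; if you extend that verbatim to non-integer order ν, the factor $(-1)^{\nu+1} = e^{\pm i\pi(\nu+1)}$ is genuinely complex, which is why a real argument like 12.1 should give a complex result.)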
@alanedelman points out that there are, unfortunately, multiple extensions of the polygamma function to arbitrary complex orders m, and it seems that Mathematica, at least, is using a different choice.
Probably the best thing to do for now is to continue to restrict polygamma to nonnegative integer orders m. Absent an application, it's hard to know which extension we should use, and I'm not sure how to compute Mathematica's extension efficiently.
Regarding evaluating polynomials more efficiently than Horner's rule: the transformed expression will in general evaluate to different values. Maybe it doesn't matter.
Compile-time reorganization of code is possible for all kinds of things -- not only for Horner, but e.g. for an FFT where there is some known collection of zero elements. Google "accurate polynomial evaluation" for some references. Basically, if you want to evaluate accurately near x = x0, rearrange as a Taylor series in (x - x0). You can also speed things up by splitting into parallel computations, e.g. even and odd exponents, which gives you two threads (see the sketch below).
RJF
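A minimal serial sketch of the even/odd split mentioned above (a hypothetical helper, just to show that the two Horner chains in t = x² are independent and so could run in parallel):

# Evaluate p(x) = c[1] + c[2]*x + ... + c[end]*x^(length(c)-1) by splitting
# into even and odd powers: p(x) = E(x^2) + x*O(x^2), where E and O are
# evaluated by independent Horner recurrences in t = x^2.
function evenodd_eval(c::Vector{Float64}, x::Float64)
    t = x*x
    e = 0.0  # accumulator for even-power coefficients c[1], c[3], ...
    o = 0.0  # accumulator for odd-power coefficients c[2], c[4], ...
    for k = length(c):-1:1
        if isodd(k)
            e = e*t + c[k]
        else
            o = o*t + c[k]
        end
    end
    return e + x*o
end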
This is a good point – one that I'm sure @stevengj is quite aware of. It's worth considering both accuracy and speed, but my impression is that these optimizations are not bad for accuracy – fewer operations tend to lead to better accuracy, since there are fewer opportunities for cancellation and round-off errors. The specifics matter quite a lot, of course.
The Knuth scheme for complex polynomial evaluation is based on the Goertzel algorithm, which has O(n^2) mean error growth. This is indeed worse than Horner, which is O(n). But these asymptotic considerations aren't very relevant for n < 10. The error growth also depends on the point z being evaluated – I think the worst case is for z on the unit circle with comparable coefficients, where the above-mentioned growth rates are achieved. Things should be much better in applications like the one here, where the terms have rapidly decreasing weight.
For large n >> 10 (evaluating polynomials of high degree), this difference is important, but then there are algorithms with even better error growth at some cost in arithmetic. The most famous case of high-degree polynomial evaluation is an FFT, where the typical algorithm has O(√(log n)) error growth.
We have complex gamma and real polygamma but not complex polygamma – or digamma or trigamma. If anyone wants to tackle this, we really ought to have these. While we're at it, it might be good to readjust the definitions so that if you provide polygamma for a type, you get digamma and trigamma "for free". Currently you have to explicitly add methods for those too for any new type.