Closed: andreasvarga closed this issue 4 years ago.
The gcd calculation bumps into floating point issues quite quickly. If you have some ideas how to modify the current algorithm (the truncate call perhaps?) why not try them out and put in a PR. I'm not sure how to do so. At one point I tried implementing an approximate gcd algorithm based on https://doi.org/10.1090/S0025-5718-04-01692-8, but that code has some bitrot now and doesn't seem to help here.
function gcd1(a::Vector{T}, b::Vector{S}; atol::Real = 0, rtol::Real = Base.rtoldefault(real(T)),) where {T <: Number, S <: Number }
# Greatest common divisor of two polynomials.
# d = gcd1(a,b; atol = 0, rtol)
# Compute the greatest common divisor `d` of two polynomials `a` and `b` using Blankinship's algorithm.
na1 = findlast(!iszero,a) # degree(a) + 1
na1 === nothing && (na1 = 0)
nb1 = findlast(!iszero,b) # degree(b) + 1
nb1 === nothing && (nb1 = 0)
R = eltype(one(T)/one(S))
na1 <= 0 && (return ones(R,1))
nb1 <= 0 && (return ones(R,1))
a1, b1, = promote(a[1:na1], b[1:nb1], R[])
max_coeff = max(maximum(abs.(a1)),maximum(abs.(b1)))
tol = max_coeff * rtol + atol
while na1 >= 1
na1 < nb1 && ((a1, b1, na1, nb1) = (b1, a1, nb1, na1))
@inbounds for i in na1:-1:nb1
s = -a1[i] / b1[nb1]
@inbounds for j in 1:nb1-1
a1[i-nb1+j] += b1[j] * s
end
end
na1 = findlast(t -> (abs(t) > tol),a1[1:nb1-1])
na1 === nothing && (na1 = 0)
a1 = a1[1:na1]
end
return nb1 == 1 ? [one(R)] : b1
end
For my own purposes to manipulate rational matrices, I implemented a version of `gcd` inspired by your code in `divrem` and an earlier implementation which I had in MATLAB. If you think it is helpful, you can include this code or part of it into the `gcd` routine in `Polynomials`. My code works only on the coefficients.
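A quick sanity check of the code above (a hypothetical session; coefficients are in ascending order, and the result is a GCD only up to a scalar multiple):

gcd1([2.0, 3, 1], [6.0, 5, 1])  # (x+1)(x+2) and (x+2)(x+3)
# should return [-4.0, -2.0], i.e. -2*(x+2), a scalar multiple of the exact GCD x+2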
Thanks! Let me have a look. At first glance it looks similar to using `chop` in place of `truncate`, which makes sense.
I implemented two new prototype GCDs, which produce better-accuracy GCDs.
The first is just an enhanced Euclidean algorithm, using some internal scaling:
using LinearAlgebra # for norm and rank below (and LAPACK in gcd2)

function gcd1(a::Vector{T}, b::Vector{S}; atol::Real = 0, rtol::Real = Base.rtoldefault(float(real(T)))) where {T <: Number, S <: Number }
# Greatest common divisor of two polynomials.
# d = gcd1(a,b; atol = 0, rtol)
# Compute the greatest common divisor `d` of two polynomials `a` and `b` using
# the Euclidean algorithm with scaling as in [1].
# [1] M.-T. Noda and T. Sasaki. Approximate GCD and its application to ill-conditioned
# algebraic equations. J. Comput. Appl. Math., 38:335–351, 1991.
na1 = findlast(!iszero,a) # degree(a) + 1
na1 === nothing && (na1 = 0)
nb1 = findlast(!iszero,b) # degree(b) + 1
nb1 === nothing && (nb1 = 0)
R = eltype(one(T)/one(S))
na1 <= 0 && (return ones(R,1))
nb1 <= 0 && (return ones(R,1))
a1, b1, = promote(a[1:na1], b[1:nb1], R[])
a1 = a1/norm(a1)
b1 = b1/norm(b1)
tol = atol == 0 ? rtol : atol
# determine the degree of GCD as the nullity of the Sylvester matrix
# this computation can be replaced by simply setting nd = 1, in which case the Sylvester matrix is not formed
nd = na1 + nb1 - 2 - rank([convmtx(a1,nb1-1) convmtx(b1,na1-1)], atol = tol)
nd == 0 && (return [one(R)])
while na1 > nd
na1 < nb1 && ((a1, b1, na1, nb1) = (b1, a1, nb1, na1))
sc = 1
@inbounds for i in na1:-1:nb1
s = -a1[i] / b1[nb1]
sc = max(sc,abs(s))
@inbounds for j in 1:nb1-1
a1[i-nb1+j] += b1[j] * s
end
end
a1 = a1/sc
na1 = findlast(t -> (abs(t) > tol),a1[1:nb1-1])
na1 === nothing && (na1 = 0)
a1 = a1[1:na1]
end
return nb1 == 1 ? [one(R)] : b1
end
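To illustrate the degree estimate via the nullity of the Sylvester matrix (a hypothetical check; `convmtx` is the helper defined below, and the example values are assumptions for illustration):

using LinearAlgebra
a1 = [2.0, 3, 1] / norm([2.0, 3, 1])  # (x+1)(x+2), normalized
b1 = [6.0, 5, 1] / norm([6.0, 5, 1])  # (x+2)(x+3), normalized
S = [convmtx(a1, 2) convmtx(b1, 2)]   # 4x4 Sylvester-type matrix
nd = 3 + 3 - 2 - rank(S)              # nullity of S = degree of the GCD = 1 here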
The second function is usually more robust and additionally computes the factors `v = a/d` and `w = b/d`.
function gcd2(a::Vector{T}, b::Vector{S}; atol::Real = 0, rtol::Real = Base.rtoldefault(float(real(T)))) where {T <: Number, S <: Number }
# Greatest common divisor of two polynomials.
# gcd2(a,b; atol = 0, rtol) -> (d, v, w)
# Compute the greatest common divisor `d` of two polynomials `a` and `b`, and
# the polynomials v and w such that a = d * v and b = d * w, using an SVD-based method
# adapted from [1] (without iterative accuracy refinement).
# [1] Z. Zeng, The numerical greatest common divisor of univariate polynomials,
# in: L. Gurvits, P. Pébay, J.M. Rojas, D. Thompson (Eds.),
# Randomization, Relaxation, and Complexity in Polynomial Equation Solving,
# Contemporary Mathematics, vol.556, AMS, 2011, pp.187–217.
na1 = findlast(!iszero,a) # degree(a) + 1
na1 === nothing && (na1 = 0)
nb1 = findlast(!iszero,b) # degree(b) + 1
nb1 === nothing && (nb1 = 0)
R = eltype(one(T)/one(S))
a1, b1, = promote(a[1:na1], b[1:nb1], R[])
na1 <= 0 && (return ones(R,1), a1, b1)
nb1 <= 0 && (return ones(R,1), a1, b1)
tol = atol == 0 ? rtol*max(maximum(abs.(a1)), max(1, maximum(abs.(b1)))) : atol
# determine the degree of GCD as the nullity of the Sylvester matrix
nd = na1 + nb1 - 2 - rank([convmtx(a1,nb1-1) convmtx(b1,na1-1)], atol = tol)
nd == 0 && (return [one(R)], a1, b1)
# determine [w; -v] from an orthogonal/unitary nullspace basis of dimension 1
# of a reduced Sylvester matrix using the last row of Vt from its (partial) SVD
wv = LAPACK.gesvd!('N', 'A', [convmtx(a1,nb1-nd) convmtx(b1,na1-nd)])[3][end,:]
eltype(wv) <: Complex && (wv = conj(wv))
v = -wv[nb1-nd+1:end]
w = wv[1:nb1-nd]
# determine the GCD d with high accuracy as the solution of a well-conditioned
# linear least-squares problem
d = convmtx(w,nd+1)\b1
return d, v, w
end
Both functions call `convmtx` to construct an appropriate convolution matrix (a similar function is also available in MATLAB).
function convmtx(v::AbstractVector{T}, n::Int) where T
# Convolution matrix.
# C = convmtx(v,n) returns the convolution matrix C for a vector v.
# If q is a column vector of length n, then C*q is the same as conv(v,q).
# Form C as the Toeplitz matrix
# C = Toeplitz([v; zeros(n-1)], [v[1]; zeros(n-1)]); the Toeplitz construction is inlined below
nv = length(v)-1
C = zeros(T, n+nv, n)
@inbounds for j = 1:n
C[j:j+nv,j] = v
end
return C
end
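As a round-trip check of `gcd2` (a hypothetical session; `conv` denotes plain polynomial multiplication, e.g. the helper posted later in this thread):

d, v, w = gcd2([2.0, 3, 1], [6.0, 5, 1])  # (x+1)(x+2) and (x+2)(x+3)
# d is proportional to [2, 1], i.e. to the exact GCD x + 2, and
conv(d, v) ≈ [2.0, 3, 1]  # a == d*v
conv(d, w) ≈ [6.0, 5, 1]  # b == d*w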
I performed a lot of tests with large-order randomly generated examples and several examples from the literature. For my purposes I am using `gcd2`, which is more accurate and better suited for LCM computation.
Nice! I can package this into a PR if you want, or you can. Just let me know.
I would appreciate it if you could do it.
Do you have any thoughts on this naming scheme:
- For gcd1: add a `scale=true` keyword argument to `gcd` to call gcd1. Maybe this should be `scale=:noda_sasaki` in case there are other scalings?
- For gcd2: call this `uvgcd`, following Zeng. (This returns more than `d`, so it should have a different name.)
- Or, maybe have a `uvgcd` and a keyword argument to `gcd` to distinguish returning the Euclidean algorithm, the EA with scaling, or the `d` from Zeng.

--J
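For concreteness, one hypothetical shape for such an entry point (the name `pgcd`, the `method` keyword, and the symbol values are all illustrative, not a settled API):

function pgcd(p::Vector, q::Vector; method::Symbol = :euclidean, kwargs...)
    method == :euclidean && return gcd1(p, q; kwargs...)  # scaled Euclidean (Noda-Sasaki)
    method == :uvgcd && return gcd2(p, q; kwargs...)[1]   # Zeng-style; keep only d
    throw(ArgumentError("unknown method $method"))
end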
I had this code in `PolynomialZeros` that is related. It needed some updating for the newer Polynomials. It works a little harder to identify the degree when there are multiplicities. Would you mind kicking it around to see if it works for your use cases? If so, it would be nice to merge the two.
module AGCD
using Polynomials
using LinearAlgebra
#using SparseArrays # didn't get boost I thought we might with JF
## This provides AGCD.agcd for finding an *approximate* GCD of two polynomials. The most common use in this package
## is to reduce a polynomial `p` to a square free polynomial `q=p/gcd(p, p')`.
## This isn't too accurate for higher order polys, but
## seems to be okay now for degree 20 or less
## Can we do better?
## Follow up on these:
# * [Winkler&Hasan](http://ac.els-cdn.com/S0377042712003135/1-s2.0-S0377042712003135-main.pdf?_tid=126266d0-ea31-11e6-a1a7-00000aab0f6b&acdnat=1486140896_2119601f37cf58ff0d3eefbc05fecbb4)
# * [LiLiuZhi](http://www.sciencedirect.com/science/article/pii/S0377042707000271?np=y&npKey=f5e3b7a1bc9583214348aad066170ea389212ebba443454938cacfdbf96177f8)
# * [Winkler](http://ml.dcs.shef.ac.uk/summer_school/ln.pdf)
# * [Matonardi&...](http://www.sciencedirect.com/science/article/pii/S0024379515005649)
# * [FarmerPhD](http://www.learninglink.bbk.ac.uk/site/assets/files/1025/farmer.pdf)
# * [WinklerLaoHasan](http://www.sciencedirect.com/science/article/pii/S0377042712000751)
# * [Sagraloff&...](https://arxiv.org/pdf/1308.4088v2.pdf)
# * [](https://arxiv.org/pdf/1207.0630v3.pdf)
## ------------------------
# we use vectors, not polynomials. Some of this if copied and modified from Polynomials.jl
# originally by @vtjnash
_degree(p::Vector) = length(p) - 1
function _polyval(p::Vector{T}, x::S) where {T, S}
R = promote_type(T,S)
y = convert(R, p[end])
for i in lastindex(p)-1:-1:1
y = p[i] + x * y
end
y
end
function _monic!(ps::Vector{T}) where T
lambda = iszero(ps[end]) ? one(T) : 1 / ps[end]
ps .= ps .* lambda
end
function _monic(p::Vector)
p1 = copy(p)
_monic!(p1)
p1
end
function _polymul(p::Vector{T}, q) where {T}
R = promote_type(T,eltype(q))
n = length(p)-1
m = length(q)-1
a = zeros(R, m+n+1)
for i = 1:n+1
for j = 1:m+1
a[i+j-1] += p[i] * q[j]
end
end
a
end
function _polyder(p::Vector{T}) where {T}
S = eltype(float(p[1]))
n = length(p)
a2 = Vector{S}(undef, n-1)
for i = 1:n-1
a2[i] = p[i+1] * i
end
a2
end
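# For orientation, these helpers act on ascending coefficient vectors,
# e.g. (hypothetical session):
#   _degree([2.0, 3, 1])          # 2, the degree of 2 + 3x + x^2
#   _polyval([2.0, 3, 1], 1.0)    # 6.0, Horner evaluation at x = 1
#   _polymul([1.0, 1], [2.0, 1])  # [2.0, 3.0, 1.0], i.e. (1 + x)(2 + x)
#   _polyder([2.0, 3, 1])         # [3.0, 2.0], the derivative 3 + 2x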
## ------------------------
## Special matrices
## p = [p_0, p_1, ..., p_n]
function cauchy_matrix_size(p::Vector, k::Integer)
n = length(p)
(n + k - 1, k)
end
function cauchy_matrix!(M, p::Vector{T}, k::Integer) where {T}
n = length(p)
for i in 1:k
M[(1:n) .+ (i-1), i] = p
end
end
function cauchy_matrix(p::Vector{T}, k::Integer) where {T}
n,m = cauchy_matrix_size(p, k)
out = zeros(T, n, m)
cauchy_matrix!(out, p, k)
out
end
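# Note: cauchy_matrix builds the same convolution matrix as convmtx earlier
# in the thread, so multiplying by it is polynomial multiplication.
# A quick hypothetical check:
#   q = [5.0, 7]
#   cauchy_matrix([1.0, 2, 1], 2) * q ≈ _polymul([1.0, 2, 1], q)  # true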
## Jacobian F(u,v,w) = [p,p'] is J(u,v,w)
function JF_size(u, v, w)
m, k, j = _degree(u), _degree(v), _degree(w)
n, l = m + k, m + j
ai, aj = cauchy_matrix_size(v, m + 1)
bi, bj = cauchy_matrix_size(u, k + 1)
di, dj = cauchy_matrix_size(w, m + 1)
fi, fj = cauchy_matrix_size(u, j + 1)
ci, cj = ai, fj
ei, ej = di, bj
(1 + ai + di, aj + bj + cj)
end
function JF!(M, u::Vector{T}, v, w) where {T}
m, k, j = _degree(u), _degree(v), _degree(w)
n, l = m + k, m + j
# JF_size should return these
ai, aj = cauchy_matrix_size(v, m + 1)
bi, bj = cauchy_matrix_size(u, k + 1)
di, dj = cauchy_matrix_size(w, m + 1)
fi, fj = cauchy_matrix_size(u, j + 1)
ci, cj = ai, fj
ei, ej = di, bj
M[1,1] = one(T)
cauchy_matrix!(view(M, 1 .+ (1:ai), 1:aj), v, m + 1)
cauchy_matrix!(view(M, 1 .+ (1:bi), aj .+ (1:bj)), u, k + 1)
# M[1 + (1:ci), aj + bj + (1:cj)] = zeros(T, n+1, m+1-j)
cauchy_matrix!(view(M, 1 + ai .+ (1:di), 1:dj), w, m + 1)
# M[1 + bi + (1:ei), dj + (1:ej)] = zeros(T, m + 1, n + 1 - j)
cauchy_matrix!(view(M, 1 + ci .+ (1:fi), dj + ej .+ (1:fj)), u, j + 1)
end
function JF(u::Vector{U}, v::Vector{V}, w::Vector{W}) where {U,V,W}
R = promote_type(U,V, W)
M = zeros(R, JF_size(u, v, w)...)
JF!(M, u, v, w)
M
end
# eqn (12) to get weights
function _weights(p::Vector{T},q) where {T}
n, m = length(p), length(q)
wts = ones(T, 1 + n + m)
for (i,pj) in enumerate(p)
wts[1+i] = min(one(real(T)), inv(abs(pj)))
end
for (i,qj) in enumerate(q)
wts[1 + n + i] = min(one(real(T)), inv(abs(qj)))
end
wts
end
# reduce residual by Gauss-Newton algorithm
# updates u,v,w; returns residual error and flag for convergence
function reduce_residual!(u,v,w, p::Vector{T}, q, wts, ρ) where {T}
m, n, l = map(_degree, (u, v, w))
# preallocate
#R = big(T)
R = T
A = zeros(R, JF_size(u, v, w)...)
b = zeros(R, 1 + length(p) + length(q))
inc = zeros(R, m + n + l + 3) #weighted_least_square(A, b, wts)
up,vp,wp = copy(u),copy(v),copy(w)
ρm, ρm_1 = one(T), residual_error(p,q,u,v,w, wts)
MAXSTEPS = 10
ctr = 0
flag = :not_converged
rho_0 = ρm_1
while ctr <= MAXSTEPS
ρm = agcd_update!(p, q, u,v,w, m, n, A, b, inc, up, vp, wp, wts, ρm_1)
if ρm < ρm_1
ctr += 1
ρm_1 = ρm
else
break
end
end
# threshold here is really important, but not clear what it should be.
# We empirically get better results---it seems---with sqrt(norm(p,2))
# though the recommendation is norm(u,2); might also make sense to include
# q in the computation
if ρm <= ρ * norm(u,2)
flag = :converged
end
if ρm < rho_0
end
return (ρm, flag)
end
## Lemma 4.1, (25)
## solve weighted least squares problem W*(Ax - b) = 0
function weighted_least_square(A, b, w)
W = diagm(0 => w)
(W * A) \ (W * b)
end
function weighted_least_square!(M, A, b, w)
W = diagm(0 => w)
M[:] .= (W * A) \ (W * b)
end
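# Aside: the dense diagm above could be replaced by Diagonal (already in scope
# via the module's `using LinearAlgebra`), which applies the row weights without
# materializing W. An equivalent sketch, with the hypothetical name
# weighted_least_square_diag:
weighted_least_square_diag(A, b, w) = (Diagonal(w) * A) \ (Diagonal(w) * b)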
## compute F(u,v,w) - [p, p'] = [u*v, u*w] - [p, p']
Fmp(p,q,u,v,w) = [u[end]; _polymul(u,v); _polymul(u,w)] .- [1; p; q]
function Fmp!(b, p,q,u,v,w)
b[1] = u[end] - 1
offset = 1
for (i,v) in enumerate(_polymul(u,v) .- p)
b[offset + i] = v
end
offset = 1 + length(p)
for (i,v1) in enumerate(_polymul(u,w) .- q)
b[offset + i] = v1
end
end
residual_error(p,q,u,v,w, wts=ones(1 + length(p) + length(q))) = norm( ([_polymul(u,v); _polymul(u,w)] .- [p; q]) .* wts[2:end], 2)
### refine u,v,w estimate using
## Newton's method and weighted least squares
# updates in place u,v,w
function agcd_update!(p, q, u,v,w,m,n, A, b, inc, up, vp, wp, wts, err0)
JF!(A, u,v,w)
Fmp!(b, p,q,u,v,w)
weighted_least_square!(inc, A, b, wts)
Δu = inc[1:(1+m)]
Δv = inc[(m+2):(m+n+2)]
Δw = inc[(m+n+3):end]
up .= u; vp .= v; wp .= w
up .-= Δu; _monic!(up)
vp .-= Δv; _monic!(vp)
wp .-= Δw; _monic!(wp)
err = residual_error(p,q,up,vp,wp, wts)
# update if we improve
if err < err0
u .= up; v .= vp; w .= wp
else
err = err0
end
# return error estimate
err
end
# used below
function _mult(Gs, A)
for G in Gs
A = G * A
end
A
end
function qr_sylvester!(Gs, A0, k)
A = copy(A0)
m,n = size(A)
for i in m:-1:2
Gi, r = givens(A[1,1],A[i,1], 1, i)
A = Gi * A
push!(Gs, Gi) # [G1, G2, ..., Gn]
end
for i in m:-1:3
Gi, r = givens(A[2,2],A[i,2], 2, i)
A = Gi * A
push!(Gs, Gi)
end
return A[1:2,1:2]
end
function qr_sylvester!(Gs, A0, k, R)
T = eltype(A0)
N = (k-1)
Zs = zeros(T, N)
B = vcat(hcat(Zs, Zs), A0)
B = _mult(Gs, B)
m,n = size(B)
for i in m:-1:(2k)
Gi, r = givens(B[2(k-1)+1, 1], B[i, 1], 2(k-1)+1, i)
B = Gi * B
push!(Gs, Gi)
end
for i in m:-1:(2k+1)
Gi, r = givens(B[2k, 2], B[i, 2], 2k, i)
B = Gi * B
push!(Gs, Gi)
end
return [vcat(R, zeros(T, 2, 2*(k-1))) B[1:(2k),:]]
end
# return sigma, x
# converge on smallest eigenvalue of (A'*A) using power rule
# A=QR; (A'*A)^-1) = (R'*Q'*Q*R)^(-1) = (R'*R)^(-1)
# instead of computing x_{i+1} = (R'*R)^{-1} xi; we solve R'y=xi; Rz=y;x=z/|z|
function smallest_eigval(R::LinearAlgebra.UpperTriangular{T},
thresh=sqrt(eps(real(T)))) where {T}
k = size(R)[1]/2
if iszero(det(R))
return (:iszero, zero(T), T[])
end
m,n = size(R)
x = ones(T, n)
y = zeros(T, m)
z = zeros(T, n)
σ, σ1 = Inf*one(real(T)), Inf*one(real(T))
flag = :ispositive
for cnt in 1:10
y .= R' \ x
z .= R \ y
nz = 1/norm(z,2)
x .= z .* nz
sigma = norm(R * x, 2)
σ1 = abs(sigma)
if σ1 < 0.99 * σ
σ = σ1
continue
end
if σ1 < thresh
flag = :ispossible
break
end
if abs(σ - σ1) < 1.1 * σ1
flag = :ispossible
break
end
σ = σ1
end
return (flag, σ1, x)
end
## return k, sigma for possible k values
function sylvester_matrix_singular_values(p::Polynomial, q::Polynomial=polyder(p);kwargs...)
sylvester_matrix_singular_values(Polynomials.coeffs(p), Polynomials.coeffs(q); kwargs...)
end
function sylvester_matrix_singular_values(ps::Vector{T},
qs::Vector{S}=_polyder(ps),
θ=sqrt(eps(real(T)))) where {T,S}
_monic!(ps); _monic!(qs)
n, m = length(ps), length(qs)
wts = _weights(ps,qs)
R = promote_type(T,S)
A0::Matrix{R} = R[ps vcat(zeros(T, n-m),qs)]
Gs = LinearAlgebra.Givens{T}[]
k = 1
sigmas = R[]
R = qr_sylvester!(Gs, A0, k)
local x::Vector{T}
while k <= m-1
V = UpperTriangular(R)
flag, sigma, x = smallest_eigval(V, θ * norm(ps,2))
push!(sigmas, sigma)
k += 1
R = qr_sylvester!(Gs, A0, k, R)
end
return [1:(k-1) sigmas]
end
# return R for sylvester matrix of size k
function sylvester_matrix_k(ps::Vector{T},
K) where {T}
qs = _polyder(ps)
_monic!(ps); _monic!(qs)
n, m = length(ps), length(qs)
wts = _weights(ps,qs)
A0::Matrix{T} = [ps vcat(zeros(T, n-m),qs)]
Gs = LinearAlgebra.Givens{T}[]
R = qr_sylvester!(Gs, A0, 1)
k = 2
while k <= K
R = qr_sylvester!(Gs, A0, k, R)
k += 1
end
V = UpperTriangular(R)
end
"""
agcd(p, q, θ=sqrt(eps(real(T))), ρ=cbrt(eps(real(T))) * θ)
Find an approximate GCD for polynomials `p` and `q` using an algorithm of [Zeng](https://doi.org/10.1090/S0025-5718-04-01692-8).
Returns u,v,w, err where:
* `u*v ≈ monic(p)`
* `u*w ≈ monic(q)`
* The total residual error in these approximations is bounded by `err`.
Further,
* `v` and `w` should share no common roots (`u` is a gcd of `u*v` and `u*w`)
* the roots of `v` should exhaust the unique values of roots of `p`.
(If `p` and `q` are specified as vectors, the returned values will be
vectors. If specified as objects of `Polynomial` type then the
returned values will be as well.)
The tolerances are:
* θ: singular value threshold. Used to identify if smallest singular value of Sylvester matrix is ≈ 0. We use `sqrt(eps(T))`, close to that of Zeng's `1e-8`, but more flexible should `BigFloat` values be used.
* ρ: initial residual tolerance. If we can get (u,v,w) error less than this, we stop. Zeng uses `1e-10`, we use as a default the smaller `eps(T)^(5/6)`.
The algorithm looks for the first `k` for which the corresponding
Sylvester matrix is rank deficient. This follows Lemma 2.4 of the paper, which
finds the smallest singular value. In the process, an estimate for
`u`,`v`, and `w` is produced. If this singular value is approximately
0 (as determined with the parameter θ) *and* the estimates of the
decomposition can be refined via a Gauss-Newton process to a close
enough approximation (as determined with the parameter ρ) then the
desired values are found.
The act of identifying the value of `k` depends on determining if the
smallest singular value is 0 which in practice is quite hard. The
simple threshold of `norm(p,2) ⋅ θ` will fail for higher-degree
polynomials. For manually identifying this value, the complete set of
estimated singular values for different values of `k` are returned by
`sylvester_matrix_singular_values`. For a given `k` the refined values
of `u`, `v`, and `w` are returned by `rank_k_agcd`.
"""
function agcd(ps::Vector{T}, qs::Vector{S}=_polyder(ps);
θ=sqrt(eps(real(T))),
ρ=cbrt(eps(real(T))) * θ,
maxk=length(qs)) where {T,S}
_monic!(ps); _monic!(qs)
n, m = length(ps), length(qs)
wts = _weights(ps,qs)
U = promote_type(T,S)
A0::Matrix{U} = U[ps vcat(zeros(U, n-m),qs)]
nm = norm(ps, 2)
thresh = nm * θ ## this is sensitive
Gs = LinearAlgebra.Givens{T}[]
k = 1
R = qr_sylvester!(Gs, A0, k)
local x::Vector{U}
while k <= maxk
V = UpperTriangular(R)
flag, sigma, x = smallest_eigval(V, thresh)
if flag == :iszero
# exact zero means we need to identify initial guess for u,v,w
u = qs
w = ones(U, 1)
v = cauchy_matrix(u, length(ps) - m + 1) \ ps
_monic!(v)
return (u,v,w, zero(U))
end
if flag != :iszero && sigma < thresh
v = x[2:2:end]; _monic!(v)
w = x[(1+2(n-m)):2:end]; _monic!(w)
A = cauchy_matrix(v, n - length(v) + 1)
u = A \ ps
_monic!(u)
ρm, flag = reduce_residual!(u, v, w, ps, qs, wts, ρ)
if flag == :converged || flag == :not_updated
return (u, v, w, ρm)
end
end
# else bump up k
k += 1
if k > m
u = ones(U,1)
w = qs
v = ps
return (u, v, w, zero(U))
end
R = qr_sylvester!(Gs, A0, k, R)
end
# Okay, we gave it a good shot, but didn't find a perfect candidate
# let's move on...
v = x[2:2:end]; _monic!(v)
w = x[3:2:end]; _monic!(w)
A = cauchy_matrix(v, n - length(v) + 1)
u = A \ ps
_monic!(u)
wts = _weights(ps,qs)
ρm, flag = reduce_residual!(u,v,w, ps, qs, wts, ρ)
return (u, v, w, ρm)
end
function agcd(p::Polynomial, q::Polynomial=polyder(p); kwargs...)
u,v,w,rho = agcd(Polynomials.coeffs(p), Polynomials.coeffs(q); kwargs...)
(Polynomial(u), Polynomial(v), Polynomial(w), rho)
end
## --------------
## preconditioning code
## taken from https://who.rocq.inria.fr/Jan.Elias/pdf/je_masterthesis.pdf
## compute q(x) = p(phi*x)
function Tphi!(p, ps::Vector{T}, phi) where {T}
p .= ps .* (phi^(i-1) for i in eachindex(ps))
end
function Tphi(ps::Vector{T}, phi) where {T}
p = copy(ps)
p .= ps .* (phi^(i-1) for i in eachindex(ps))
p
end
## c
## compute q(x) = p(x-alpha)
function T_alpha(ps::Vector{T}, alpha) where {T}
end
function geometric_mean(a::Vector{T})::T where {T}
b = filter(!iszero, a)
n = length(b)
prod(abs(u)^(one(T)/n) for u in b)
end
# compute max(max(ps), max(qs)) / min(min(ps), min(qs))
function ratio(p::Vector{T}, q::Vector{T}) where {T}
m,M = Inf*one(real(T)), zero(real(T))
for ps in (p, q)
for a in ps
aa = abs(a)
if 0 < aa < m
m = aa
end
if aa > M
M = aa
end
end
end
iszero(m) ? Inf*one(T) : M/m
end
## find alpha, gamma that minimize ratio of max coefficients to min coefficients, for getting zeros
## 1.12 writes this as a linear programming problem, we just ...
function precondition(p::Vector{T}, q::Vector) where {T}
S = real(T)
m = ratio(p,q)
alphas = ((2*one(S))^i for i in -5:5)
phis = ((2*one(S))^i for i in -5:5)
out = (one(S), one(S))
for α in alphas
for ϕ in phis
r = ratio(Tphi(p, ϕ), α * Tphi(q, ϕ))
if r < m
out = (α, ϕ)
end
end
end
α, ϕ = out
p = Tphi(p, ϕ)
q = α * Tphi(q, ϕ)
p = p * (1/geometric_mean(p))
q = q * (1/geometric_mean(q))
p, q, ϕ, α
end
end
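A minimal sketch of the intended use, reducing a polynomial to its square-free part (a hypothetical session; assumes the module above has been evaluated and that `polyder` resolves to the derivative, e.g. `Polynomials.derivative` in newer versions):

using Polynomials
p = fromroots([1.0, 1.0, 2.0])  # (x-1)^2 * (x-2), a double root at 1
u, v, w, err = AGCD.agcd(p)     # u ≈ gcd(p, p'), so v ≈ square-free part of p
roots(v)                        # ≈ [1.0, 2.0], the distinct roots of p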
For gcd2 the name could be gcdvw. In the meantime I implemented the iterative refinement too, but without exploiting structure. This is now a separate function called gcdupd, and it performs up to a specified maximum number of iterations (as long as the residual decreases). It would be possible to include it in gcd2.
As for gcd1, this function occasionally produces poor-accuracy results. This is because the long-division algorithm for polynomials is not numerically stable. I don't understand the effect of scaling. Other scaling schemes could also be used.
Will you incorporate the work into gcd2? If so, I can wait on the PR I was working on.
I propose you go ahead with the PR. I still have a small change to include in gcd2 as proposed in the cited paper, namely, to include both v and w in the computation of the common divisor. I will send you the modified single line tomorrow.
The old line in gcd2 is commented out:
#d = convmtx(w,nd+1)\b1
d = [convmtx(v,nd+1) ; convmtx(w,nd+1)] \ [a1; b1]
An issue on which I am still reflecting is related to scaling of the original polynomials. The code works if we replace `a` by `a/sa` and `b` by `b/sb`, where `sa = norm(a)` and `sb = norm(b)`. Such a scaling is meaningful to compensate for big differences in the range of coefficients of `a` and `b`. The resulting `v` and `w` are then scaled factors of `a` and `b`, respectively, with the nice property `norm([v; w]) = 1`. I wonder if it would be wise to include scaling and to provide `v` and `w` as now, instead of `v*sa` and `w*sb`, respectively.
I implemented a version with scaling and optional iterative refinement (still no explicit structure exploitation). Here is the whole collection:
using LinearAlgebra # for norm, dot, rank, and LAPACK below

function poldeg1(v::Vector{T}) where {T <: Number}
# Degree of a polynomial plus one.
n1 = findlast(!iszero,v) # degree(v) + 1
n1 === nothing && (n1 = 0)
return n1
end
poldeg(v::Vector{T}) where {T <: Number} = poldeg1(v)-1 # Degree of a polynomial
function convmtx(v::Vector{T}, n::Int) where {T <: Number}
# Convolution matrix.
# C = convmtx(v,n) returns the convolution matrix C for a vector v.
# If q is a column vector of length n, then C*q is the same as conv(v,q).
# Form C as the Toeplitz matrix
# C = Toeplitz([v; zeros(n-1)], [v[1]; zeros(n-1)]); the Toeplitz construction is inlined below
nv = length(v)-1
C = zeros(T, n+nv, n)
@inbounds for j = 1:n
C[j:j+nv,j] = v
end
return C
end
function polgcdvw(a::Vector{T}, b::Vector{S}; atol::Real = 0, rtol::Real = Base.rtoldefault(float(real(T))), maxnit::Int = 0) where {T <: Number, S <: Number }
# Greatest common divisor of two polynomials.
# polgcdvw(a, b; atol = 0, rtol, maxnit = 0) -> (d, v, w, δ)
# Compute the greatest common divisor `d` of two polynomials `a` and `b`, and
# the polynomials v and w such that a = d * v * ka and b = d * w * kb, where ka and kb
# are scalar scaling factors (ka and kb are not explicitly provided).
# The accuracy estimate δ is computed as δ = norm([a/ka-d*v; b/kb-d*w]).
# A SVD-based method adapted from [1] is employed and an iterative accuracy refinement
# of δ, with a maximum number of maxnit iterations, is additionally performed.
# [1] Z. Zeng, The numerical greatest common divisor of univariate polynomials,
# in: L. Gurvits, P. Pébay, J.M. Rojas, D. Thompson (Eds.),
# Randomization, Relaxation, and Complexity in Polynomial Equation Solving,
# Contemporary Mathematics, vol. 556, AMS, 2011, pp. 187–217.
na1 = poldeg1(a) # degree(a) + 1
nb1 = poldeg1(b) # degree(b) + 1
R = eltype(one(T)/one(S))
a1, b1, = promote(a[1:na1], b[1:nb1], R[])
na1 <= 0 && (return ones(R,1), zeros(R,1), b1, zero(R))
nb1 <= 0 && (return ones(R,1), a1, zeros(R,1), zero(R))
a1 = a1/norm(a1)
b1 = b1/norm(b1)
switch = (na1 < nb1)
switch && ((a1, b1, na1, nb1) = (b1, a1, nb1, na1))
tol = atol == 0 ? rtol : atol
# determine the degree of GCD as the nullity of the Sylvester matrix
nd = na1 + nb1 - 2 - rank([convmtx(a1,nb1-1) convmtx(b1,na1-1)], atol = tol)
nd == 0 && (switch ? (return [one(R)], b1, a1, zero(R)) : (return [one(R)], a1, b1, zero(R)))
# determine [w; -v] from an orthogonal/unitary nullspace basis of dimension 1
# of a reduced Sylvester matrix using the last row of Vt from its (partial) SVD
wv = LAPACK.gesvd!('N', 'A', [convmtx(a1,nb1-nd) convmtx(b1,na1-nd)])[3][end,:]
eltype(wv) <: Complex && (wv = conj(wv))
v = -wv[nb1-nd+1:end]
w = wv[1:nb1-nd]
# determine the GCD d with high accuracy as the solution of a well-conditioned
# linear least-squares problem
#d = convmtx(w,nd+1)\b1
d = [convmtx(v,nd+1) ; convmtx(w,nd+1)] \ [a1; b1]
if maxnit > 0
d, v, w, δ = gcdvwupd(a1, b1, d, v, w, maxnit = maxnit)
else
δ = norm( [a1; b1] - [ conv(v,d); conv(w,d) ])
end
switch ? (return d, w, v, δ) : (return d, v, w, δ)
end
function gcdvwupd(a::Vector{T}, b::Vector{T}, d::Vector{T}, v::Vector{T}, w::Vector{T}; maxnit::Int = 10) where {T <: Number}
# Iterative refinement of the accuracy of the greatest common divisor of two polynomials.
# gcdvwupd(a, b, d, v, w; maxnit = 0) -> (dupd, vupd, wupd, δ)
# Given the polynomials a and b, and a triple of polynomials (d, v, w) such that
# `d` is an approximation of a greatest common divisor of `a` and `b`, and
# v and w are approximate quotients of a/d and b/d, respectively, compute the updated triple
# (dupd, vupd, wupd) which minimizes the accuracy estimate δ = norm([a-dupd*vupd; b-dupd*wupd]).
# A maximum of maxnit iterations is performed using the Gauss-Newton method,
# in the form proposed in [1].
#
# [1] Z. Zeng, The numerical greatest common divisor of univariate polynomials,
# in: L. Gurvits, P. Pébay, J.M. Rojas, D. Thompson (Eds.),
# Randomization, Relaxation, and Complexity in Polynomial Equation Solving,
# Contemporary Mathematics, vol. 556, AMS, 2011, pp. 187–217.
na1 = length(a)
nb1 = length(b)
nd1 = length(d)
nv1 = length(v)
nw1 = length(w)
na1 == nd1+nv1-1 || error("Incompatible dimensions between a, d and v")
nb1 == nd1+nw1-1 || error("Incompatible dimensions between b, d and w")
h = d/norm(d)^2
f = [one(T);a;b]
fh = [dot(h,d); conv(v,d); conv(w,d)]
Δf = f-fh;
δ = norm(Δf)
d0 = copy(d); v0 = copy(v); w0 = copy(w);
nvw1 = nv1+nw1
ndvw1 = nd1+nvw1
nab1 = na1+nb1
Jh = [h' zeros(1,nvw1); zeros(nab1,ndvw1)]
j1 = 1:nd1
j2 = nd1+1:nd1+nv1
j3 = nd1+nv1+1:ndvw1
i2 = 2:na1+1
i3 = na1+2:nab1+1
Jh21 = view(Jh,i2,j1)
Jh31 = view(Jh,i3,j1)
Jh22 = view(Jh,i2,j2)
Jh33 = view(Jh,i3,j3)
for it = 1:maxnit
# Jh = [h' zeros(1,na1-nd1+1) zeros(1,nb1-nd1+1)
# convmtx(v0,nd1) convmtx(u0,na1-nd1+1) zeros(nd1+nv1-1,nb1-nd1+1)
# convmtx(w0,nd1) zeros(nd1+nw1-1,na1-nd1+1) convmtx(u0,nb1-nd1+1)];
Jh21[:,:] = convmtx(v0,nd1)
Jh31[:,:] = convmtx(w0,nd1)
Jh22[:,:] = convmtx(d0,nv1)
Jh33[:,:] = convmtx(d0,nw1)
Δz = Jh\Δf;
d1 = d0 + Δz[j1];
# v1 = v0 + Δz[nd1+1:nd1+nv1];
# w1 = w0 + Δz[end-nw1+1:end];
v1 = v0 + Δz[j2];
w1 = w0 + Δz[j3];
fh1 = [dot(h,d1); conv(v1,d1); conv(w1,d1)];
Δf1 = f-fh1;
δ1 = norm(Δf1)
if δ1 < δ
d0 = copy(d1); v0 = copy(v1); w0 = copy(w1); δ = copy(δ1); Δf = copy(Δf1);
else
break
end
end
return d0, v0, w0, δ
end
function pollcm(a::Vector{T}, b::Vector{S}; fastgcd::Bool = true, atol::Real = 0, rtol::Real = Base.rtoldefault(float(real(T))), maxnit::Int = 10) where {T <: Number, S <: Number }
# Least common multiple of two polynomials.
# m = pollcm(a,b; atol = 0, rtol, maxnit = 10)
# Compute the least common multiple `m` of two polynomials `a` and `b`. Employ iterative refinement of
# the accuracy of GCD(a,b) if maxnit > 0.
na1 = poldeg1(a) # degree(a) + 1
nb1 = poldeg1(b) # degree(b) + 1
na1 == 0 && (return zeros(T,1))
nb1 == 0 && (return zeros(S,1))
na1 == 1 && (return b)
nb1 == 1 && (return a)
return conv(a, polgcdvw(a, b, atol = atol, rtol = rtol, maxnit = maxnit)[3])
end
function conv(a::Vector{T}, b::Vector{S}) where {T <: Number, S <: Number }
# Convolution of two vectors (or product of two polynomials).
# c = conv(a,b) returns the convolution of vectors a and b.
na1 = length(a)
nb1 = length(b)
R = promote_type(T, S)
c = zeros(R, na1 + nb1 - 1)
for i in 1:na1, j in 1:nb1
@inbounds c[i + j - 1] += a[i] * b[j]
end
return c
end
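A sketch of how the pieces of the collection fit together (a hypothetical session; `using LinearAlgebra` is assumed for `norm`, `dot`, `rank`, and `LAPACK`):

a = [2.0, 3, 1]                           # (x+1)(x+2)
b = [6.0, 5, 1]                           # (x+2)(x+3)
d, v, w, δ = polgcdvw(a, b, maxnit = 10)  # d ∝ x + 2; δ is the residual norm
m = pollcm(a, b)                          # ∝ (x+1)(x+2)(x+3) = [6, 11, 6, 1], up to scale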
I incorporated the above into the following, which tracks the paper's algorithm. It uses some heuristics for the tolerances that work out in test cases up to degree ~20 (e.g., the gcd of p = (x-1)^n (x-2)^n (x-3) and q = (x-1)(x-2)(x-4) is misidentified when n = 12). Can you see if this works for your cases, or offer improvements? Thanks.
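The failing case mentioned above can be set up concretely like this (a hypothetical check against the NGCD module below, building coefficient vectors with the `conv` helper from earlier in the thread):

p = [-3.0, 1]  # (x-3)
for _ in 1:12
    global p = conv(conv(p, [-1.0, 1]), [-2.0, 1])  # multiply by (x-1)(x-2)
end
q = conv(conv([-1.0, 1], [-2.0, 1]), [-4.0, 1])  # (x-1)(x-2)(x-4)
u, v, w, ρ, κ = NGCD.ngcd(p, q)  # exact GCD is (x-1)(x-2); misidentified at n = 12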
module NGCD
using LinearAlgebra
"""
ngcd(ps::Vector{T}, qs::Vector{T}, ϵ=eps(T))
Return u,v,w, ρ, κ where `u⋅v ≈ ps` and `u⋅w ≈ qs`; ρ is the residual error (`|| [u⋅v,u⋅w] - [ps,qs] ||`); and `κ` a numerical gcd condition number estimate.
The numerical GCD problem as defined in [1]:
(5.4) For (p,q), polynomials of degree m, n, and ϵ > 0, a numerical GCD of (p,q) within ϵ is an exact GCD of (p̃,q̃) (polynomials of degree m, n with gcd degree j), where j is the largest value with θⱼ(p,q) < ϵ (θⱼ is the smallest distance from (p,q) to a pair with gcd degree j). These form an equivalence class, gcdₑ.
For any (p̂,q̂) with exact gcd of degree j, there is ϵ > 0 such that gcdₑ(p,q) satisfies
limsup_{(p,q)→(p̂,q̂)} inf{ || (u,v,w) - (û,v̂,ŵ) ||} / || (p,q) - (p̂,q̂) || < κₑ(p,q).
κ is called the numerical GCD condition number.
If an exact gcd exists for (p,q) then this will be in gcdₑ.
The Zeng algorithm proposes a degree for `u` and *if* a triple `(u,v,w)` can be found satisfying the inequality this is returned; otherwise the proposed degree is reduced and the process repeats. If not terminated, at degree 0 a constant gcd is returned.
The initial proposed degree is the first `j` with `j=min(m,n):-1:1` where Sⱼ is believed to have a singular value of 0 (Sⱼ being related to the Sylvester matrix of `ps` and `qs`). Verification of the degree is done using a Gauss-Newton iteration scheme holding the degree of `u` constant.
Reference:
[1] The Numerical Greatest Common Divisor of Univariate Polynomials
by Zhonggang Zeng;
[url](http://homepages.neiu.edu/~zzeng/uvgcd.pdf);
[doi](https://doi.org/10.1090/conm/556/11014)
Note: Based on work by Andreas Varga
"""
function ngcd(ps::Vector{T},
qs::Vector{T},
ϵ = eps(real(T)) # backwards error
) where {T <: AbstractFloat}
m, n = length(ps)-1, length(qs)-1 # degrees
if m < n
out = ngcd(qs, ps, ϵ)
return (u=out.u, v=out.w, w=out.v, ρ=out.ρ, κ=out.κ)
end
# storage
A0 = zeros(T, m+1, 2)
A0[:,1] = ps
A0[end-n:end,2] = qs
# pre-allocate storage for Sylvester matrices, S₁, S₂...
Q = zeros(T, m + n, m + n)
R = zeros(T, m + n, m + n)
Sₓ = hcat(convmtx(ps,1), convmtx(qs, m-n+1))
local x::Vector{T}
## This is a heuristic
ϵ *= (n+m)
ρ′ = 2(n * m) * (1 + norm([ps,qs])) * ϵ
j = n # We count down Sn, S_{n-1}, ..., S₂, S₁
F = qr(Sₓ)
nr, nc = size(Sₓ) # m+1, m-n+2
Q[1:nr, 1:nr] = F.Q
R[1:nc, 1:nc] = F.R
while true
ϵ′ = ϵ * sqrt(1 + m - j)
V = view(R, 1:nc, 1:nc)
flag, σ, x = smallest_singular_value(V, ϵ′)
if (flag == :iszero || abs(σ) <= ϵ′)
u, v, w = initial_uvw(Val(flag), j, ps, qs, x)
flag == :iszero && return (u=u, v=v, w=w, ρ=zero(T), κ=NaN)
# p₁ ≈ p₂ uses max(norm(p₁), norm(p₂)) to compare
# We are comparing [u⋅v; u⋅w] ≈ [p; q]
# so we treat ρ′ as a *relative* tolerance (unlike paper)
# and scale (above) by 2(n+m) * || [p;q] || (number of +, * in products; norm of [p;q])
ρ₁, κ = refine_agcd!(u,v,w, ps, qs, ρ′)
if ρ₁ <= ρ′
return (u=u, v=v, w=w, ρ=ρ₁, κ=κ) # (u,v,w) verified
end
end
# reduce possible degree of u and try again with Sⱼ₋₁
j -= 1
nr += 1
nc += 2
nc > nr && break
extend_QR!(Q,R, nr, nc, A0) # before Q⋅R = Sⱼ, now Q⋅R = Sⱼ₋₁
end
# u is a constant
u, v, w = initial_uvw(Val(:constant), j, ps, qs, x)
ρ₁, κ = refine_agcd!(u,v,w, ps, qs, ϵ*sqrt(m)) # refine_agcd! already returns κ, no further inversion
return (u=u, v=v, w=w, ρ=ρ₁, κ=κ)
end
ngcd(ps::Vector{T}, qs::Vector{S}, ϵ=eps()) where {T, S} = ngcd(promote(float.(ps), float(qs))..., ϵ)
# return guess at smallest singular value and right singular vector, x
# for a right triangular matrix, V
function smallest_singular_value(V::AbstractArray{T,2},
θ = sqrt(eps(real(T)))) where {T}
R = UpperTriangular(V)
if iszero(det(R)) #
return (:iszero, zero(T), T[])
end
m,n = size(R)
x = ones(T, n)
y = zeros(T, m)
σ₀ = σ₁ = Inf*one(real(T))
steps, minᵢ = 1, 5
while true
y .= R' \ x # use iteration to converge on singular value
x .= R \ y
x ./= norm(x,2)
σ₁ = norm(R * x, 2)
if (σ₁ > θ)
if (steps <= 50) && (steps <= minᵢ || σ₁ <= 0.95*σ₀) # decreasing, keep going
σ₀ = σ₁
else
return (:constant, σ₁, x)
end
else
return (:ispossible, σ₁, x)
end
steps += 1
end
end
# extend QR to next size
# Q gets a 1 in nc,nc, 0s should be elsewhere
function extend_QR!(Q,R, nr, nc, A0)
#old Q is m x m, old R is n x n; we add to these
n = nc-2
m = nr - 1
k,l = size(A0)
# add two columns to R
# need to apply Q to top part of new columns
R[nr-k+1:nr, (nc-1):nc] = A0
R[1:nr-1, (nc-1):nc] = (view(Q, 1:nr-1, 1:nr-1))' * R[1:nr-1, (nc-1):nc]
# extend Q with row and column with identity
Q[nr,nr] = 1
# Make R upper triangular using Givens rotations
for j in nr-1:-1:nc-1
gj,_ = givens(R[j,nc-1], R[j+1,nc-1], j, j+1)
rmul!(Q, gj')
lmul!(gj, R)
end
for j in nr-1:-1:nc
gj,_ = givens(R[j,nc], R[j+1,nc], j, j+1)
rmul!(Q, gj')
lmul!(gj, R)
end
return nothing
end
## --------------------------------------------------
## Refine u,v,w
## Find u₀,v₀,w₀ from right singular vector
function initial_uvw(::Val{:ispossible}, j, ps, qs, x)
# Sk*[v;w] = 0, so pick out v,w after applying the permutation
m,n = length(ps)-1, length(qs)-1
vᵢ = vcat(2:m-n+2, m-n+4:2:length(x))
wᵢ = vcat(1, (m-n+3):2:length(x))
v = x[vᵢ]/x[end]
w = x[wᵢ]/x[end-1]
# p194 3.9 C_k(v) u = p or Ck(w) u = q; this uses 10.2
u = [convmtx(v,j+1); convmtx(w, j+1)] \ [ps; qs]
return u, v, w
end
function initial_uvw(::Val{:iszero}, j, ps::Vector{T}, qs, x) where {T}
n,m = length(ps), length(qs)
# exact zero means we need to identify initial guess for u,v,w
u = qs
w = ones(T, 1)
v = convmtx(u, n - m + 1) \ ps
return u,v,w
end
function initial_uvw(::Val{:constant}, j, ps::Vector{T}, qs, x) where {T}
u = ones(T,1)
w = qs
v = ps
u,v,w
end
# find estimate for σ₂, used in a condition number (κ = 1/σ₂)
function σ₂(J)
F = qr(J)
flag, σ, x = smallest_singular_value(F.R)
σ
end
## attempt to refine u,v,w
function refine_agcd!(u::Vector{T}, v, w, p, q, ρ=eps(T), minᵢ=10) where {T}
m, n, l = _degree.((u, v, w))
ρ₀, ρ₁ = one(T), residual_error(p, q, u, v, w)
ρ₁ <= ρ && return ρ₁, NaN
# storage
A = zeros(T, JF_size(u, v, w)...)
b = zeros(T, 1 + length(p) + length(q))
Δf = zeros(T, m + n + l + 3)
steps = 0
u′, v′, w′ = copy(u), copy(v), copy(w)
h, β = u, norm(u)^2
κ = NaN
while true
# update A, b, then solve A\b
JF!(A, h, u, v, w)
Fmp!(b, h, β, p, q, u, v, w)
Δf .= A \ b
Δu, Δv, Δw = Δf[1:(m+1)], Δf[(m+2):(m+n+2)], Δf[(m+n+3):end]
ρ₀, ρ₁ = ρ₁, residual_error(p, q, u-Δu, v-Δv, w-Δw)
if (steps <= 50) && (ρ₁ >= ρ) && (steps <= minᵢ || ρ₁ < 0.05 * ρ₀)
u .-= Δu; v .-= Δv; w .-= Δw
steps += 1
else
F = qr(A)
flag, σ₂, _ = smallest_singular_value(F.R)
κ = 1/σ₂
return ρ₁, κ
end
end
end
## Jacobian J(u,v,w) of F(u,v,w) = [h'u; u*v; u*w] (target values [β; p; q])
function JF_size(u, v, w)
m, k, j = _degree(u), _degree(v), _degree(w)
n, l = m + k, m + j
ai, aj = convmtx_size(v, m + 1)
bi, bj = convmtx_size(u, k + 1)
di, dj = convmtx_size(w, m + 1)
fi, fj = convmtx_size(u, j + 1)
ci, cj = ai, fj
ei, ej = di, bj
(1 + ai + di, aj + bj + cj)
end
# Jacobian matrix
function JF(h, u::Vector{U}, v::Vector{V}, w::Vector{W}) where {U,V,W}
R = promote_type(U, V, W)
M = zeros(R, JF_size(u, v, w)...)
JF!(M, h, u, v, w)
M
end
function JF!(M, h, u::Vector{T}, v, w) where {T}
m, k, j = _degree(u), _degree(v), _degree(w)
n, l = m + k, m + j
# JF_size should return these
ai, aj = convmtx_size(v, m + 1)
bi, bj = convmtx_size(u, k + 1)
di, dj = convmtx_size(w, m + 1)
fi, fj = convmtx_size(u, j + 1)
ci, cj = ai, fj
ei, ej = di, bj
M[1,1:m+1] = h'
convmtx!(view(M, 1 .+ (1:ai), 1:aj), v, m + 1)
convmtx!(view(M, 1 .+ (1:bi), aj .+ (1:bj)), u, k + 1)
convmtx!(view(M, (1 + ai) .+ (1:di), 1:dj), w, m + 1)
convmtx!(view(M, (1 + ci) .+ (1:fi), dj + ej .+ (1:fj)), u, j + 1)
end
## compute F(u,v,w) - [β; p; q] = [h'u - β; u*v - p; u*w - q]
function Fmp!(b, h, β, p, q, u, v, w)
b[1] = h'*u - β
b[2:2+length(p)-1] = u ⊗ v .- p
b[2+length(p):end] = u ⊗ w .- q
end
function residual_error(p, q, u, v, w)
norm( [u ⊗ v; u ⊗ w] .- [p; q], 2)
end
## --------------------------------------------------
## utils
_degree(p::Vector) = length(p) - 1
"""
conv(p,q)
Convolution of two vectors (or product of two polynomials).
Also `⊗`
"""
function conv(p::Vector{T}, q::Vector{T}) where {T}
R = promote_type(T,eltype(q))
n = length(p)-1
m = length(q)-1
a = zeros(R, m+n+1)
@inbounds for i = 1:n+1
for j = 1:m+1
a[i+j-1] += p[i] * q[j]
end
end
a
end
⊗(p::Vector{T}, q::Vector{T}) where {T} = conv(p,q)
"""
convmtx(v, n::Int)
convmtx!(M,v,n)
convmtx_size(v,n)
Convolution matrix.
C = convmtx(v,n) returns the convolution matrix C for a vector v.
If q is a column vector of length n, then C*q is the same as conv(v,q).
"""
function convmtx!(C, v::AbstractVector{T}, n::Int) where T
# Form C as the Toeplitz matrix
# C = Toeplitz([v; zeros(n-1)], [v[1]; zeros(n-1)]); put Toeplitz code inline
nv = length(v)-1
@inbounds for j = 1:n
C[j:j+nv,j] = v
end
nothing
end
convmtx_size(v::AbstractVector, n) = (n + length(v) - 1, n)
function convmtx(v::AbstractVector{T}, n::Int) where {T}
C = zeros(T, convmtx_size(v, n)...)
convmtx!(C, v, n)
C
end
end
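A minimal usage sketch of the module above (the example polynomials and expected output are my illustration, not from the thread; Polynomials.jl is assumed only for building the coefficient vectors):
using Polynomials
x = Polynomial([0.0, 1.0])
p = coeffs((x-1)^2*(x-2)) # gcd with q should be x - 1
q = coeffs((x-1)*(x-3))
out = NGCD.ngcd(p, q)
Polynomial(out.u) # expect a constant multiple of x - 1
out.ρ, out.κ # residual error and condition number estimate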
This is a difficult example, for which an automatic choice of tolerances is very challenging. For a tolerance atol = 1.e-16, the correct result can be obtained with my implementation for the case n = 12.
x = Polynomial([0,1]);
n = 12;
p = coeffs((x-1)^n*(x-2)^n*(x-3));
q = coeffs((x-1)*(x-2)*(x-4));
d = coeffs((x-1)*(x-2)); # exact GCD
d2, = polgcdvw(p,q,atol=1.e-16,maxnit = 10);
roots(Polynomial(d2))
@test norm(d*d2[end] - d2*d[end]) / norm(d) < 1.e-7
julia> roots(Polynomial(d2))
2-element Array{Float64,1}:
1.0
1.9999999999999991
julia> @test norm(d*d2[end] - d2*d[end]) / norm(d) < 1.e-7
Test Passed
The critical computation is the determination of the nullity of the Sylvester matrix. If this computation fails, then I guess all methods will have the same difficulty in determining a correct GCD. For example, for n = 20 the absolute tolerance for rank determination must lie between 4.81e-17 and 2.01e-17, which in my opinion is difficult to choose automatically.
Okay, I think this catches the bugs in the last one and cleans up the interface. You are right that some problems are just hard here due to the tolerances: not only identifying the rank, but also deciding whether the proposed triple is close enough. I separated out the two uses, should someone want to adjust them, and tried to identify reasonable relative tolerances. This version does okay on the test cases in the Zeng paper, except for Test 2, which I'm not sure is set up properly. For these problems the scaling of the polynomials ends up being really important. At one point I had some preconditioning code, but for now scaling is up to the caller.
I think I'll make a PR with this, pending your comments.
module NGCD
using LinearAlgebra
"""
ngcd(ps::Vector{T}, qs::Vector{T}, [k::Int]; atol=eps(T), rtol=eps(T), satol=atol, srtol=rtol)
Return `u`, `v`, `w`, `Θ`, `κ` where `u⋅v ≈ ps` and `u⋅w ≈ qs`; `Θ` (`\\Theta[tab]`) is the residual error (`|| [u⋅v,u⋅w] - [ps,qs] ||`); and `κ` (`\\kappa[tab]`) is the numerical gcd condition number estimate.
The numerical GCD problem is defined in [1] (5.4). Let (p,q) be a
polynomial pair with degrees m and n. Let Ρmn be the set of all such pairs. Any
given pair of polynomials has an exact greatest common divisor, u, of
degree k, defined up to constant factors. Let Ρᵏmn be the manifold of
all such (p,q) pairs with exact gcd of degree k. A given pair (p,q) with exact gcd of degree j will
have some distance Θᵏ from Ρᵏmn. For a given threshold ϵ > 0, a numerical GCD
of (p,q) within ϵ is an exact GCD of a pair (p̂,q̂) in Ρᵏmn with
|| (p,q) - (p̂,q̂) || <= Θᵏ, where k is the largest value for which Θᵏ < ϵ.
(In the ϵ → 0 limit, this is the exact GCD.)
Suppose (p,q) is an ϵ perturbation of (p̂,q̂), where (p̂,q̂) has an exact gcd of degree k. Then degree(gcdₑ(p,q)) = k; as ϵ → 0, gcdₑ(p,q) → gcd(p̂, q̂), and
limsup_{(p,q)→(p̂,q̂)} inf{ || (u,v,w) - (û,v̂,ŵ) ||} / || (p,q) - (p̂,q̂) || < κₑ(p,q).
κ is called the numerical GCD condition number.
The Zeng algorithm proposes a degree for `u` and *if* a triple `(u,v,w)` with `u` of degree `k` and (u⋅v, u⋅w) in Ρᵏmn can be found satisfying || (u⋅v, u⋅w) - (p,q) || < ϵ then (u,v,w) is returned; otherwise the proposed degree is reduced and the process repeats. If not terminated, at degree 0 a constant gcd is returned.
The initial proposed degree is the first `j`, `j=min(m,n):-1:1`, where Sⱼ is believed to have a singular value of 0 (Sⱼ being related to the Sylvester matrix of `ps` and `qs`). The verification of the proposed degree is done using a Gauss-Newton iteration scheme holding the degree of `u` constant.
Tolerances:
There are two places where tolerances are utilized:
* in the identification of the rank of Sⱼ a value σ₁ = ||Rx|| is identified. To test if this is zero a tolerance of `max(satol, ||R||ₒₒ ⋅ srtol)` is used.
* to test if (u ⋅ v, u ⋅ w) ≈ (p,q) a tolerance of `max(atol, λ⋅rtol)` is used with `λ` an estimate for κ⋅||(p,q)||.
Specified degree:
When `k` is specified, a value for `(u,v,w)` is identified with `degree(u)=k`. No tolerances are utilized in computing Θᵏ.
!!! note
These tolerances are adjusted from those proposed in [1].
Output:
The function outputs a named tuple with names (`u`, `v`, `w`, `Θ`, `κ`). The components `u`, `v`, `w` estimate the gcd (`u`) and the corresponding quotients (`v` ≈ ps/u, `w` ≈ qs/u). The value `Θ` estimates Θᵏ and `κ` estimates the numerical condition number.
Example:
using Polynomials
x = variable(Polynomial{Float64})
p = (x+10)*(x^9 + x^8/3 + 1)
q = (x+10)*(x^9 + x^8/7 - 6/7)
gcd(p,q) # u a constant
ngcd(coeffs(p), coeffs(q)) # u a degree 1 polynomial
Reference:
[1] The Numerical Greatest Common Divisor of Univariate Polynomials
by Zhonggang Zeng;
[url](http://homepages.neiu.edu/~zzeng/uvgcd.pdf);
[doi](https://doi.org/10.1090/conm/556/11014)
Note: Based on work by Andreas Varga
"""
function ngcd(ps::Vector{T},
qs::Vector{T};
atol=(length(ps)+length(qs))*eps(real(T)),
rtol=eps(real(T)),
satol=atol,
srtol=rtol,
verbose=false
) where {T <: AbstractFloat}
m, n = _degree.((ps, qs))
if m < n
out = ngcd(qs, ps; atol=atol, rtol=rtol, satol=satol, srtol=srtol, verbose=verbose)
return (u=out.u, v=out.w, w=out.v, Θ=out.Θ, κ=out.κ)
end
# storage
A0 = zeros(T, m+1, 2)
A0[:,1] = ps
A0[end-n:end,2] = qs
# pre-allocate storage for Sylvester Matrices, S₁, S₂...
Q = zeros(T, m + n, m + n)
R = zeros(T, m + n, m + n)
Sₓ = hcat(convmtx(ps,1), convmtx(qs, m-n+1))
local x::Vector{T}
j = n # We count down Sn, S_{n-1}, ..., S₂, S₁
F = qr(Sₓ)
nr, nc = size(Sₓ) # m+1, m-n+2
Q[1:nr, 1:nr] = F.Q
R[1:nc, 1:nc] = F.R
while true
V = view(R, 1:nc, 1:nc)
flag, σ, x = smallest_singular_value(V, satol * sqrt(1 + m - j), srtol)
verbose && println("------ degree $j ----- σ₁: $σ --- $flag")
if (flag == :iszero || flag == :ispossible)
u, v, w = initial_uvw(Val(flag), j, ps, qs, x)
flag, ρ₁, κ = refine_agcd!(u,v,w, ps, qs, atol, rtol)
verbose && println(" --- Θᵏ: $ρ₁ --- $flag")
if flag == :convergence
return (u=u, v=v, w=w, Θ=ρ₁, κ=κ) # (u,v,w) verified
end
end
# reduce possible degree of u and try again with Sⱼ₋₁
j -= 1
nr += 1
nc += 2
nc > nr && break
extend_QR!(Q,R, nr, nc, A0) # before Q⋅R = Sⱼ, now Q⋅R = Sⱼ₋₁
end
# u is a constant
u, v, w = initial_uvw(Val(:constant), j, ps, qs, x)
flag, ρ₁, κ = refine_agcd!(u,v,w, ps, qs, atol, rtol)
return (u=u, v=v, w=w, Θ=ρ₁, κ=κ)
end
function ngcd(ps::Vector{T}, qs::Vector{S}; kwargs...) where {T, S}
ps′,qs′ = promote(float.(ps), float.(qs))
ngcd(ps′, qs′; kwargs...)
end
# fix the degree, k
function ngcd(ps::Vector{T},
qs::Vector{T},
k::Int
;
kwargs...
) where {T <: AbstractFloat}
m, n = _degree.((ps,qs))
if m < n
out = ngcd(qs, ps, k; kwargs...)
return (u=out.u, v=out.w, w=out.v, Θ=out.Θ, κ=out.κ)
end
u,v,w = initial_uvw(Val(:iszero), k, ps, qs, nothing)
flag, ρ₁, κ = refine_agcd!(u,v,w, ps, qs, Inf, Inf)
return (u=u, v=v, w=w, Θ=ρ₁, κ=κ)
end
function ngcd(ps::Vector{T}, qs::Vector{S}, k::Int; kwargs...) where {T, S}
ps′,qs′ = promote(float.(ps), float(qs))
ngcd(ps′, qs′, k; kwargs...)
end
## -----
# return guess at smallest singular value and right singular vector, x,
# for an upper triangular matrix, V
function smallest_singular_value(V::AbstractArray{T,2},
atol=eps(T),
rtol=zero(T)) where {T}
R = UpperTriangular(V)
if iszero(det(R)) #
return (:iszero, zero(T), T[])
end
m,n = size(R)
# we are testing if ||Rx|| ≈ 0
# If the true smallest singular value is 0 but the computed vector x′ differs from the true x due to round off,
# then ||R(x-x′)|| <= ||R||⋅||x - x′||, so we use ||R|| as the scale factor
δ = max(atol, norm(R,Inf) * rtol)
x = ones(T, n)
y = zeros(T, m)
σ₀ = σ₁ = Inf*one(real(T))
steps, minᵢ = 1, 5
while true
y .= R' \ x # use iteration to converge on singular value
x .= R \ y
x ./= norm(x,2)
σ₁ = norm(R * x, 2)
if (steps <= 50) && (steps <= minᵢ || σ₁ <= 0.05*σ₀) # decreasing, keep going
σ₀ = σ₁
else
break
end
steps += 1
end
if σ₁ < δ
return (:ispossible, σ₁, x)
else
return (:constant, σ₁, x)
end
end
# extend QR to next size
# Q gets a 1 in nc,nc, 0s should be elsewhere
function extend_QR!(Q,R, nr, nc, A0)
#old Q is m x m, old R is n x n; we add to these
n = nc-2
m = nr - 1
k,l = size(A0)
# add two columns to R
# need to apply Q to top part of new columns
R[nr-k+1:nr, (nc-1):nc] = A0
R[1:nr-1, (nc-1):nc] = (view(Q, 1:nr-1, 1:nr-1))' * R[1:nr-1, (nc-1):nc]
# extend Q with row and column with identity
Q[nr,nr] = 1
# Make R upper triangular using Givens rotations
for j in nr-1:-1:nc-1
gj,_ = givens(R[j,nc-1], R[j+1,nc-1], j, j+1)
rmul!(Q, gj')
lmul!(gj, R)
end
for j in nr-1:-1:nc
gj,_ = givens(R[j,nc], R[j+1,nc], j, j+1)
rmul!(Q, gj')
lmul!(gj, R)
end
return nothing
end
## --------------------------------------------------
## Refine u,v,w
## Find u₀,v₀,w₀ from right singular vector
function initial_uvw(::Val{:ispossible}, j, ps, qs, x)
# Sk*[w;-v] = 0, so pick out v,w after applying the permutation
m,n = _degree.((ps, qs))
vᵢ = vcat(2:m-n+2, m-n+4:2:length(x))
wᵢ = vcat(1, (m-n+3):2:length(x))
v = -x[vᵢ]
w = x[wᵢ]
# p194 3.9 C_k(v) u = p or Ck(w) u = q; this uses 10.2
u = [convmtx(v,j+1); convmtx(w, j+1)] \ [ps; qs]
return u, v, w
end
function initial_uvw(::Val{:iszero}, j, ps::Vector{T}, qs, x) where {T}
m,n = _degree.((ps,qs))
S = [convmtx(ps, n-j+1) convmtx(qs, m-j+1)]
F = qr(S)
R = UpperTriangular(F.R)
x = ones(T, size(R,2))
x .= R \ (R' \ (x/norm(x)))
x ./= norm(x)
w = x[1:n-j+1]
v = -x[(n-j+2):end]
u = [convmtx(v, j+1); convmtx(w, j+1)] \ [ps;qs]
(u,v,w)
end
function initial_uvw(::Val{:constant}, j, ps::Vector{T}, qs, x) where {T}
u = ones(T,1)
w = qs
v = ps
u,v,w
end
# find estimate for σ₂, used in a condition number (κ = 1/σ)
function σ₂(J)
F = qr(J)
flag, σ, x = smallest_singular_value(F.R)
σ
end
## attempt to refine u,v,w
## check that [u ⊗ v; u ⊗ w] ≈ [p; q]
function refine_agcd!(u::Vector{T}, v, w, p, q, atol, rtol) where {T}
m, n, l = _degree.((u, v, w))
uv = u ⊗ v; uw = u ⊗ w
ρ₀, ρ₁ = one(T), residual_error(p,q,uv,uw)
# storage
A = zeros(T, JF_size(u, v, w)...)
b = zeros(T, 1 + length(p) + length(q))
Δf = zeros(T, m + n + l + 3)
steps = 0
h, β = u, norm(u)^2
minᵢ, Maxᵢ = 5, 20
κ = NaN
JF!(A, h, u, v, w)
Fmp!(b, h'*u - β, p, q, uv, uw)
# sensitivity is Δu / Δp -> || J+ ||
# so we would use cond(A)/norm(A) * ||[p;q]|| as scale factor
# we estimate κ directly here
#λ = norm(A, Inf)
λ = norm((norm(p), norm(q))) / σ₂(A)
ρ = max(atol, rtol*λ)
while true
# update A, b, then solve A\b
Δf .= A \ b
Δu, Δv, Δw = Δf[1:(m+1)], Δf[(m+2):(m+n+2)], Δf[(m+n+3):end]
uv, uw = (u-Δu) ⊗ (v-Δv), (u-Δu) ⊗ (w-Δw)
ρ₀, ρ′ = ρ₁, residual_error(p, q, uv, uw)
# don't worry about the first few steps, but afterwards each step must be productive
if (steps <= Maxᵢ) && (steps <= minᵢ || ρ′ < 0.05 * ρ₀)
ρ₁ = ρ′
u .-= Δu; v .-= Δv; w .-= Δw
steps += 1
else
break
end
# update A,b for next iteration
JF!(A, h, u, v, w)
Fmp!(b, h'*u - β, p, q, uv, uw)
end
if ρ₁ <= ρ
κ = 1/σ₂(A)
return :convergence, ρ₁, κ
else
return :no_convergence, ρ₁, NaN
end
end
## Jacobian J(u,v,w) of F(u,v,w) = [h'u; u*v; u*w] (target values [β; p; q])
function JF_size(u, v, w)
m, k, j = _degree(u), _degree(v), _degree(w)
n, l = m + k, m + j
ai, aj = convmtx_size(v, m + 1)
bi, bj = convmtx_size(u, k + 1)
di, dj = convmtx_size(w, m + 1)
fi, fj = convmtx_size(u, j + 1)
ci, cj = ai, fj
ei, ej = di, bj
(1 + ai + di, aj + bj + cj)
end
# Jacobian matrix
function JF(h, u::Vector{U}, v::Vector{V}, w::Vector{W}) where {U,V,W}
R = promote_type(U, V, W)
M = zeros(R, JF_size(u, v, w)...)
JF!(M, h, u, v, w)
M
end
function JF!(M, h, u::Vector{T}, v, w) where {T}
m, k, j = _degree(u), _degree(v), _degree(w)
n, l = m + k, m + j
# JF_size should return these
ai, aj = convmtx_size(v, m + 1)
bi, bj = convmtx_size(u, k + 1)
di, dj = convmtx_size(w, m + 1)
fi, fj = convmtx_size(u, j + 1)
ci, cj = ai, fj
ei, ej = di, bj
M[1,1:m+1] = h'
convmtx!(view(M, 1 .+ (1:ai), 1:aj), v, m + 1)
convmtx!(view(M, 1 .+ (1:bi), aj .+ (1:bj)), u, k + 1)
convmtx!(view(M, (1 + ai) .+ (1:di), 1:dj), w, m + 1)
convmtx!(view(M, (1 + ci) .+ (1:fi), dj + ej .+ (1:fj)), u, j + 1)
end
## compute F(u,v,w) - [β; p; q] = [h'u - β; u*v - p; u*w - q]
function Fmp!(b, γ, p, q, uv, uw)
b[1] = γ
for i in 2:2+length(p)-1
j = i - 1
b[i] = uv[j] - p[j]
end
for i in 2+length(p):length(b)
j = i - 1 - length(p)
b[i] = uw[j] - q[j]
end
return nothing
end
# norm( [u ⊗ v; u ⊗ w] .- [p; q], 2)
function residual_error(p::Vector{T}, q, uv, uw) where {T}
tot = zero(T)
for (pᵢ, uvᵢ) in zip(p,uv)
tot += (pᵢ-uvᵢ)^2
end
for (qᵢ, uwᵢ) in zip(q, uw)
tot += (qᵢ-uwᵢ)^2
end
sqrt(tot)
end
## --------------------------------------------------
## utils
_degree(p::Vector) = length(p) - 1
"""
conv(p,q)
Convolution of two vectors (or product of two polynomials).
Also `⊗`
"""
function conv(p::Vector{T}, q::Vector{T}) where {T}
R = promote_type(T,eltype(q))
n = length(p)-1
m = length(q)-1
a = zeros(R, m+n+1)
@inbounds for i = 1:n+1
for j = 1:m+1
a[i+j-1] += p[i] * q[j]
end
end
a
end
⊗(p::Vector{T}, q::Vector{T}) where {T} = conv(p,q)
"""
convmtx(v, n::Int)
convmtx!(M,v,n)
convmtx_size(v,n)
Convolution matrix.
C = convmtx(v,n) returns the convolution matrix C for a vector v.
If q is a column vector of length n, then C*q is the same as conv(v,q).
"""
function convmtx!(C, v::AbstractVector{T}, n::Int) where T
# Form C as the Toeplitz matrix
# C = Toeplitz([v; zeros(n-1)], [v[1]; zeros(n-1)]); put Toeplitz code inline
nv = length(v)-1
@inbounds for j = 1:n
C[j:j+nv,j] = v
end
nothing
end
convmtx_size(v::AbstractVector, n) = (n + length(v) - 1, n)
function convmtx(v::AbstractVector{T}, n::Int) where {T}
C = zeros(T, convmtx_size(v, n)...)
convmtx!(C, v, n)
C
end
end
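A sketch of calling the revised interface on the n = 12 example from earlier in the thread (scaling by the norm is my addition here, in line with the remark above that scaling is left to the caller):
using Polynomials, LinearAlgebra
x = Polynomial([0.0, 1.0]); n = 12;
p = coeffs((x-1)^n*(x-2)^n*(x-3));
q = coeffs((x-1)*(x-2)*(x-4));
out = NGCD.ngcd(p/norm(p), q/norm(q); verbose=true);
roots(Polynomial(out.u)) # hoped-for roots: ≈ 1.0 and 2.0
out.Θ, out.κ # residual estimate Θᵏ and condition number estimate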
One issue which could allow a last refinement of computational efficiency is the iterative refinement phase. Here the employed matrix JF has the block structure [* 0 0; * * 0; * 0 *] and corresponds to the targeted solution [u; v; w] and right-hand side Δf = [h'd; v*d; w*d] - [β; p; q]. The matrix JF has only a small number of trailing zeros in its first columns, so a QR-decomposition-based solution benefits only a little from this structure. It would increase the efficiency of solving the equation Jh*Δz = Δf if JF were generated with the alternative block structure [* 0 *; 0 * *; 0 0 *], which corresponds to [v; w; u] and Δf = [v*d; w*d; h'd] - [p; q; β]. The reason for the gain in efficiency is explained here. Finally, to get the expected benefit, instead of Δf .= A \ b you could use Δf .= qr(A) \ b, which exploits the zero structure. I wonder if the gain in efficiency is worth restructuring your implementation.
If helpful, here is the last version of my implementation of the updating phase:
function gcdvwupd(a::Vector{T}, b::Vector{T}, d::Vector{T}, v::Vector{T}, w::Vector{T}; maxnit::Int = 10) where {T <: Number}
# Iterative refinement of the accuracy of the greatest common divisor of two polynomials.
# gcdvwupd(a, b, d, v, w; maxnit = 10) -> (dupd, vupd, wupd, δ)
# Given the polynomials a and b, and a triple of polynomials (d, v, w) such that
# `d` is an approximation of a greatest common divisor of `a` and `b`, and
# v and w are approximate quotients of a/d and b/d, respectively, compute the updated triple
# (dupd, vupd, wupd) which minimizes the accuracy estimate δ = norm([a-dupd*vupd; b-dupd*wupd]).
# A maximum of maxnit iterations is performed using the Gauss-Newton iteration method,
# in the form proposed in [1].
#
# [1] Z. Zeng, The numerical greatest common divisor of univariate polynomials,
# in: L. Gurvits, P. Pébay, J.M. Rojas, D. Thompson (Eds.),
# Randomization, Relaxation, and Complexity in Polynomial Equation Solving,
# Contemporary Mathematics, vol. 556, AMS, 2011, pp. 187–217.
na1 = length(a)
nb1 = length(b)
nd1 = length(d)
nv1 = length(v)
nw1 = length(w)
na1 == nd1+nv1-1 || error("Incompatible dimensions between a, d and v")
nb1 == nd1+nw1-1 || error("Incompatible dimensions between b, d and w")
h = d/norm(d)^2
f = [a;b;one(T)]
fh = [conv(v,d); conv(w,d); dot(h,d)]
Δf = f-fh;
δ = norm(Δf)
d0 = copy(d); v0 = copy(v); w0 = copy(w);
nvw1 = nv1+nw1
ndvw1 = nd1+nvw1
nab1 = na1+nb1
Jh = [zeros(nab1,ndvw1); zeros(1,nvw1) h' ]
j1 = 1:nv1
j2 = nv1+1:nvw1
j3 = nvw1+1:ndvw1
i1 = 1:nv1+nd1-1
i2 = nv1+nd1:nvw1+2*nd1-2
Jh11 = view(Jh,i1,j1)
Jh22 = view(Jh,i2,j2)
Jh13 = view(Jh,i1,j3)
Jh23 = view(Jh,i2,j3)
for it = 1:maxnit
# Jh = [ convmtx(d0,nv1) 0 convmtx(v0,nd1) ]
# [ 0 convmtx(d0,nw1) convmtx(w0,nd1) ]
# [ 0 0 h' ]
Jh11[:,:] = convmtx(d0,nv1)
Jh22[:,:] = convmtx(d0,nw1)
Jh13[:,:] = convmtx(v0,nd1)
Jh23[:,:] = convmtx(w0,nd1)
# Δz = Jh\Δf; # this does not exploit the zero structure
# Δz = qrsolve!(copy(Jh),copy(Δf)); # this is probably the fastest
Δz = qr(Jh)\Δf; # this is somewhat slower, but still does the job
v1 = v0 + Δz[j1];
w1 = w0 + Δz[j2];
d1 = d0 + Δz[j3];
fh1 = [conv(v1,d1); conv(w1,d1); dot(h,d1)];
Δf1 = f-fh1;
δ1 = norm(Δf1)
if δ1 < δ
d0 = copy(d1); v0 = copy(v1); w0 = copy(w1); δ = copy(δ1); Δf = copy(Δf1);
else
break
end
end
return d0, v0, w0, δ
end
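A sketch of driving this refinement step with a deliberately perturbed initial triple (the polynomials and perturbation are my illustration; conv and convmtx as defined earlier, plus norm and dot from LinearAlgebra, are assumed to be in scope):
using Polynomials, LinearAlgebra
x = Polynomial([0.0, 1.0]);
a = coeffs((x-1)^2*(x-2)); b = coeffs((x-1)*(x-3));
d = coeffs(x - 1) .+ 1e-8; # perturbed guess; the exact gcd is x - 1
v = coeffs((x-1)*(x-2)); w = coeffs(x - 3);
dupd, vupd, wupd, δ = gcdvwupd(a, b, d, v, w; maxnit = 10);
δ # accuracy estimate after refinement; should be smaller than for the initial triple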
Somewhat faster is to use the line commented out, which calls the following function:
function qrsolve!(A::AbstractMatrix{T},b::AbstractVector{T}) where {T}
# Fast least-squares solver for full column rank Hessenberg-like matrices
Base.require_one_based_indexing(A)
m, n = size(A)
m < n && error("Column dimension exceeds row dimension")
_, τ = LinearAlgebra.LAPACK.geqrf!(A)
T <: Complex ? tran = 'C' : tran = 'T'
LinearAlgebra.LAPACK.ormqr!('L',tran,A,τ,view(b,:,1:1))
return UpperTriangular(triu(A[1:n,:]))\b[1:n]
end
Thanks! Let me make that adjustment. Hopefully, I'll have the PR ready by today.
A small correction of the timings: the qr(A)\b form is faster for full matrices (because it uses a blocked algorithm), but not for triangular and Hessenberg-like matrices:
n = 1000; m = 700;
A = rand(n,m); A10 = triu(A,-10); At = triu(A); b = rand(n);
@time qr(A)\b;
@time qr(At)\b;
@time qr(A10)\b;
Timings:
julia> @time qr(A)\b;
0.070456 seconds (18 allocations: 9.483 MiB)
julia> @time qr(At)\b;
0.064726 seconds (18 allocations: 9.483 MiB)
julia> @time qr(A10)\b;
0.070714 seconds (18 allocations: 9.483 MiB)
The qrsolve!-based variant is faster for triangular and Hessenberg-like shapes:
n = 1000; m = 700;
A = rand(n,m); A10 = triu(A,-10); At = triu(A); b = rand(n);
@time qrsolve!(A,b);
@time qrsolve!(At,b);
@time qrsolve!(A10,b);
Timings:
julia> @time qrsolve!(A,b);
0.094381 seconds (18 allocations: 7.697 MiB)
julia> @time qrsolve!(At,b);
0.028213 seconds (18 allocations: 7.697 MiB)
julia> @time qrsolve!(A10,b);
0.028316 seconds (18 allocations: 7.697 MiB)
The A\b form is the slowest:
n = 1000; m = 700;
A = rand(n,m); A10 = triu(A,-10); At = triu(A); b = rand(n);
@time A\b;
@time At\b;
@time A10\b;
Timings:
julia> @time A\b;
0.147010 seconds (5.65 k allocations: 13.607 MiB, 6.20% gc time)
julia> @time At\b;
0.129677 seconds (5.57 k allocations: 13.593 MiB)
julia> @time A10\b;
0.130805 seconds (5.65 k allocations: 13.762 MiB)
Therefore, the only remaining option is to use the qrsolve!-based computation.
Just for fun, I played with your example for larger values of n. This is what I got for n = 50, using polynomial long division as a first step, which in this case provides the exact solution (without any tolerance-based decision):
x = Polynomial([0,1]);
n = 50;
p1 = (x-1)^n*(x-2)^n*(x-3);
q1 = (x-1)*(x-2)*(x-4);
p = coeffs(p1); q = coeffs(q1);
d1, r1 = divrem(p1/norm(p),q1/norm(q));
roots(Polynomial(r1[0:2]))
The almost exact result is contained in r1[0:2]:
julia> roots(Polynomial(r1[0:2]))
2-element Array{Float64,1}:
0.9999999999999996
2.000000000000001
I admit I was surprised to see
julia> r1[0:2]
3-element Array{Float64,1}:
-1.4396085506034197e51
2.1594128259051298e51
-7.198042753017099e50
and also to see a very good backward error:
norm(coeffs(p1/norm(p)-d1*(q1/norm(q))-r1))/norm(r1[0:2])
9.95555419410239e-19
So, I wonder whether, in the case when there is a large gap between the degrees of the two polynomials, it would be a good idea to perform such a first step and then work as usual with much smaller dimensions. In such cases, it would make sense to focus on obtaining (somehow) the remainder with high accuracy (this computation is, according to Zeng, ill-conditioned).
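In other words, since p = d*q + r implies gcd(p, q) = gcd(q, r), one long-division step would replace (p, q) by the much smaller pair (q, r). A rough sketch of such a preprocessing step (the function name and the routing to ngcd are hypothetical):
using Polynomials, LinearAlgebra
# hypothetical: one long-division step before the numerical gcd,
# exploiting gcd(p, q) = gcd(q, r) for p = d*q + r
function ngcd_reduced(p1::Polynomial, q1::Polynomial; kwargs...)
d1, r1 = divrem(p1/norm(coeffs(p1)), q1/norm(coeffs(q1)))
NGCD.ngcd(coeffs(q1), coeffs(r1); kwargs...) # degrees are now at most degree(q1)
end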
Interesting idea. I was just about to push a PR. I have added it in, but some suggestions on a heuristic for when to call it are needed. (Also a name; I just added a prime for now.) I was ready to just let that example go, as other cases seemed to be performing well. In the PR I scale by the norm by default. That seems to give much better success.
Hi @andreasvarga - You can work with this PR or just copy and paste the ngcd.jl file to play with this. As usual, any feedback is most welcome.
The following randomly generated example fails to produce a correct GCD:
I wonder if adding atol and rtol as keyword parameters to gcd would be an option to enhance this function.
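Something along these lines, perhaps; this is a hypothetical sketch only (the routing through ngcd is one possible choice, not the current gcd method):
using Polynomials
# hypothetical keyword interface for gcd, not the current method in Polynomials
function gcd(p::Polynomial{T}, q::Polynomial{T};
atol::Real = zero(real(T)),
rtol::Real = Base.rtoldefault(real(T))) where {T}
out = NGCD.ngcd(coeffs(p), coeffs(q); atol = atol, rtol = rtol)
Polynomial(out.u)
end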