Closed: andreasvarga closed this issue 4 years ago.
The gcd calculation bumps into floating point issues quite quickly. If you have some ideas how to modify the current algorithm (the truncate call perhaps?) why not try them out and put in a PR. I'm not sure how to do so. At one point I tried implementing an approximate gcd algorithm based on https://doi.org/10.1090/S0025-5718-04-01692-8, but that code has some bitrot now and doesn't seem to help here.
function gcd1(a::Vector{T}, b::Vector{S}; atol::Real = 0, rtol::Real = Base.rtoldefault(real(T)),) where {T <: Number, S <: Number }
# Greatest common divisor of two polynomials.
# d = gcd1(a,b; atol = 0, rtol)
# Compute the greatest common divisor `d` of two polynomials `a` and `b` using Blankinship's algorithm.
na1 = findlast(!iszero,a) # degree(a) + 1
na1 === nothing && (na1 = 0)
nb1 = findlast(!iszero,b) # degree(b) + 1
nb1 === nothing && (nb1 = 0)
R = eltype(one(T)/one(S))
na1 <= 0 && (return ones(R,1))
nb1 <= 0 && (return ones(R,1))
a1, b1, = promote(a[1:na1], b[1:nb1], R[])
max_coeff = max(maximum(abs.(a1)),maximum(abs.(b1)))
tol = max_coeff * rtol + atol
while na1 >= 1
na1 < nb1 && ((a1, b1, na1, nb1) = (b1, a1, nb1, na1))
@inbounds for i in na1:-1:nb1
s = -a1[i] / b1[nb1]
@inbounds for j in 1:nb1-1
a1[i-nb1+j] += b1[j] * s
end
end
na1 = findlast(t -> (abs(t) > tol),a1[1:nb1-1])
na1 === nothing && (na1 = 0)
a1 = a1[1:na1]
end
return nb1 == 1 ? [one(R)] : b1
end
For my own purposes to manipulate rational matrices, I implemented a version of `gcd` inspired by your code in `divrem` and an earlier implementation which I had in MATLAB. If you think it is helpful, you can include this code or part of it into the `gcd` routine in `Polynomials`. My code works only on the coefficients.
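A quick sanity check of the code above (a hypothetical session; coefficients are in ascending order, and the result is a GCD only up to a scalar multiple):

gcd1([2.0, 3, 1], [6.0, 5, 1])  # (x+1)(x+2) and (x+2)(x+3)
# should return [-4.0, -2.0], i.e. -2*(x+2), a scalar multiple of the exact GCD x+2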
Thanks! Let me have a look. At first glance it looks similar to using `chop` in place of `truncate`, which makes sense.
I implemented two new prototype GCDs, which produce better-accuracy GCDs.
The first is just an enhanced Euclidean algorithm, using some internal scaling:
using LinearAlgebra # for norm and rank below (and LAPACK in gcd2)

function gcd1(a::Vector{T}, b::Vector{S}; atol::Real = 0, rtol::Real = Base.rtoldefault(float(real(T)))) where {T <: Number, S <: Number }
# Greatest common divisor of two polynomials.
# d = gcd1(a,b; atol = 0, rtol)
# Compute the greatest common divisor `d` of two polynomials `a` and `b` using
# the Euclidean algorithm with scaling as in [1].
# [1] M.-T. Noda and T. Sasaki. Approximate GCD and its application to ill-conditioned
# algebraic equations. J. Comput. Appl. Math., 38:335–351, 1991.
na1 = findlast(!iszero,a) # degree(a) + 1
na1 === nothing && (na1 = 0)
nb1 = findlast(!iszero,b) # degree(b) + 1
nb1 === nothing && (nb1 = 0)
R = eltype(one(T)/one(S))
na1 <= 0 && (return ones(R,1))
nb1 <= 0 && (return ones(R,1))
a1, b1, = promote(a[1:na1], b[1:nb1], R[])
a1 = a1/norm(a1)
b1 = b1/norm(b1)
tol = atol == 0 ? rtol : atol
# determine the degree of GCD as the nullity of the Sylvester matrix
# this computation can be replaced by simply setting nd = 1, in which case the Sylvester matrix is not formed
nd = na1 + nb1 - 2 - rank([convmtx(a1,nb1-1) convmtx(b1,na1-1)], atol = tol)
nd == 0 && (return [one(R)])
while na1 > nd
na1 < nb1 && ((a1, b1, na1, nb1) = (b1, a1, nb1, na1))
sc = 1
@inbounds for i in na1:-1:nb1
s = -a1[i] / b1[nb1]
sc = max(sc,abs(s))
@inbounds for j in 1:nb1-1
a1[i-nb1+j] += b1[j] * s
end
end
a1 = a1/sc
na1 = findlast(t -> (abs(t) > tol),a1[1:nb1-1])
na1 === nothing && (na1 = 0)
a1 = a1[1:na1]
end
return nb1 == 1 ? [one(R)] : b1
end
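To illustrate the degree estimate via the nullity of the Sylvester matrix (a hypothetical check; `convmtx` is the helper defined below, and the example values are assumptions for illustration):

using LinearAlgebra
a1 = [2.0, 3, 1] / norm([2.0, 3, 1])  # (x+1)(x+2), normalized
b1 = [6.0, 5, 1] / norm([6.0, 5, 1])  # (x+2)(x+3), normalized
S = [convmtx(a1, 2) convmtx(b1, 2)]   # 4x4 Sylvester-type matrix
nd = 3 + 3 - 2 - rank(S)              # nullity of S = degree of the GCD = 1 here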
The second function is usually more robust and additionally computes the factors `v = a/d` and `w = b/d`.
function gcd2(a::Vector{T}, b::Vector{S}; atol::Real = 0, rtol::Real = Base.rtoldefault(float(real(T)))) where {T <: Number, S <: Number }
# Greatest common divisor of two polynomials.
# gcd2(a,b; atol = 0, rtol) -> (d, v, w)
# Compute the greatest common divisor `d` of two polynomials `a` and `b`, and
# the polynomials v and w such that a = d * v and b = d * w, using an SVD-based method
# adapted from [1] (without iterative accuracy refinement).
# [1] Z. Zeng, The numerical greatest common divisor of univariate polynomials,
# in: L. Gurvits, P. Pébay, J.M. Rojas, D. Thompson (Eds.),
# Randomization, Relaxation, and Complexity in Polynomial Equation Solving,
# Contemporary Mathematics, vol.556, AMS, 2011, pp.187–217.
na1 = findlast(!iszero,a) # degree(a) + 1
na1 === nothing && (na1 = 0)
nb1 = findlast(!iszero,b) # degree(b) + 1
nb1 === nothing && (nb1 = 0)
R = eltype(one(T)/one(S))
a1, b1, = promote(a[1:na1], b[1:nb1], R[])
na1 <= 0 && (return ones(R,1), a1, b1)
nb1 <= 0 && (return ones(R,1), a1, b1)
tol = atol == 0 ? rtol*max(maximum(abs.(a1)), max(1, maximum(abs.(b1)))) : atol
# determine the degree of GCD as the nullity of the Sylvester matrix
nd = na1 + nb1 - 2 - rank([convmtx(a1,nb1-1) convmtx(b1,na1-1)], atol = tol)
nd == 0 && (return [one(R)], a1, b1)
# determine [w; -v] from an orthogonal/unitary nullspace basis of dimension 1
# of a reduced Sylvester matrix using the last row of Vt from its (partial) SVD
wv = LAPACK.gesvd!('N', 'A', [convmtx(a1,nb1-nd) convmtx(b1,na1-nd)])[3][end,:]
eltype(wv) <: Complex && (wv = conj(wv))
v = -wv[nb1-nd+1:end]
w = wv[1:nb1-nd]
# determine the GCD d with high accuracy as the solution of a well-conditioned
# linear least-squares problem
d = convmtx(w,nd+1)\b1
return d, v, w
end
Both functions call `convmtx` to construct an appropriate convolution matrix (a similar function is also available in MATLAB).
function convmtx(v::AbstractVector{T}, n::Int) where T
# Convolution matrix.
# C = convmtx(v,n) returns the convolution matrix C for a vector v.
# If q is a column vector of length n, then C*q is the same as conv(v,q).
# Form C as the Toeplitz matrix
# C = Toeplitz([v; zeros(n-1)], [v[1]; zeros(n-1)]); the Toeplitz construction is inlined below
nv = length(v)-1
C = zeros(T, n+nv, n)
@inbounds for j = 1:n
C[j:j+nv,j] = v
end
return C
end
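As a round-trip check of `gcd2` (a hypothetical session; `conv` denotes plain polynomial multiplication, e.g. the helper posted later in this thread):

d, v, w = gcd2([2.0, 3, 1], [6.0, 5, 1])  # (x+1)(x+2) and (x+2)(x+3)
# d is proportional to [2, 1], i.e. to the exact GCD x + 2, and
conv(d, v) ≈ [2.0, 3, 1]  # a == d*v
conv(d, w) ≈ [6.0, 5, 1]  # b == d*w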
I performed a lot of tests with large-order randomly generated examples and several examples from the literature. For my purposes I am using `gcd2`, which is more accurate and better suited for LCM computation.
Nice! I can package this into a PR if you want, or you can. Just let me know.
I would appreciate it if you could do it.
Do you have any thoughts on this naming scheme:
- For gcd1: add a `scale=true` keyword argument to `gcd` to call gcd1. Maybe this should be `scale=:noda_sasaki` in case there are other scalings?
- For gcd2: call this `uvgcd`, following Zeng. (This returns more than `d`, so it should have a different name.)
- Or, maybe have a `uvgcd` and a keyword argument to `gcd` to distinguish returning the Euclidean algorithm, the EA with scaling, or the `d` from Zeng.

--J
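For concreteness, one hypothetical shape for such an entry point (the name `pgcd`, the `method` keyword, and the symbol values are all illustrative, not a settled API):

function pgcd(p::Vector, q::Vector; method::Symbol = :euclidean, kwargs...)
    method == :euclidean && return gcd1(p, q; kwargs...)  # scaled Euclidean (Noda-Sasaki)
    method == :uvgcd && return gcd2(p, q; kwargs...)[1]   # Zeng-style; keep only d
    throw(ArgumentError("unknown method $method"))
end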
I had this code in `PolynomialZeros` that is related. It needed some updating for the newer Polynomials. It works a little harder to identify the degree when there are multiplicities. Would you mind kicking it around to see if it works for your use cases? If so, it would be nice to merge the two.
module AGCD
using Polynomials
using LinearAlgebra
#using SparseArrays # didn't get boost I thought we might with JF
## This provides AGCD.agcd for finding an *approximate* GCD of two polynomials. The most common use in this package
## is to reduce a polynomial `p` to a square free polynomial `q=p/gcd(p, p')`.
## This isn't too accurate for higher order polys, but
## seems to be okay now for degree 20 or less
## Can we do better?
## Follow up on these:
# * [Winkler&Hasan](http://ac.els-cdn.com/S0377042712003135/1-s2.0-S0377042712003135-main.pdf?_tid=126266d0-ea31-11e6-a1a7-00000aab0f6b&acdnat=1486140896_2119601f37cf58ff0d3eefbc05fecbb4)
# * [LiLiuZhi](http://www.sciencedirect.com/science/article/pii/S0377042707000271?np=y&npKey=f5e3b7a1bc9583214348aad066170ea389212ebba443454938cacfdbf96177f8)
# * [Winkler](http://ml.dcs.shef.ac.uk/summer_school/ln.pdf)
# * [Matonardi&...](http://www.sciencedirect.com/science/article/pii/S0024379515005649)
# * [FarmerPhD](http://www.learninglink.bbk.ac.uk/site/assets/files/1025/farmer.pdf)
# * [WinklerLaoHasan](http://www.sciencedirect.com/science/article/pii/S0377042712000751)
# * [Sagraloff&...](https://arxiv.org/pdf/1308.4088v2.pdf)
# * [](https://arxiv.org/pdf/1207.0630v3.pdf)
## ------------------------
# we use vectors, not polynomials. Some of this if copied and modified from Polynomials.jl
# originally by @vtjnash
_degree(p::Vector) = length(p) - 1
function _polyval(p::Vector{T}, x::S) where {T, S}
R = promote_type(T,S)
y = convert(R, p[end])
for i in lastindex(p)-1:-1:1
y = p[i] + x * y
end
y
end
function _monic!(ps::Vector{T}) where T
lambda = iszero(ps[end]) ? one(T) : 1 / ps[end]
ps .= ps .* lambda
end
function _monic(p::Vector)
p1 = copy(p)
_monic!(p1)
p1
end
function _polymul(p::Vector{T}, q) where {T}
R = promote_type(T,eltype(q))
n = length(p)-1
m = length(q)-1
a = zeros(R, m+n+1)
for i = 1:n+1
for j = 1:m+1
a[i+j-1] += p[i] * q[j]
end
end
a
end
function _polyder(p::Vector{T}) where {T}
S = eltype(float(p[1]))
n = length(p)
a2 = Vector{S}(undef, n-1)
for i = 1:n-1
a2[i] = p[i+1] * i
end
a2
end
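# For orientation, these helpers act on ascending coefficient vectors,
# e.g. (hypothetical session):
#   _degree([2.0, 3, 1])          # 2, the degree of 2 + 3x + x^2
#   _polyval([2.0, 3, 1], 1.0)    # 6.0, Horner evaluation at x = 1
#   _polymul([1.0, 1], [2.0, 1])  # [2.0, 3.0, 1.0], i.e. (1 + x)(2 + x)
#   _polyder([2.0, 3, 1])         # [3.0, 2.0], the derivative 3 + 2x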
## ------------------------
## Special matrices
## p = [p_0, p_1, ..., p_n]
function cauchy_matrix_size(p::Vector, k::Integer)
n = length(p)
(n + k - 1, k)
end
function cauchy_matrix!(M, p::Vector{T}, k::Integer) where {T}
n = length(p)
for i in 1:k
M[(1:n) .+ (i-1), i] = p
end
end
function cauchy_matrix(p::Vector{T}, k::Integer) where {T}
n,m = cauchy_matrix_size(p, k)
out = zeros(T, n, m)
cauchy_matrix!(out, p, k)
out
end
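# Note: cauchy_matrix builds the same convolution matrix as convmtx earlier
# in the thread, so multiplying by it is polynomial multiplication.
# A quick hypothetical check:
#   q = [5.0, 7]
#   cauchy_matrix([1.0, 2, 1], 2) * q ≈ _polymul([1.0, 2, 1], q)  # true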
## Jacobian F(u,v,w) = [p,p'] is J(u,v,w)
function JF_size(u, v, w)
m, k, j = _degree(u), _degree(v), _degree(w)
n, l = m + k, m + j
ai, aj = cauchy_matrix_size(v, m + 1)
bi, bj = cauchy_matrix_size(u, k + 1)
di, dj = cauchy_matrix_size(w, m + 1)
fi, fj = cauchy_matrix_size(u, j + 1)
ci, cj = ai, fj
ei, ej = di, bj
(1 + ai + di, aj + bj + cj)
end
function JF!(M, u::Vector{T}, v, w) where {T}
m, k, j = _degree(u), _degree(v), _degree(w)
n, l = m + k, m + j
# JF_size should return these
ai, aj = cauchy_matrix_size(v, m + 1)
bi, bj = cauchy_matrix_size(u, k + 1)
di, dj = cauchy_matrix_size(w, m + 1)
fi, fj = cauchy_matrix_size(u, j + 1)
ci, cj = ai, fj
ei, ej = di, bj
M[1,1] = one(T)
cauchy_matrix!(view(M, 1 .+ (1:ai), 1:aj), v, m + 1)
cauchy_matrix!(view(M, 1 .+ (1:bi), aj .+ (1:bj)), u, k + 1)
# M[1 + (1:ci), aj + bj + (1:cj)] = zeros(T, n+1, m+1-j)
cauchy_matrix!(view(M, 1 + ai .+ (1:di), 1:dj), w, m + 1)
# M[1 + bi + (1:ei), dj + (1:ej)] = zeros(T, m + 1, n + 1 - j)
cauchy_matrix!(view(M, 1 + ci .+ (1:fi), dj + ej .+ (1:fj)), u, j + 1)
end
function JF(u::Vector{U}, v::Vector{V}, w::Vector{W}) where {U,V,W}
R = promote_type(U,V, W)
M = zeros(R, JF_size(u, v, w)...)
JF!(M, u, v, w)
M
end
# eqn (12) to get weights
function _weights(p::Vector{T},q) where {T}
n, m = length(p), length(q)
wts = ones(T, 1 + n + m)
for (i,pj) in enumerate(p)
wts[1+i] = min(one(real(T)), inv(abs(pj)))
end
for (i,qj) in enumerate(q)
wts[1 + n + i] = min(one(real(T)), inv(abs(qj)))
end
wts
end
# reduce residual by Gauss-Newton algorithm
# updates u,v,w; returns residual error and flag for convergence
function reduce_residual!(u,v,w, p::Vector{T}, q, wts, ρ) where {T}
m, n, l = map(_degree, (u, v, w))
# preallocate
#R = big(T)
R = T
A = zeros(R, JF_size(u, v, w)...)
b = zeros(R, 1 + length(p) + length(q))
inc = zeros(R, m + n + l + 3) #weighted_least_square(A, b, wts)
up,vp,wp = copy(u),copy(v),copy(w)
ρm, ρm_1 = one(T), residual_error(p,q,u,v,w, wts)
MAXSTEPS = 10
ctr = 0
flag = :not_converged
rho_0 = ρm_1
while ctr <= MAXSTEPS
ρm = agcd_update!(p, q, u,v,w, m, n, A, b, inc, up, vp, wp, wts, ρm_1)
if ρm < ρm_1
ctr += 1
ρm_1 = ρm
else
break
end
end
# threshold here is really important, but not clear what it should be.
# We empirically get better results---it seems---with sqrt(norm(p,2))
# though the recommendation is norm(u,2); might also make sense to include
# q in the computation
if ρm <= ρ * norm(u,2)
flag = :converged
end
if ρm < rho_0
end
return (ρm, flag)
end
## Lemma 4.1, (25)
## solve weighted least squares problem W*(Ax - b) = 0
function weighted_least_square(A, b, w)
W = diagm(0 => w)
(W * A) \ (W * b)
end
function weighted_least_square!(M, A, b, w)
W = diagm(0 => w)
M[:] .= (W * A) \ (W * b)
end
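# Aside: the dense diagm above could be replaced by Diagonal (already in scope
# via the module's `using LinearAlgebra`), which applies the row weights without
# materializing W. An equivalent sketch, with the hypothetical name
# weighted_least_square_diag:
weighted_least_square_diag(A, b, w) = (Diagonal(w) * A) \ (Diagonal(w) * b)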
## compute F(u,v,w) - [p, p'] = [u*v, u*w] - [p, p']
Fmp(p,q,u,v,w) = [u[end]; _polymul(u,v); _polymul(u,w)] .- [1; p; q]
function Fmp!(b, p,q,u,v,w)
b[1] = u[end] - 1
offset = 1
for (i,v) in enumerate(_polymul(u,v) .- p)
b[offset + i] = v
end
offset = 1 + length(p)
for (i,v1) in enumerate(_polymul(u,w) .- q)
b[offset + i] = v1
end
end
residual_error(p,q,u,v,w, wts=ones(1 + length(p) + length(q))) = norm( ([_polymul(u,v); _polymul(u,w)] .- [p; q]) .* wts[2:end], 2)
### refine u,v,w estimate using
## Newton's method and weighted least squares
# updates in place u,v,w
function agcd_update!(p, q, u,v,w,m,n, A, b, inc, up, vp, wp, wts, err0)
JF!(A, u,v,w)
Fmp!(b, p,q,u,v,w)
weighted_least_square!(inc, A, b, wts)
Δu = inc[1:(1+m)]
Δv = inc[(m+2):(m+n+2)]
Δw = inc[(m+n+3):end]
up .= u; vp .= v; wp .= w
up .-= Δu; _monic!(up)
vp .-= Δv; _monic!(vp)
wp .-= Δw; _monic!(wp)
err = residual_error(p,q,up,vp,wp, wts)
# update if we improve
if err < err0
u .= up; v .= vp; w .= wp
else
err = err0
end
# return error estimate
err
end
# used below
function _mult(Gs, A)
for G in Gs
A = G * A
end
A
end
function qr_sylvester!(Gs, A0, k)
A = copy(A0)
m,n = size(A)
for i in m:-1:2
Gi, r = givens(A[1,1],A[i,1], 1, i)
A = Gi * A
push!(Gs, Gi) # [G1, G2, ..., Gn]
end
for i in m:-1:3
Gi, r = givens(A[2,2],A[i,2], 2, i)
A = Gi * A
push!(Gs, Gi)
end
return A[1:2,1:2]
end
function qr_sylvester!(Gs, A0, k, R)
T = eltype(A0)
N = (k-1)
Zs = zeros(T, N)
B = vcat(hcat(Zs, Zs), A0)
B = _mult(Gs, B)
m,n = size(B)
for i in m:-1:(2k)
Gi, r = givens(B[2(k-1)+1, 1], B[i, 1], 2(k-1)+1, i)
B = Gi * B
push!(Gs, Gi)
end
for i in m:-1:(2k+1)
Gi, r = givens(B[2k, 2], B[i, 2], 2k, i)
B = Gi * B
push!(Gs, Gi)
end
return [vcat(R, zeros(T, 2, 2*(k-1))) B[1:(2k),:]]
end
# return sigma, x
# converge on smallest eigenvalue of (A'*A) using power rule
# A=QR; (A'*A)^-1) = (R'*Q'*Q*R)^(-1) = (R'*R)^(-1)
# instead of computing x_{i+1} = (R'*R)^{-1} xi; we solve R'y=xi; Rz=y;x=z/|z|
function smallest_eigval(R::LinearAlgebra.UpperTriangular{T},
thresh=sqrt(eps(real(T)))) where {T}
k = size(R)[1]/2
if iszero(det(R))
return (:iszero, zero(T), T[])
end
m,n = size(R)
x = ones(T, n)
y = zeros(T, m)
z = zeros(T, n)
σ, σ1 = Inf*one(real(T)), Inf*one(real(T))
flag = :ispositive
for cnt in 1:10
y .= R' \ x
z .= R \ y
nz = 1/norm(z,2)
x .= z .* nz
sigma = norm(R * x, 2)
σ1 = abs(sigma)
if σ1 < 0.99 * σ
σ = σ1
continue
end
if σ1 < thresh
flag = :ispossible
break
end
if abs(σ - σ1) < 1.1 * σ1
flag = :ispossible
break
end
σ = σ1
end
return (flag, σ1, x)
end
## return k, sigma for possible k values
function sylvester_matrix_singular_values(p::Polynomial, q::Polynomial=polyder(p);kwargs...)
sylvester_matrix_singular_values(Polynomials.coeffs(p), Polynomials.coeffs(q); kwargs...)
end
function sylvester_matrix_singular_values(ps::Vector{T},
qs::Vector{S}=_polyder(ps),
θ=sqrt(eps(real(T)))) where {T,S}
_monic!(ps); _monic!(qs)
n, m = length(ps), length(qs)
wts = _weights(ps,qs)
R = promote_type(T,S)
A0::Matrix{R} = R[ps vcat(zeros(T, n-m),qs)]
Gs = LinearAlgebra.Givens{T}[]
k = 1
sigmas = R[]
R = qr_sylvester!(Gs, A0, k)
local x::Vector{T}
while k <= m-1
V = UpperTriangular(R)
flag, sigma, x = smallest_eigval(V, θ * norm(ps,2))
push!(sigmas, sigma)
k += 1
R = qr_sylvester!(Gs, A0, k, R)
end
return [1:(k-1) sigmas]
end
# return R for sylvester matrix of size k
function sylvester_matrix_k(ps::Vector{T},
K) where {T}
qs = _polyder(ps)
_monic!(ps); _monic!(qs)
n, m = length(ps), length(qs)
wts = _weights(ps,qs)
A0::Matrix{T} = [ps vcat(zeros(T, n-m),qs)]
Gs = LinearAlgebra.Givens{T}[]
R = qr_sylvester!(Gs, A0, 1)
k = 2
while k <= K
R = qr_sylvester!(Gs, A0, k, R)
k += 1
end
V = UpperTriangular(R)
end
"""
agcd(p, q, θ=sqrt(eps(real(T))), ρ=cbrt(eps(real(T))) * θ)
Find an approximate GCD for polynomials `p` and `q` using an algorithm of [Zeng](https://doi.org/10.1090/S0025-5718-04-01692-8).
Returns u,v,w, err where:
* `u*v ≈ monic(p)`
* `u*w ≈ monic(q)`
* The total residual error in these approximations is bounded by `err`.
Further,
* `v` and `w` should share no common roots (`u` is a gcd of `u*v` and `u*w`)
* the roots of `v` should exhaust the unique values of roots of `p`.
(If `p` and `q` are specified as vectors, the returned values will be
vectors. If specified as objects of `Polynomial` type then the
returned values will be as well.)
The tolerances are:
* θ: singular value threshold. Used to identify if smallest singular value of Sylvester matrix is ≈ 0. We use `sqrt(eps(T))`, close to that of Zeng's `1e-8`, but more flexible should `BigFloat` values be used.
* ρ: initial residual tolerance. If we can get (u,v,w) error less than this, we stop. Zeng uses `1e-10`, we use as a default the smaller `eps(T)^(5/6)`.
The algorithm looks for the first `k` for which the corresponding
Sylvester matrix is rank deficient. This follows Lemma 2.4 of the paper, which
finds the smallest singular value. In the process, an estimate for
`u`,`v`, and `w` is produced. If this singular value is approximately
0 (as determined with the parameter θ) *and* the estimates of the
decomposition can be refined via a Gauss-Newton process to a close
enough approximation (as determined with the parameter ρ) then the
desired values are found.
The act of identifying the value of `k` depends on determining if the
smallest singular value is 0 which in practice is quite hard. The
simple threshold of `norm(p,2) ⋅ θ` will fail for higher-degree
polynomials. For manually identifying this value, the complete set of
estimated singular values for different values of `k` are returned by
`sylvester_matrix_singular_values`. For a given `k` the refined values
of `u`, `v`, and `w` are returned by `rank_k_agcd`.
"""
function agcd(ps::Vector{T}, qs::Vector{S}=_polyder(ps);
θ=sqrt(eps(real(T))),
ρ=cbrt(eps(real(T))) * θ,
maxk=length(qs)) where {T,S}
_monic!(ps); _monic!(qs)
n, m = length(ps), length(qs)
wts = _weights(ps,qs)
U = promote_type(T,S)
A0::Matrix{U} = U[ps vcat(zeros(U, n-m),qs)]
nm = norm(ps, 2)
thresh = nm * θ ## this is sensitive
Gs = LinearAlgebra.Givens{T}[]
k = 1
R = qr_sylvester!(Gs, A0, k)
local x::Vector{U}
while k <= maxk
V = UpperTriangular(R)
flag, sigma, x = smallest_eigval(V, thresh)
if flag == :iszero
# exact zero means we need to identify initial guess for u,v,w
u = qs
w = ones(U, 1)
v = cauchy_matrix(u, length(ps) - m + 1) \ ps
_monic!(v)
return (u,v,w, zero(U))
end
if flag != :iszero && sigma < thresh
v = x[2:2:end]; _monic!(v)
w = x[(1+2(n-m)):2:end]; _monic!(w)
A = cauchy_matrix(v, n - length(v) + 1)
u = A \ ps
_monic!(u)
ρm, flag = reduce_residual!(u, v, w, ps, qs, wts, ρ)
if flag == :converged || flag == :not_updated
return (u, v, w, ρm)
end
end
# else bump up k
k += 1
if k > m
u = ones(U,1)
w = qs
v = ps
return (u, v, w, zero(U))
end
R = qr_sylvester!(Gs, A0, k, R)
end
# Okay, we gave it a good shot, but didn't find a perfect candidate
# let's move on...
v = x[2:2:end]; _monic!(v)
w = x[3:2:end]; _monic!(w)
A = cauchy_matrix(v, n - length(v) + 1)
u = A \ ps
_monic!(u)
wts = _weights(ps,qs)
ρm, flag = reduce_residual!(u,v,w, ps, qs, wts, ρ)
return (u, v, w, ρm)
end
function agcd(p::Polynomial, q::Polynomial=polyder(p); kwargs...)
u,v,w,rho = agcd(Polynomials.coeffs(p), Polynomials.coeffs(q); kwargs...)
(Polynomial(u), Polynomial(v), Polynomial(w), rho)
end
## --------------
## preconditioning code
## taken from https://who.rocq.inria.fr/Jan.Elias/pdf/je_masterthesis.pdf
## compute q(x) = p(phi*x)
function Tphi!(p, ps::Vector{T}, phi) where {T}
p .= ps .* (phi^(i-1) for i in eachindex(ps))
end
function Tphi(ps::Vector{T}, phi) where {T}
p = copy(ps)
p .= ps .* (phi^(i-1) for i in eachindex(ps))
p
end
## c
## compute q(x) = p(x-alpha)
function T_alpha(ps::Vector{T}, alpha) where {T}
end
function geometric_mean(a::Vector{T})::T where {T}
b = filter(!iszero, a)
n = length(b)
prod(abs(u)^(one(T)/n) for u in b)
end
# compute max(max(ps), max(qs)) / min(min(ps), min(qs))
function ratio(p::Vector{T}, q::Vector{T}) where {T}
m,M = Inf*one(real(T)), zero(real(T))
for ps in (p, q)
for a in ps
aa = abs(a)
if 0 < aa < m
m = aa
end
if aa > M
M = aa
end
end
end
iszero(m) ? Inf*one(T) : M/m
end
## find alpha, gamma that minimize ratio of max coefficients to min coefficients, for getting zeros
## 1.12 writes this as a linear programming problem, we just ...
function precondition(p::Vector{T}, q::Vector) where {T}
S = real(T)
m = ratio(p,q)
alphas = ((2*one(S))^i for i in -5:5)
phis = ((2*one(S))^i for i in -5:5)
out = (one(S), one(S))
for α in alphas
for ϕ in phis
r = ratio(Tphi(p, ϕ), α * Tphi(q, ϕ))
if r < m
out = (α, ϕ)
end
end
end
α, ϕ = out
p = Tphi(p, ϕ)
q = α * Tphi(q, ϕ)
p = p * (1/geometric_mean(p))
q = q * (1/geometric_mean(q))
p, q, ϕ, α
end
end
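A minimal sketch of the intended use, reducing a polynomial to its square-free part (a hypothetical session; assumes the module above has been evaluated and that `polyder` resolves to the derivative, e.g. `Polynomials.derivative` in newer versions):

using Polynomials
p = fromroots([1.0, 1.0, 2.0])  # (x-1)^2 * (x-2), a double root at 1
u, v, w, err = AGCD.agcd(p)     # u ≈ gcd(p, p'), so v ≈ square-free part of p
roots(v)                        # ≈ [1.0, 2.0], the distinct roots of p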
For gcd2 the name could be gcdvw. In the meantime I implemented the iterative refinement too, but without exploiting structure. This is now a separate function called gcdupd, and it performs up to a specified maximum number of iterations (as long as the residual decreases). It would be possible to include it in gcd2.
As for gcd1, this function occasionally produces poor-accuracy results. This is because the long-division algorithm for polynomials is not numerically stable. I don't understand the effect of scaling. Other scaling schemes could also be used.
Will you incorporate the work into gcd2? If so, I can wait on the PR I was working on.
I propose you go ahead with the PR. I still have a small change to include in gcd2 as proposed in the cited paper, namely, to include both v and w in the computation of the common divisor. I will send you the modified single line tomorrow.
The old line in gcd2 is commented out:
#d = convmtx(w,nd+1)\b1
d = [convmtx(v,nd+1) ; convmtx(w,nd+1)] \ [a1; b1]
An issue on which I am still reflecting is related to scaling of the original polynomials. The code works if we replace `a` by `a/sa` and `b` by `b/sb`, where `sa = norm(a)` and `sb = norm(b)`. Such a scaling is meaningful to compensate for big differences in the range of coefficients of `a` and `b`. The resulting `v` and `w` are then scaled factors of `a` and `b`, respectively, with the nice property `norm([v; w]) = 1`. I wonder if it would be wise to include scaling and to provide `v` and `w` as now, instead of `v*sa` and `w*sb`, respectively.
I implemented a version with scaling and optional iterative refinement (still no explicit structure exploitation). Here is the whole collection:
using LinearAlgebra # for norm, dot, rank, and LAPACK below

function poldeg1(v::Vector{T}) where {T <: Number}
# Degree of a polynomial plus one.
n1 = findlast(!iszero,v) # degree(v) + 1
n1 === nothing && (n1 = 0)
return n1
end
poldeg(v::Vector{T}) where {T <: Number} = poldeg1(v)-1 # Degree of a polynomial
function convmtx(v::Vector{T}, n::Int) where {T <: Number}
# Convolution matrix.
# C = convmtx(v,n) returns the convolution matrix C for a vector v.
# If q is a column vector of length n, then C*q is the same as conv(v,q).
# Form C as the Toeplitz matrix
# C = Toeplitz([v; zeros(n-1)], [v[1]; zeros(n-1)]); the Toeplitz construction is inlined below
nv = length(v)-1
C = zeros(T, n+nv, n)
@inbounds for j = 1:n
C[j:j+nv,j] = v
end
return C
end
function polgcdvw(a::Vector{T}, b::Vector{S}; atol::Real = 0, rtol::Real = Base.rtoldefault(float(real(T))), maxnit::Int = 0) where {T <: Number, S <: Number }
# Greatest common divisor of two polynomials.
# polgcdvw(a, b; atol = 0, rtol, maxnit = 0) -> (d, v, w, δ)
# Compute the greatest common divisor `d` of two polynomials `a` and `b`, and
# the polynomials v and w such that a = d * v * ka and b = d * w * kb, where ka and kb
# are scalar scaling factors (ka and kb are not explicitly provided).
# The accuracy estimate δ is computed as δ = norm([a/ka-d*v; b/kb-d*w]).
# A SVD-based method adapted from [1] is employed and an iterative accuracy refinement
# of δ, with a maximum number of maxnit iterations, is additionally performed.
# [1] Z. Zeng, The numerical greatest common divisor of univariate polynomials,
# in: L. Gurvits, P. Pébay, J.M. Rojas, D. Thompson (Eds.),
# Randomization, Relaxation, and Complexity in Polynomial Equation Solving,
# Contemporary Mathematics, vol. 556, AMS, 2011, pp. 187–217.
na1 = poldeg1(a) # degree(a) + 1
nb1 = poldeg1(b) # degree(b) + 1
R = eltype(one(T)/one(S))
a1, b1, = promote(a[1:na1], b[1:nb1], R[])
na1 <= 0 && (return ones(R,1), zeros(R,1), b1, zero(R))
nb1 <= 0 && (return ones(R,1), a1, zeros(R,1), zero(R))
a1 = a1/norm(a1)
b1 = b1/norm(b1)
switch = (na1 < nb1)
switch && ((a1, b1, na1, nb1) = (b1, a1, nb1, na1))
tol = atol == 0 ? rtol : atol
# determine the degree of GCD as the nullity of the Sylvester matrix
nd = na1 + nb1 - 2 - rank([convmtx(a1,nb1-1) convmtx(b1,na1-1)], atol = tol)
nd == 0 && (switch ? (return [one(R)], b1, a1, zero(R)) : (return [one(R)], a1, b1, zero(R)))
# determine [w; -v] from an orthogonal/unitary nullspace basis of dimension 1
# of a reduced Sylvester matrix using the last row of Vt from its (partial) SVD
wv = LAPACK.gesvd!('N', 'A', [convmtx(a1,nb1-nd) convmtx(b1,na1-nd)])[3][end,:]
eltype(wv) <: Complex && (wv = conj(wv))
v = -wv[nb1-nd+1:end]
w = wv[1:nb1-nd]
# determine the GCD d with high accuracy as the solution of a well-conditioned
# linear least-squares problem
#d = convmtx(w,nd+1)\b1
d = [convmtx(v,nd+1) ; convmtx(w,nd+1)] \ [a1; b1]
if maxnit > 0
d, v, w, δ = gcdvwupd(a1, b1, d, v, w, maxnit = maxnit)
else
δ = norm( [a1; b1] - [ conv(v,d); conv(w,d) ])
end
switch ? (return d, w, v, δ) : (return d, v, w, δ)
end
function gcdvwupd(a::Vector{T}, b::Vector{T}, d::Vector{T}, v::Vector{T}, w::Vector{T}; maxnit::Int = 10) where {T <: Number}
# Iterative refinement of the accuracy of the greatest common divisor of two polynomials.
# gcdvwupd(a, b, d, v, w; maxnit = 0) -> (dupd, vupd, wupd, δ)
# Given the polynomials a and b, and a triple of polynomials (d, v, w) such that
# `d` is an approximation of a greatest common divisor of `a` and `b`, and
# v and w are approximate quotients of a/d and b/d, respectively, compute the updated triple
# (dupd, vupd, wupd) which minimizes the accuracy estimate δ = norm([a-dupd*vupd; b-dupd*wupd]).
# A maximum of maxnit iterations is performed using the Gauss-Newton method,
# in the form proposed in [1].
#
# [1] Z. Zeng, The numerical greatest common divisor of univariate polynomials,
# in: L. Gurvits, P. Pébay, J.M. Rojas, D. Thompson (Eds.),
# Randomization, Relaxation, and Complexity in Polynomial Equation Solving,
# Contemporary Mathematics, vol. 556, AMS, 2011, pp. 187–217.
na1 = length(a)
nb1 = length(b)
nd1 = length(d)
nv1 = length(v)
nw1 = length(w)
na1 == nd1+nv1-1 || error("Incompatible dimensions between a, d and v")
nb1 == nd1+nw1-1 || error("Incompatible dimensions between b, d and w")
h = d/norm(d)^2
f = [one(T);a;b]
fh = [dot(h,d); conv(v,d); conv(w,d)]
Δf = f-fh;
δ = norm(Δf)
d0 = copy(d); v0 = copy(v); w0 = copy(w);
nvw1 = nv1+nw1
ndvw1 = nd1+nvw1
nab1 = na1+nb1
Jh = [h' zeros(1,nvw1); zeros(nab1,ndvw1)]
j1 = 1:nd1
j2 = nd1+1:nd1+nv1
j3 = nd1+nv1+1:ndvw1
i2 = 2:na1+1
i3 = na1+2:nab1+1
Jh21 = view(Jh,i2,j1)
Jh31 = view(Jh,i3,j1)
Jh22 = view(Jh,i2,j2)
Jh33 = view(Jh,i3,j3)
for it = 1:maxnit
# Jh = [h' zeros(1,na1-nd1+1) zeros(1,nb1-nd1+1)
# convmtx(v0,nd1) convmtx(u0,na1-nd1+1) zeros(nd1+nv1-1,nb1-nd1+1)
# convmtx(w0,nd1) zeros(nd1+nw1-1,na1-nd1+1) convmtx(u0,nb1-nd1+1)];
Jh21[:,:] = convmtx(v0,nd1)
Jh31[:,:] = convmtx(w0,nd1)
Jh22[:,:] = convmtx(d0,nv1)
Jh33[:,:] = convmtx(d0,nw1)
Δz = Jh\Δf;
d1 = d0 + Δz[j1];
# v1 = v0 + Δz[nd1+1:nd1+nv1];
# w1 = w0 + Δz[end-nw1+1:end];
v1 = v0 + Δz[j2];
w1 = w0 + Δz[j3];
fh1 = [dot(h,d1); conv(v1,d1); conv(w1,d1)];
Δf1 = f-fh1;
δ1 = norm(Δf1)
if δ1 < δ
d0 = copy(d1); v0 = copy(v1); w0 = copy(w1); δ = copy(δ1); Δf = copy(Δf1);
else
break
end
end
return d0, v0, w0, δ
end
function pollcm(a::Vector{T}, b::Vector{S}; fastgcd::Bool = true, atol::Real = 0, rtol::Real = Base.rtoldefault(float(real(T))), maxnit::Int = 10) where {T <: Number, S <: Number }
# Least common multiple of two polynomials.
# m = pollcm(a,b; atol = 0, rtol, maxnit = 10)
# Compute the least common multiple `m` of two polynomials `a` and `b`. Employ iterative refinement of
# the accuracy of GCD(a,b) if maxnit > 0.
na1 = poldeg1(a) # degree(a) + 1
nb1 = poldeg1(b) # degree(b) + 1
na1 == 0 && (return zeros(T,1))
nb1 == 0 && (return zeros(S,1))
na1 == 1 && (return b)
nb1 == 1 && (return a)
return conv(a, polgcdvw(a, b, atol = atol, rtol = rtol, maxnit = maxnit)[3])
end
function conv(a::Vector{T}, b::Vector{S}) where {T <: Number, S <: Number }
# Convolution of two vectors (or product of two polynomials).
# c = conv(a,b) returns the convolution of vectors a and b.
na1 = length(a)
nb1 = length(b)
R = promote_type(T, S)
c = zeros(R, na1 + nb1 - 1)
for i in 1:na1, j in 1:nb1
@inbounds c[i + j - 1] += a[i] * b[j]
end
return c
end
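A sketch of how the pieces of the collection fit together (a hypothetical session; `using LinearAlgebra` is assumed for `norm`, `dot`, `rank`, and `LAPACK`):

a = [2.0, 3, 1]                           # (x+1)(x+2)
b = [6.0, 5, 1]                           # (x+2)(x+3)
d, v, w, δ = polgcdvw(a, b, maxnit = 10)  # d ∝ x + 2; δ is the residual norm
m = pollcm(a, b)                          # ∝ (x+1)(x+2)(x+3) = [6, 11, 6, 1], up to scale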
I incorporated the above into the following, which tracks the paper's algorithm. It uses some heuristics for the tolerances that work out in test cases up to degree ~20 (e.g., the gcd of p = (x-1)^n (x-2)^n (x-3) and q = (x-1)(x-2)(x-4) is misidentified when n = 12). Can you see if this works for your cases, or offer improvements? Thanks.
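The failing case mentioned above can be set up concretely like this (a hypothetical check against the NGCD module below, building coefficient vectors with the `conv` helper from earlier in the thread):

p = [-3.0, 1]  # (x-3)
for _ in 1:12
    global p = conv(conv(p, [-1.0, 1]), [-2.0, 1])  # multiply by (x-1)(x-2)
end
q = conv(conv([-1.0, 1], [-2.0, 1]), [-4.0, 1])  # (x-1)(x-2)(x-4)
u, v, w, ρ, κ = NGCD.ngcd(p, q)  # exact GCD is (x-1)(x-2); misidentified at n = 12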
module NGCD
using LinearAlgebra
"""
ngcd(ps::Vector{T}, qs::Vector{T}, ϵ=eps(T))
Return u,v,w, ρ, κ where `u⋅v ≈ ps` and `u⋅w ≈ qs`; ρ is the residual error (`|| [u⋅v,u⋅w] - [ps,qs] ||`); and `κ` a numerical gcd condition number estimate.
The numerical GCD problem as defined in [1]:
(5.4) For (p,q), polynomials of degree m, n, and ϵ > 0, a numerical GCD of (p,q) within ϵ is an exact GCD of (p̃,q̃) (polynomials of degree m, n with gcd degree j), where j is the largest value with θⱼ(p,q) < ϵ (θⱼ is the smallest distance from (p,q) to a pair with gcd degree j). These form an equivalence class, gcdₑ.
For any (p̂,q̂) with exact gcd of degree j, there is ϵ > 0 such that gcdₑ(p,q) satisfies
limsup_{(p,q)→(p̂,q̂)} inf{ || (u,v,w) - (û,v̂,ŵ) ||} / || (p,q) - (p̂,q̂) || < κₑ(p,q).
κ is called the numerical GCD condition number.
If an exact gcd exists for (p,q) then this will be in gcdₑ.
The Zeng algorithm proposes a degree for `u` and *if* a triple `(u,v,w)` can be found satisfying the inequality this is returned; otherwise the proposed degree is reduced and the process repeats. If not terminated, at degree 0 a constant gcd is returned.
The initial proposed degree is the first `j` with `j=min(m,n):-1:1` where Sⱼ is believed to have a singular value of 0 (Sⱼ being related to the Sylvester matrix of `ps` and `qs`). Verification of the degree is done using a Gauss-Newton iteration scheme holding the degree of `u` constant.
Reference:
[1] The Numerical Greatest Common Divisor of Univariate Polynomials
by Zhonggang Zeng;
[url](http://homepages.neiu.edu/~zzeng/uvgcd.pdf);
[doi](https://doi.org/10.1090/conm/556/11014)
Note: Based on work by Andreas Varga
"""
function ngcd(ps::Vector{T},
qs::Vector{T},
ϵ = eps(real(T)) # backwards error
) where {T <: AbstractFloat}
m, n = length(ps)-1, length(qs)-1 # degrees
if m < n
out = ngcd(qs, ps, ϵ)
return (u=out.u, v=out.w, w=out.v, ρ=out.ρ, κ=out.κ)
end
# storage
A0 = zeros(T, m+1, 2)
A0[:,1] = ps
A0[end-n:end,2] = qs
# pre-allocate storage for Sylvester matrices, S₁, S₂...
Q = zeros(T, m + n, m + n)
R = zeros(T, m + n, m + n)
Sₓ = hcat(convmtx(ps,1), convmtx(qs, m-n+1))
local x::Vector{T}
## This is a heuristic
ϵ *= (n+m)
ρ′ = 2(n * m) * (1 + norm([ps,qs])) * ϵ
j = n # We count down Sn, S_{n-1}, ..., S₂, S₁
F = qr(Sₓ)
nr, nc = size(Sₓ) # m+1, m-n+2
Q[1:nr, 1:nr] = F.Q
R[1:nc, 1:nc] = F.R
while true
ϵ′ = ϵ * sqrt(1 + m - j)
V = view(R, 1:nc, 1:nc)
flag, σ, x = smallest_singular_value(V, ϵ′)
if (flag == :iszero || abs(σ) <= ϵ′)
u, v, w = initial_uvw(Val(flag), j, ps, qs, x)
flag == :iszero && return (u=u, v=v, w=w, ρ=zero(T), κ=NaN)
# p₁ ≈ p₂ uses max(norm(p₁), norm(p₂)) to compare
# We are comparing [u⋅v; u⋅w] ≈ [p; q]
# so we treat ρ′ as a *relative* tolerance (unlike paper)
# and scale (above) by 2(n+m) * || [p;q] || (number of +, * in products; norm of [p;q])
ρ₁, κ = refine_agcd!(u,v,w, ps, qs, ρ′)
if ρ₁ <= ρ′
return (u=u, v=v, w=w, ρ=ρ₁, κ=κ) # (u,v,w) verified
end
end
# reduce possible degree of u and try again with Sⱼ₋₁
j -= 1
nr += 1
nc += 2
nc > nr && break
extend_QR!(Q,R, nr, nc, A0) # before Q⋅R = Sⱼ, now Q⋅R = Sⱼ₋₁
end
# u is a constant
u, v, w = initial_uvw(Val(:constant), j, ps, qs, x)
ρ₁, κ = refine_agcd!(u,v,w, ps, qs, ϵ*sqrt(m)) # refine_agcd! already returns κ, no further inversion
return (u=u, v=v, w=w, ρ=ρ₁, κ=κ)
end
ngcd(ps::Vector{T}, qs::Vector{S}, ϵ=eps()) where {T, S} = ngcd(promote(float.(ps), float(qs))..., ϵ)
# return guess at smallest singular value and right singular vector, x
# for a right triangular matrix, V
function smallest_singular_value(V::AbstractArray{T,2},
θ = sqrt(eps(real(T)))) where {T}
R = UpperTriangular(V)
if iszero(det(R)) #
return (:iszero, zero(T), T[])
end
m,n = size(R)
x = ones(T, n)
y = zeros(T, m)
σ₀ = σ₁ = Inf*one(real(T))
steps, minᵢ = 1, 5
while true
y .= R' \ x # use iteration to converge on singular value
x .= R \ y
x ./= norm(x,2)
σ₁ = norm(R * x, 2)
if (σ₁ > θ)
if (steps <= 50) && (steps <= minᵢ || σ₁ <= 0.95*σ₀) # decreasing, keep going
σ₀ = σ₁
else
return (:constant, σ₁, x)
end
else
return (:ispossible, σ₁, x)
end
steps += 1
end
end
# extend QR to next size
# Q gets a 1 in nc,nc, 0s should be elsewhere
function extend_QR!(Q,R, nr, nc, A0)
#old Q is m x m, old R is n x n; we add to these
n = nc-2
m = nr - 1
k,l = size(A0)
# add two columns to R
# need to apply Q to top part of new columns
R[nr-k+1:nr, (nc-1):nc] = A0
R[1:nr-1, (nc-1):nc] = (view(Q, 1:nr-1, 1:nr-1))' * R[1:nr-1, (nc-1):nc]
# extend Q with row and column with identity
Q[nr,nr] = 1
# Make R upper triangular using Givens rotations
for j in nr-1:-1:nc-1
gj,_ = givens(R[j,nc-1], R[j+1,nc-1], j, j+1)
rmul!(Q, gj')
lmul!(gj, R)
end
for j in nr-1:-1:nc
gj,_ = givens(R[j,nc], R[j+1,nc], j, j+1)
rmul!(Q, gj')
lmul!(gj, R)
end
return nothing
end
## --------------------------------------------------
## Refine u,v,w
## Find u₀,v₀,w₀ from right singular vector
function initial_uvw(::Val{:ispossible}, j, ps, qs, x)
# Sk*[v;w] = 0, so pick out v,w after applying the permutation
m,n = length(ps)-1, length(qs)-1
vᵢ = vcat(2:m-n+2, m-n+4:2:length(x))
wᵢ = vcat(1, (m-n+3):2:length(x))
v = x[vᵢ]/x[end]
w = x[wᵢ]/x[end-1]
# p194 3.9 C_k(v) u = p or Ck(w) u = q; this uses 10.2
u = [convmtx(v,j+1); convmtx(w, j+1)] \ [ps; qs]
return u, v, w
end
function initial_uvw(::Val{:iszero}, j, ps::Vector{T}, qs, x) where {T}
n,m = length(ps), length(qs)
# exact zero means we need to identify initial guess for u,v,w
u = qs
w = ones(T, 1)
v = convmtx(u, n - m + 1) \ ps
return u,v,w
end
function initial_uvw(::Val{:constant}, j, ps::Vector{T}, qs, x) where {T}
u = ones(T,1)
w = qs
v = ps
u,v,w
end
# find estimate for σ₂, used in a condition number (κ = 1/σ₂)
function σ₂(J)
F = qr(J)
flag, σ, x = smallest_singular_value(F.R)
σ
end
## attempt to refine u,v,w
function refine_agcd!(u::Vector{T}, v, w, p, q, ρ=eps(T), minᵢ=10) where {T}
m, n, l = _degree.((u, v, w))
ρ₀, ρ₁ = one(T), residual_error(p, q, u, v, w)
ρ₁ <= ρ && return ρ₁, NaN
# storage
A = zeros(T, JF_size(u, v, w)...)
b = zeros(T, 1 + length(p) + length(q))
Δf = zeros(T, m + n + l + 3)
steps = 0
u′, v′, w′ = copy(u), copy(v), copy(w)
h, β = u, norm(u)^2
κ = NaN
while true
# update A, b, then solve A\b
JF!(A, h, u, v, w)
Fmp!(b, h, β, p, q, u, v, w)
Δf .= A \ b
Δu, Δv, Δw = Δf[1:(m+1)], Δf[(m+2):(m+n+2)], Δf[(m+n+3):end]
ρ₀, ρ₁ = ρ₁, residual_error(p, q, u-Δu, v-Δv, w-Δw)
if (steps <= 50) && (ρ₁ >= ρ) && (steps <= minᵢ || ρ₁ < 0.05 * ρ₀)
u .-= Δu; v .-= Δv; w .-= Δw
steps += 1
else
F = qr(A)
flag, σ₂, _ = smallest_singular_value(F.R)
κ = 1/σ₂
return ρ₁, κ
end
end
end
## Jacobian J(u,v,w) of F(u,v,w) = [h'u; u*v; u*w] (target values [β; p; q])
function JF_size(u, v, w)
m, k, j = _degree(u), _degree(v), _degree(w)
n, l = m + k, m + j
ai, aj = convmtx_size(v, m + 1)
bi, bj = convmtx_size(u, k + 1)
di, dj = convmtx_size(w, m + 1)
fi, fj = convmtx_size(u, j + 1)
ci, cj = ai, fj
ei, ej = di, bj
(1 + ai + di, aj + bj + cj)
end
# Jacobian matrix
function JF(h, u::Vector{U}, v::Vector{V}, w::Vector{W}) where {U,V,W}
R = promote_type(U, V, W)
M = zeros(R, JF_size(u, v, w)...)
JF!(M, h, u, v, w)
M
end
function JF!(M, h, u::Vector{T}, v, w) where {T}
m, k, j = _degree(u), _degree(v), _degree(w)
n, l = m + k, m + j
# JF_size should return these
ai, aj = convmtx_size(v, m + 1)
bi, bj = convmtx_size(u, k + 1)
di, dj = convmtx_size(w, m + 1)
fi, fj = convmtx_size(u, j + 1)
ci, cj = ai, fj
ei, ej = di, bj
M[1,1:m+1] = h'
convmtx!(view(M, 1 .+ (1:ai), 1:aj), v, m + 1)
convmtx!(view(M, 1 .+ (1:bi), aj .+ (1:bj)), u, k + 1)
convmtx!(view(M, (1 + ai) .+ (1:di), 1:dj), w, m + 1)
convmtx!(view(M, (1 + ci) .+ (1:fi), dj + ej .+ (1:fj)), u, j + 1)
end
## compute F(u,v,w) - [β; p; q] = [h'u - β; u*v - p; u*w - q]
function Fmp!(b, h, β, p, q, u, v, w)
b[1] = h'*u - β
b[2:2+length(p)-1] = u ⊗ v .- p
b[2+length(p):end] = u ⊗ w .- q
end
function residual_error(p, q, u, v, w)
norm( [u ⊗ v; u ⊗ w] .- [p; q], 2)
end
## --------------------------------------------------
## utils
_degree(p::Vector) = length(p) - 1
"""
conv(p,q)
Convolution of two vectors (or product of two polynomials).
Also `⊗`
"""
function conv(p::Vector{T}, q::Vector{T}) where {T}
R = promote_type(T,eltype(q))
n = length(p)-1
m = length(q)-1
a = zeros(R, m+n+1)
@inbounds for i = 1:n+1
for j = 1:m+1
a[i+j-1] += p[i] * q[j]
end
end
a
end
⊗(p::Vector{T}, q::Vector{T}) where {T} = conv(p,q)
"""
convmtx(v, n::Int)
convmtx!(M,v,n)
convmtx_size(v,n)
Convolution matrix.
C = convmtx(v,n) returns the convolution matrix C for a vector v.
If q is a column vector of length n, then C*q is the same as conv(v,q).
"""
function convmtx!(C, v::AbstractVector{T}, n::Int) where T
# Form C as the Toeplitz matrix
# C = Toeplitz([v; zeros(n-1)], [v[1]; zeros(n-1)]); put Toeplitz code inline
nv = length(v)-1
@inbounds for j = 1:n
C[j:j+nv,j] = v
end
nothing
end
convmtx_size(v::AbstractVector, n) = (n + length(v) - 1, n)
function convmtx(v::AbstractVector{T}, n::Int) where {T}
C = zeros(T, convmtx_size(v, n)...)
convmtx!(C, v, n)
C
end
end
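A minimal usage sketch of the module above (the example polynomials and expected output are my illustration, not from the thread; Polynomials.jl is assumed only for building the coefficient vectors):
using Polynomials
x = Polynomial([0.0, 1.0])
p = coeffs((x-1)^2*(x-2)) # gcd with q should be x - 1
q = coeffs((x-1)*(x-3))
out = NGCD.ngcd(p, q)
Polynomial(out.u) # expect a constant multiple of x - 1
out.ρ, out.κ # residual error and condition number estimate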
This is a difficult example, for which an automatic choice of tolerances is very challenging. For a tolerance atol = 1.e-16, the correct result can be obtained with my implementation for the case n = 12.
x = Polynomial([0,1]);
n = 12;
p = coeffs((x-1)^n*(x-2)^n*(x-3));
q = coeffs((x-1)*(x-2)*(x-4));
d = coeffs((x-1)*(x-2)); # exact GCD
d2, = polgcdvw(p,q,atol=1.e-16,maxnit = 10);
roots(Polynomial(d2))
@test norm(d*d2[end] - d2*d[end]) / norm(d) < 1.e-7
julia> roots(Polynomial(d2))
2-element Array{Float64,1}:
1.0
1.9999999999999991
julia> @test norm(d*d2[end] - d2*d[end]) / norm(d) < 1.e-7
Test Passed
The critical computation is the determination of the nullity of the Sylvester matrix. If this computation fails, then I guess all methods will have the same difficulty in determining a correct GCD. For example, for n = 20 the absolute tolerance for rank determination must lie between 4.81e-17 and 2.01e-17, which in my opinion is difficult to choose automatically.
Okay, I think this catches the bugs in the last one and cleans up the interface. You are right that some problems are just hard here due to the tolerances: not only identifying the rank, but also deciding whether the proposed triple is close enough. I separated out the two uses, should someone want to adjust them, and tried to identify reasonable relative tolerances. This version does okay on the test cases in the Zeng paper, except for Test 2, which I'm not sure is set up properly. For these problems the scaling of the polynomials ends up being really important. At one point I had some preconditioning code, but for now scaling is up to the caller.
I think I'll make a PR with this, pending your comments.
module NGCD
using LinearAlgebra
"""
ngcd(ps::Vector{T}, qs::Vector{T}, [k::Int]; atol=eps(T), rtol=eps(T), satol=atol, srtol=rtol)
Return `u`, `v`, `w`, `Θ`, `κ` where `u⋅v ≈ ps` and `u⋅w ≈ qs`; `Θ` (`\\Theta[tab]`) is the residual error (`|| [u⋅v,u⋅w] - [ps,qs] ||`); and `κ` (`\\kappa[tab]`) is the numerical gcd condition number estimate.
The numerical GCD problem is defined in [1] (5.4). Let (p,q) be a
polynomial pair with degrees m and n. Let Ρmn be the set of all such pairs. Any
given pair of polynomials has an exact greatest common divisor, u, of
degree k, defined up to constant factors. Let Ρᵏmn be the manifold of
all such (p,q) pairs with exact gcd of degree k. A given pair (p,q) with exact gcd of degree j will
have some distance Θᵏ from Ρᵏmn. For a given threshold ϵ > 0, a numerical GCD
of (p,q) within ϵ is an exact GCD of a pair (p̂,q̂) in Ρᵏmn with
|| (p,q) - (p̂,q̂) || <= Θᵏ, where k is the largest value for which Θᵏ < ϵ.
(In the ϵ → 0 limit, this is the exact GCD.)
Suppose (p,q) is an ϵ perturbation of (p̂,q̂), where (p̂,q̂) has an exact gcd of degree k. Then degree(gcdₑ(p,q)) = k; as ϵ → 0, gcdₑ(p,q) → gcd(p̂, q̂), and
limsup_{(p,q)→(p̂,q̂)} inf{ || (u,v,w) - (û,v̂,ŵ) ||} / || (p,q) - (p̂,q̂) || < κₑ(p,q).
κ is called the numerical GCD condition number.
The Zeng algorithm proposes a degree for `u` and *if* a triple `(u,v,w)` with `u` of degree `k` and (u⋅v, u⋅w) in Ρᵏmn can be found satisfying || (u⋅v, u⋅w) - (p,q) || < ϵ then (u,v,w) is returned; otherwise the proposed degree is reduced and the process repeats. If not terminated, at degree 0 a constant gcd is returned.
The initial proposed degree is the first `j`, `j=min(m,n):-1:1`, where Sⱼ is believed to have a singular value of 0 (Sⱼ being related to the Sylvester matrix of `ps` and `qs`). The verification of the proposed degree is done using a Gauss-Newton iteration scheme holding the degree of `u` constant.
Tolerances:
There are two places where tolerances are utilized:
* in the identification of the rank of Sⱼ a value σ₁ = ||Rx|| is identified. To test if this is zero a tolerance of `max(satol, ||R||ₒₒ ⋅ srtol)` is used.
* to test if (u ⋅ v, u ⋅ w) ≈ (p,q) a tolerance of `max(atol, λ⋅rtol)` is used with `λ` an estimate for κ⋅||(p,q)||.
Specified degree:
When `k` is specified, a value for `(u,v,w)` is identified with `degree(u)=k`. No tolerances are utilized in computing Θᵏ.
!!! note
These tolerances are adjusted from those proposed in [1].
Output:
The function outputs a named tuple with names (`u`, `v`, `w`, `Θ`, `κ`). The components `u`, `v`, `w` estimate the gcd (`u`) and the corresponding quotients (`v` ≈ ps/u, `w` ≈ qs/u). The value `Θ` estimates Θᵏ and `κ` estimates the numerical condition number.
Example:
using Polynomials
x = variable(Polynomial{Float64})
p = (x+10)*(x^9 + x^8/3 + 1)
q = (x+10)*(x^9 + x^8/7 - 6/7)
gcd(p,q) # u a constant
ngcd(coeffs(p), coeffs(q)) # u a degree 1 polynomial
Reference:
[1] The Numerical Greatest Common Divisor of Univariate Polynomials
by Zhonggang Zeng;
[url](http://homepages.neiu.edu/~zzeng/uvgcd.pdf);
[doi](https://doi.org/10.1090/conm/556/11014)
Note: Based on work by Andreas Varga
"""
function ngcd(ps::Vector{T},
qs::Vector{T};
atol=(length(ps)+length(qs))*eps(real(T)),
rtol=eps(real(T)),
satol=atol,
srtol=rtol,
verbose=false
) where {T <: AbstractFloat}
m, n = _degree.((ps, qs))
if m < n
out = ngcd(qs, ps; atol=atol, rtol=rtol, satol=satol, srtol=srtol, verbose=verbose)
return (u=out.u, v=out.w, w=out.v, Θ=out.Θ, κ=out.κ)
end
# storage
A0 = zeros(T, m+1, 2)
A0[:,1] = ps
A0[end-n:end,2] = qs
# pre-allocate storage for Sylvester Matrices, S₁, S₂...
Q = zeros(T, m + n, m + n)
R = zeros(T, m + n, m + n)
Sₓ = hcat(convmtx(ps,1), convmtx(qs, m-n+1))
local x::Vector{T}
j = n # We count down Sn, S_{n-1}, ..., S₂, S₁
F = qr(Sₓ)
nr, nc = size(Sₓ) # m+1, m-n+2
Q[1:nr, 1:nr] = F.Q
R[1:nc, 1:nc] = F.R
while true
V = view(R, 1:nc, 1:nc)
flag, σ, x = smallest_singular_value(V, satol * sqrt(1 + m - j), srtol)
verbose && println("------ degree $j ----- σ₁: $σ --- $flag")
if (flag == :iszero || flag == :ispossible)
u, v, w = initial_uvw(Val(flag), j, ps, qs, x)
flag, ρ₁, κ = refine_agcd!(u,v,w, ps, qs, atol, rtol)
verbose && println(" --- Θᵏ: $ρ₁ --- $flag")
if flag == :convergence
return (u=u, v=v, w=w, Θ=ρ₁, κ=κ) # (u,v,w) verified
end
end
# reduce possible degree of u and try again with Sⱼ₋₁
j -= 1
nr += 1
nc += 2
nc > nr && break
extend_QR!(Q,R, nr, nc, A0) # before Q⋅R = Sⱼ, now Q⋅R = Sⱼ₋₁
end
# u is a constant
u, v, w = initial_uvw(Val(:constant), j, ps, qs, x)
flag, ρ₁, κ = refine_agcd!(u,v,w, ps, qs, atol, rtol)
return (u=u, v=v, w=w, Θ=ρ₁, κ=κ)
end
function ngcd(ps::Vector{T}, qs::Vector{S}; kwargs...) where {T, S}
ps′,qs′ = promote(float.(ps), float.(qs))
ngcd(ps′, qs′; kwargs...)
end
# fix the degree, k
function ngcd(ps::Vector{T},
qs::Vector{T},
k::Int
;
kwargs...
) where {T <: AbstractFloat}
m, n = _degree.((ps,qs))
if m < n
out = ngcd(qs, ps, k; kwargs...)
return (u=out.u, v=out.w, w=out.v, Θ=out.Θ, κ=out.κ)
end
u,v,w = initial_uvw(Val(:iszero), k, ps, qs, nothing)
flag, ρ₁, κ = refine_agcd!(u,v,w, ps, qs, Inf, Inf)
return (u=u, v=v, w=w, Θ=ρ₁, κ=κ)
end
function ngcd(ps::Vector{T}, qs::Vector{S}, k::Int; kwargs...) where {T, S}
ps′,qs′ = promote(float.(ps), float(qs))
ngcd(ps′, qs′, k; kwargs...)
end
## -----
# return guess at smallest singular value and right singular vector, x,
# for an upper triangular matrix, V
function smallest_singular_value(V::AbstractArray{T,2},
atol=eps(T),
rtol=zero(T)) where {T}
R = UpperTriangular(V)
if iszero(det(R)) #
return (:iszero, zero(T), T[])
end
m,n = size(R)
# we are testing if ||Rx|| ≈ 0
# If the true smallest singular value is 0 but the computed vector x′ differs from the true x due to round off,
# then ||R(x-x′)|| <= ||R||⋅||x - x′||, so we use ||R|| as the scale factor
δ = max(atol, norm(R,Inf) * rtol)
x = ones(T, n)
y = zeros(T, m)
σ₀ = σ₁ = Inf*one(real(T))
steps, minᵢ = 1, 5
while true
y .= R' \ x # use iteration to converge on singular value
x .= R \ y
x ./= norm(x,2)
σ₁ = norm(R * x, 2)
if (steps <= 50) && (steps <= minᵢ || σ₁ <= 0.05*σ₀) # decreasing, keep going
σ₀ = σ₁
else
break
end
steps += 1
end
if σ₁ < δ
return (:ispossible, σ₁, x)
else
return (:constant, σ₁, x)
end
end
# extend QR to next size
# Q gets a 1 in nc,nc, 0s should be elsewhere
function extend_QR!(Q,R, nr, nc, A0)
#old Q is m x m, old R is n x n; we add to these
n = nc-2
m = nr - 1
k,l = size(A0)
# add two columns to R
# need to apply Q to top part of new columns
R[nr-k+1:nr, (nc-1):nc] = A0
R[1:nr-1, (nc-1):nc] = (view(Q, 1:nr-1, 1:nr-1))' * R[1:nr-1, (nc-1):nc]
# extend Q with row and column with identity
Q[nr,nr] = 1
# Make R upper triangular using Givens rotations
for j in nr-1:-1:nc-1
gj,_ = givens(R[j,nc-1], R[j+1,nc-1], j, j+1)
rmul!(Q, gj')
lmul!(gj, R)
end
for j in nr-1:-1:nc
gj,_ = givens(R[j,nc], R[j+1,nc], j, j+1)
rmul!(Q, gj')
lmul!(gj, R)
end
return nothing
end
## --------------------------------------------------
## Refine u,v,w
## Find u₀,v₀,w₀ from right singular vector
function initial_uvw(::Val{:ispossible}, j, ps, qs, x)
# Sk*[w;-v] = 0, so pick out v,w after applying the permutation
m,n = _degree.((ps, qs))
vᵢ = vcat(2:m-n+2, m-n+4:2:length(x))
wᵢ = vcat(1, (m-n+3):2:length(x))
v = -x[vᵢ]
w = x[wᵢ]
# p194 3.9 C_k(v) u = p or Ck(w) u = q; this uses 10.2
u = [convmtx(v,j+1); convmtx(w, j+1)] \ [ps; qs]
return u, v, w
end
function initial_uvw(::Val{:iszero}, j, ps::Vector{T}, qs, x) where {T}
m,n = _degree.((ps,qs))
S = [convmtx(ps, n-j+1) convmtx(qs, m-j+1)]
F = qr(S)
R = UpperTriangular(F.R)
x = ones(T, size(R,2))
x .= R \ (R' \ (x/norm(x)))
x ./= norm(x)
w = x[1:n-j+1]
v = -x[(n-j+2):end]
u = [convmtx(v, j+1); convmtx(w, j+1)] \ [ps;qs]
(u,v,w)
end
function initial_uvw(::Val{:constant}, j, ps::Vector{T}, qs, x) where {T}
u = ones(T,1)
w = qs
v = ps
u,v,w
end
# find estimate for σ₂, used in a condition number (κ = 1/σ)
function σ₂(J)
F = qr(J)
flag, σ, x = smallest_singular_value(F.R)
σ
end
## attempt to refine u,v,w
## check that [u ⊗ v; u ⊗ w] ≈ [p; q]
function refine_agcd!(u::Vector{T}, v, w, p, q, atol, rtol) where {T}
m, n, l = _degree.((u, v, w))
uv = u ⊗ v; uw = u ⊗ w
ρ₀, ρ₁ = one(T), residual_error(p,q,uv,uw)
# storage
A = zeros(T, JF_size(u, v, w)...)
b = zeros(T, 1 + length(p) + length(q))
Δf = zeros(T, m + n + l + 3)
steps = 0
h, β = u, norm(u)^2
minᵢ, Maxᵢ = 5, 20
κ = NaN
JF!(A, h, u, v, w)
Fmp!(b, h'*u - β, p, q, uv, uw)
# sensitivity is Δu / Δp -> || J+ ||
# so we would use cond(A)/norm(A) * ||[p;q]|| as scale factor
# we estimate κ directly here
#λ = norm(A, Inf)
λ = norm((norm(p), norm(q))) / σ₂(A)
ρ = max(atol, rtol*λ)
while true
# update A, b, then solve A\b
Δf .= A \ b
Δu, Δv, Δw = Δf[1:(m+1)], Δf[(m+2):(m+n+2)], Δf[(m+n+3):end]
uv, uw = (u-Δu) ⊗ (v-Δv), (u-Δu) ⊗ (w-Δw)
ρ₀, ρ′ = ρ₁, residual_error(p, q, uv, uw)
# don't worry about the first few steps, but afterwards each step must be productive
if (steps <= Maxᵢ) && (steps <= minᵢ || ρ′ < 0.05 * ρ₀)
ρ₁ = ρ′
u .-= Δu; v .-= Δv; w .-= Δw
steps += 1
else
break
end
# update A,b for next iteration
JF!(A, h, u, v, w)
Fmp!(b, h'*u - β, p, q, uv, uw)
end
if ρ₁ <= ρ
κ = 1/σ₂(A)
return :convergence, ρ₁, κ
else
return :no_convergence, ρ₁, NaN
end
end
## Jacobian J(u,v,w) of F(u,v,w) = [h'u; u*v; u*w] (target values [β; p; q])
function JF_size(u, v, w)
m, k, j = _degree(u), _degree(v), _degree(w)
n, l = m + k, m + j
ai, aj = convmtx_size(v, m + 1)
bi, bj = convmtx_size(u, k + 1)
di, dj = convmtx_size(w, m + 1)
fi, fj = convmtx_size(u, j + 1)
ci, cj = ai, fj
ei, ej = di, bj
(1 + ai + di, aj + bj + cj)
end
# Jacobian matrix
function JF(h, u::Vector{U}, v::Vector{V}, w::Vector{W}) where {U,V,W}
R = promote_type(U, V, W)
M = zeros(R, JF_size(u, v, w)...)
JF!(M, h, u, v, w)
M
end
function JF!(M, h, u::Vector{T}, v, w) where {T}
m, k, j = _degree(u), _degree(v), _degree(w)
n, l = m + k, m + j
# JF_size should return these
ai, aj = convmtx_size(v, m + 1)
bi, bj = convmtx_size(u, k + 1)
di, dj = convmtx_size(w, m + 1)
fi, fj = convmtx_size(u, j + 1)
ci, cj = ai, fj
ei, ej = di, bj
M[1,1:m+1] = h'
convmtx!(view(M, 1 .+ (1:ai), 1:aj), v, m + 1)
convmtx!(view(M, 1 .+ (1:bi), aj .+ (1:bj)), u, k + 1)
convmtx!(view(M, (1 + ai) .+ (1:di), 1:dj), w, m + 1)
convmtx!(view(M, (1 + ci) .+ (1:fi), dj + ej .+ (1:fj)), u, j + 1)
end
## compute F(u,v,w) - [β; p; q] = [h'u - β; u*v - p; u*w - q]
function Fmp!(b, γ, p, q, uv, uw)
b[1] = γ
for i in 2:2+length(p)-1
j = i - 1
b[i] = uv[j] - p[j]
end
for i in 2+length(p):length(b)
j = i - 1 - length(p)
b[i] = uw[j] - q[j]
end
return nothing
end
# norm( [u ⊗ v; u ⊗ w] .- [p; q], 2)
function residual_error(p::Vector{T}, q, uv, uw) where {T}
tot = zero(T)
for (pᵢ, uvᵢ) in zip(p,uv)
tot += (pᵢ-uvᵢ)^2
end
for (qᵢ, uwᵢ) in zip(q, uw)
tot += (qᵢ-uwᵢ)^2
end
sqrt(tot)
end
## --------------------------------------------------
## utils
_degree(p::Vector) = length(p) - 1
"""
conv(p,q)
Convolution of two vectors (or product of two polynomials).
Also `⊗`
"""
function conv(p::Vector{T}, q::Vector{T}) where {T}
R = promote_type(T,eltype(q))
n = length(p)-1
m = length(q)-1
a = zeros(R, m+n+1)
@inbounds for i = 1:n+1
for j = 1:m+1
a[i+j-1] += p[i] * q[j]
end
end
a
end
⊗(p::Vector{T}, q::Vector{T}) where {T} = conv(p,q)
"""
convmtx(v, n::Int)
convmtx!(M,v,n)
convmtx_size(v,n)
Convolution matrix.
C = convmtx(v,n) returns the convolution matrix C for a vector v.
If q is a column vector of length n, then C*q is the same as conv(v,q).
"""
function convmtx!(C, v::AbstractVector{T}, n::Int) where T
# Form C as the Toeplitz matrix
# C = Toeplitz([v; zeros(n-1)], [v[1]; zeros(n-1)]); put Toeplitz code inline
nv = length(v)-1
@inbounds for j = 1:n
C[j:j+nv,j] = v
end
nothing
end
convmtx_size(v::AbstractVector, n) = (n + length(v) - 1, n)
function convmtx(v::AbstractVector{T}, n::Int) where {T}
C = zeros(T, convmtx_size(v, n)...)
convmtx!(C, v, n)
C
end
end
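A sketch of calling the revised interface on the n = 12 example from earlier in the thread (scaling by the norm is my addition here, in line with the remark above that scaling is left to the caller):
using Polynomials, LinearAlgebra
x = Polynomial([0.0, 1.0]); n = 12;
p = coeffs((x-1)^n*(x-2)^n*(x-3));
q = coeffs((x-1)*(x-2)*(x-4));
out = NGCD.ngcd(p/norm(p), q/norm(q); verbose=true);
roots(Polynomial(out.u)) # hoped-for roots: ≈ 1.0 and 2.0
out.Θ, out.κ # residual estimate Θᵏ and condition number estimate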
One issue which could allow a last refinement of computational efficiency is the iterative refinement phase. Here the employed matrix JF has the block structure [* 0 0; * * 0; * 0 *] and corresponds to the targeted solution [u; v; w] and right-hand side Δf = [h'd; v*d; w*d] - [β; p; q]. The matrix JF has only a small number of trailing zeros in its first columns, so a QR-decomposition-based solution benefits only a little from this structure. It would increase the efficiency of solving the equation Jh*Δz = Δf if JF were generated with the alternative block structure [* 0 *; 0 * *; 0 0 *], which corresponds to [v; w; u] and Δf = [v*d; w*d; h'd] - [p; q; β]. The reason for the gain in efficiency is explained here. Finally, to get the expected benefit, instead of Δf .= A \ b you could use Δf .= qr(A) \ b, which exploits the zero structure. I wonder if the gain in efficiency is worth restructuring your implementation.
If helpful, here is the last version of my implementation of the updating phase:
function gcdvwupd(a::Vector{T}, b::Vector{T}, d::Vector{T}, v::Vector{T}, w::Vector{T}; maxnit::Int = 10) where {T <: Number}
# Iterative refinement of the accuracy of the greatest common divisor of two polynomials.
# gcdvwupd(a, b, d, v, w; maxnit = 10) -> (dupd, vupd, wupd, δ)
# Given the polynomials a and b, and a triple of polynomials (d, v, w) such that
# `d` is an approximation of a greatest common divisor of `a` and `b`, and
# v and w are approximate quotients of a/d and b/d, respectively, compute the updated triple
# (dupd, vupd, wupd) which minimizes the accuracy estimate δ = norm([a-dupd*vupd; b-dupd*wupd]).
# A maximum of maxnit iterations is performed using the Gauss-Newton iteration method,
# in the form proposed in [1].
#
# [1] Z. Zeng, The numerical greatest common divisor of univariate polynomials,
# in: L. Gurvits, P. Pébay, J.M. Rojas, D. Thompson (Eds.),
# Randomization, Relaxation, and Complexity in Polynomial Equation Solving,
# Contemporary Mathematics, vol. 556, AMS, 2011, pp. 187–217.
na1 = length(a)
nb1 = length(b)
nd1 = length(d)
nv1 = length(v)
nw1 = length(w)
na1 == nd1+nv1-1 || error("Incompatible dimensions between a, d and v")
nb1 == nd1+nw1-1 || error("Incompatible dimensions between b, d and w")
h = d/norm(d)^2
f = [a;b;one(T)]
fh = [conv(v,d); conv(w,d); dot(h,d)]
Δf = f-fh;
δ = norm(Δf)
d0 = copy(d); v0 = copy(v); w0 = copy(w);
nvw1 = nv1+nw1
ndvw1 = nd1+nvw1
nab1 = na1+nb1
Jh = [zeros(nab1,ndvw1); zeros(1,nvw1) h' ]
j1 = 1:nv1
j2 = nv1+1:nvw1
j3 = nvw1+1:ndvw1
i1 = 1:nv1+nd1-1
i2 = nv1+nd1:nvw1+2*nd1-2
Jh11 = view(Jh,i1,j1)
Jh22 = view(Jh,i2,j2)
Jh13 = view(Jh,i1,j3)
Jh23 = view(Jh,i2,j3)
for it = 1:maxnit
# Jh = [ convmtx(d0,nv1) 0 convmtx(v0,nd1) ]
# [ 0 convmtx(d0,nw1) convmtx(w0,nd1) ]
# [ 0 0 h' ]
Jh11[:,:] = convmtx(d0,nv1)
Jh22[:,:] = convmtx(d0,nw1)
Jh13[:,:] = convmtx(v0,nd1)
Jh23[:,:] = convmtx(w0,nd1)
# Δz = Jh\Δf; # this does not exploit the zero structure
# Δz = qrsolve!(copy(Jh),copy(Δf)); # this is probably the fastest
Δz = qr(Jh)\Δf; # this is somewhat slower, but still does the job
v1 = v0 + Δz[j1];
w1 = w0 + Δz[j2];
d1 = d0 + Δz[j3];
fh1 = [conv(v1,d1); conv(w1,d1); dot(h,d1)];
Δf1 = f-fh1;
δ1 = norm(Δf1)
if δ1 < δ
d0 = copy(d1); v0 = copy(v1); w0 = copy(w1); δ = copy(δ1); Δf = copy(Δf1);
else
break
end
end
return d0, v0, w0, δ
end
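A sketch of driving this refinement step with a deliberately perturbed initial triple (the polynomials and perturbation are my illustration; conv and convmtx as defined earlier, plus norm and dot from LinearAlgebra, are assumed to be in scope):
using Polynomials, LinearAlgebra
x = Polynomial([0.0, 1.0]);
a = coeffs((x-1)^2*(x-2)); b = coeffs((x-1)*(x-3));
d = coeffs(x - 1) .+ 1e-8; # perturbed guess; the exact gcd is x - 1
v = coeffs((x-1)*(x-2)); w = coeffs(x - 3);
dupd, vupd, wupd, δ = gcdvwupd(a, b, d, v, w; maxnit = 10);
δ # accuracy estimate after refinement; should be smaller than for the initial triple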
Somewhat faster is to use the line commented out, which calls the following function:
function qrsolve!(A::AbstractMatrix{T},b::AbstractVector{T}) where {T}
# Fast least-squares solver for full column rank Hessenberg-like matrices
Base.require_one_based_indexing(A)
m, n = size(A)
m < n && error("Column dimension exceeds row dimension")
_, τ = LinearAlgebra.LAPACK.geqrf!(A)
T <: Complex ? tran = 'C' : tran = 'T'
LinearAlgebra.LAPACK.ormqr!('L',tran,A,τ,view(b,:,1:1))
return UpperTriangular(triu(A[1:n,:]))\b[1:n]
end
Thanks! Let me make that adjustment. Hopefully, I'll have the PR ready by today.
A small correction of the timings: the qr(A)\b form is faster for full matrices (because it uses a blocked algorithm), but not for triangular and Hessenberg-like matrices:
n = 1000; m = 700;
A = rand(n,m); A10 = triu(A,-10); At = triu(A); b = rand(n);
@time qr(A)\b;
@time qr(At)\b;
@time qr(A10)\b;
Timings:
julia> @time qr(A)\b;
0.070456 seconds (18 allocations: 9.483 MiB)
julia> @time qr(At)\b;
0.064726 seconds (18 allocations: 9.483 MiB)
julia> @time qr(A10)\b;
0.070714 seconds (18 allocations: 9.483 MiB)
The qrsolve!-based variant is faster for triangular and Hessenberg-like shapes:
n = 1000; m = 700;
A = rand(n,m); A10 = triu(A,-10); At = triu(A); b = rand(n);
@time qrsolve!(A,b);
@time qrsolve!(At,b);
@time qrsolve!(A10,b);
Timings:
julia> @time qrsolve!(A,b);
0.094381 seconds (18 allocations: 7.697 MiB)
julia> @time qrsolve!(At,b);
0.028213 seconds (18 allocations: 7.697 MiB)
julia> @time qrsolve!(A10,b);
0.028316 seconds (18 allocations: 7.697 MiB)
The A\b form is the slowest:
n = 1000; m = 700;
A = rand(n,m); A10 = triu(A,-10); At = triu(A); b = rand(n);
@time A\b;
@time At\b;
@time A10\b;
Timings:
julia> @time A\b;
0.147010 seconds (5.65 k allocations: 13.607 MiB, 6.20% gc time)
julia> @time At\b;
0.129677 seconds (5.57 k allocations: 13.593 MiB)
julia> @time A10\b;
0.130805 seconds (5.65 k allocations: 13.762 MiB)
Therefore, the only remaining option is to use the qrsolve!-based computation.
Just for fun, I played with your example for larger values of n. This is what I got for n = 50, using polynomial long division as a first step, which in this case provides the exact solution (without any tolerance-based decision):
x = Polynomial([0,1]);
n = 50;
p1 = (x-1)^n*(x-2)^n*(x-3);
q1 = (x-1)*(x-2)*(x-4);
p = coeffs(p1); q = coeffs(q1);
d1, r1 = divrem(p1/norm(p),q1/norm(q));
roots(Polynomial(r1[0:2]))
The almost exact result is contained in r1[0:2]:
julia> roots(Polynomial(r1[0:2]))
2-element Array{Float64,1}:
0.9999999999999996
2.000000000000001
I admit I was surprised to see
julia> r1[0:2]
3-element Array{Float64,1}:
-1.4396085506034197e51
2.1594128259051298e51
-7.198042753017099e50
and also to see a very good backward error:
norm(coeffs(p1/norm(p)-d1*(q1/norm(q))-r1))/norm(r1[0:2])
9.95555419410239e-19
So, I wonder whether, in the case when there is a large gap between the degrees of the two polynomials, it would be a good idea to perform such a first step and then work as usual with much smaller dimensions. In such cases, it would make sense to focus on obtaining (somehow) the remainder with high accuracy (this computation is, according to Zeng, ill-conditioned).
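In other words, since p = d*q + r implies gcd(p, q) = gcd(q, r), one long-division step would replace (p, q) by the much smaller pair (q, r). A rough sketch of such a preprocessing step (the function name and the routing to ngcd are hypothetical):
using Polynomials, LinearAlgebra
# hypothetical: one long-division step before the numerical gcd,
# exploiting gcd(p, q) = gcd(q, r) for p = d*q + r
function ngcd_reduced(p1::Polynomial, q1::Polynomial; kwargs...)
d1, r1 = divrem(p1/norm(coeffs(p1)), q1/norm(coeffs(q1)))
NGCD.ngcd(coeffs(q1), coeffs(r1); kwargs...) # degrees are now at most degree(q1)
end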
Interesting idea. I was just about to push a PR. I have added it in, but some suggestions on a heuristic for when to call it are needed. (Also a name; I just added a prime for now.) I was ready to just let that example go, as other cases seemed to be performing well. In the PR I scale by the norm by default. That seems to give much better success.
Hi @andreasvarga - You can work with this PR or just copy and paste the ngcd.jl file to play with this. As usual, any feedback is most welcome.
The following randomly generated example fails to produce a correct GCD:
I wonder if adding atol and rtol as keyword parameters to gcd would be an option to enhance this function.
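Something along these lines, perhaps; this is a hypothetical sketch only (the routing through ngcd is one possible choice, not the current gcd method):
using Polynomials
# hypothetical keyword interface for gcd, not the current method in Polynomials
function gcd(p::Polynomial{T}, q::Polynomial{T};
atol::Real = zero(real(T)),
rtol::Real = Base.rtoldefault(real(T))) where {T}
out = NGCD.ngcd(coeffs(p), coeffs(q); atol = atol, rtol = rtol)
Polynomial(out.u)
end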