JuliaDiff / SparseDiffTools.jl

Fast jacobian computation through sparsity exploitation and matrix coloring
MIT License

Wrong result w.r.t. ForwardDiff #278

Closed · rveltz closed this issue 6 months ago

rveltz commented 6 months ago

I had to cook up an MWE from a finite volume code.

Why doesn't it give the same result?

using SparseDiffTools, ForwardDiff, SparseArrays, LinearAlgebra

function foo!(out, x)
    N = length(x)
    z = zero(eltype(x))
    for i in 1:N
        # forward/backward neighbours, with zero boundary values
        yp = (i < N) ? x[i+1] : z
        ym = (i > 1) ? x[i-1] : z
        d = yp - ym
        if d < 0            # limiter-like branch on the sign of the difference
            d = 2yp - ym + 0.1
        end
        out[i] = max(yp - ym, 0) + d
    end
    out
end

g0 = rand(10)
n = length(g0)
L = ForwardDiff.jacobian( x->foo!(similar(x),x), g0)
L1 = spdiagm(1=>ones(n-1), -1=>ones(n-1))
colors = matrix_colors(L1)
Ld = forwarddiff_color_jacobian!(L1, foo!, zeros(n); colorvec = colors)
norm(L-Ld,Inf)
julia> L
10×10 Matrix{Float64}:
  0.0   2.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0  0.0
 -1.0   0.0   2.0   0.0   0.0   0.0   0.0   0.0   0.0  0.0
  0.0  -1.0   0.0   2.0   0.0   0.0   0.0   0.0   0.0  0.0
  0.0   0.0  -1.0   0.0   2.0   0.0   0.0   0.0   0.0  0.0
  0.0   0.0   0.0  -2.0   0.0   2.0   0.0   0.0   0.0  0.0
  0.0   0.0   0.0   0.0  -1.0   0.0   2.0   0.0   0.0  0.0
  0.0   0.0   0.0   0.0   0.0  -1.0   0.0   2.0   0.0  0.0
  0.0   0.0   0.0   0.0   0.0   0.0  -1.0   0.0   2.0  0.0
  0.0   0.0   0.0   0.0   0.0   0.0   0.0  -2.0   0.0  2.0
  0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0  -1.0  0.0
julia> Ld
10×10 SparseMatrixCSC{Float64, Int64} with 18 stored entries:
   ⋅    1.0    ⋅     ⋅     ⋅     ⋅     ⋅     ⋅     ⋅    ⋅ 
 -1.0    ⋅    1.0    ⋅     ⋅     ⋅     ⋅     ⋅     ⋅    ⋅ 
   ⋅   -1.0    ⋅    1.0    ⋅     ⋅     ⋅     ⋅     ⋅    ⋅ 
   ⋅     ⋅   -1.0    ⋅    1.0    ⋅     ⋅     ⋅     ⋅    ⋅ 
   ⋅     ⋅     ⋅   -1.0    ⋅    1.0    ⋅     ⋅     ⋅    ⋅ 
   ⋅     ⋅     ⋅     ⋅   -1.0    ⋅    1.0    ⋅     ⋅    ⋅ 
   ⋅     ⋅     ⋅     ⋅     ⋅   -1.0    ⋅    1.0    ⋅    ⋅ 
   ⋅     ⋅     ⋅     ⋅     ⋅     ⋅   -1.0    ⋅    1.0   ⋅ 
   ⋅     ⋅     ⋅     ⋅     ⋅     ⋅     ⋅   -1.0    ⋅   1.0
   ⋅     ⋅     ⋅     ⋅     ⋅     ⋅     ⋅     ⋅   -1.0   ⋅ 
ChrisRackauckas commented 6 months ago

This is effectively the same classic AD issue as differentiating the identity function written as f(x) = x == 0 ? 0 : x. In both theory and practice, AD can be incorrect on the measure-zero set where the branching occurs, so it's something one has to be careful of. With sparse AD you need to be even a bit more careful about what happens around zero: when the coloring combines perturbations, it changes what is actually computed, and here the input has exact zeros right at the floating-point value where the branch flips.
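As an illustration (a minimal sketch, not code from this thread), here is that branchy identity under ForwardDiff. At the branch point the derivative of the zero branch wins, even though the true derivative of the identity is 1 everywhere:

using ForwardDiff

f(x) = x == 0 ? zero(x) : x    # the identity, written with a branch

ForwardDiff.derivative(f, 1.0) # 1.0, correct away from the branch
ForwardDiff.derivative(f, 0.0) # 0.0, wrong: the comparison sees only the
                               # primal value, so the zero branch is taken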

rveltz commented 6 months ago

Wow. Does that mean I cannot differentiate the FVM code? The foo! function is very close to what you would find in a second-order FVM code with a (minmod) limiter.

ChrisRackauckas commented 6 months ago

No, you just need to be careful about the definition at 0. I already shared the fixed version:


using SparseDiffTools, ForwardDiff, SparseArrays, LinearAlgebra

function foo!(out,x)
    N = length(x)
    z = zero(eltype(x))
    for i in 1:N
        yp = (i<N) ? x[i+1] : z
        ym = (i>1) ? x[i-1] : z
        d = yp - ym
        if d <= 0   # <= rather than <: pins down the definition at d == 0
            d = 2yp - ym + 0.1
        end
        out[i] = max(yp - ym, 0) + d
    end
    out
end

g0 = rand(10)
n = length(g0)
L = ForwardDiff.jacobian( x->foo!(similar(x),x), g0)
L1 = spdiagm(1=>ones(n-1), -1=>ones(n-1))
colors = matrix_colors(L1)
Ld = forwarddiff_color_jacobian!(L1, foo!, zeros(n); colorvec = colors)
norm(L-Ld,Inf)
rveltz commented 6 months ago

This does not work for me.

ChrisRackauckas commented 6 months ago

Sorry, wrong one:

using SparseDiffTools, ForwardDiff, SparseArrays, LinearAlgebra

function foo!(out,x)
    N = length(x)
    z = zero(eltype(x))
    for i in 1:N
        yp = (i<N) ? x[i+1] : z
        ym = (i>1) ? x[i-1] : z
        d = yp - ym
        if d<0
            d = 2yp - ym + 0.1
        end
        out[i] = max(yp - ym, 0) + d
    end
    out
end

g0 = rand(10)
n = length(g0)
L = ForwardDiff.jacobian( x->foo!(similar(x),x), g0)
L1 = spdiagm(1=>ones(n-1), -1=>ones(n-1))
colors = matrix_colors(L1)
Ld = forwarddiff_color_jacobian!(L1, foo!, g0)  # evaluated at g0 this time, not zeros(n)
norm(L-Ld,Inf)

The other thing was that you need to calculate the derivative at g0 in both cases, not at g0 in one and at the zero vector in the other. When you do that, it actually works out fine in all cases.
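For completeness, a quick check (a sketch reusing foo!, g0, L1, and colors from the first post, with the coloring reinstated): once both Jacobians are evaluated at the same generic point, they agree.

# At a generic point such as g0 = rand(10), no difference yp - ym lands
# exactly on the d < 0 branch boundary, so the compressed (colored) and
# dense Jacobians agree.
L  = ForwardDiff.jacobian(x -> foo!(similar(x), x), g0)
Ld = forwarddiff_color_jacobian!(copy(L1), foo!, g0; colorvec = colors)
norm(L - Ld, Inf)   # ≈ 0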