JuliaDecisionFocusedLearning / ImplicitDifferentiation.jl

Automatic differentiation of implicit functions
https://juliadecisionfocusedlearning.github.io/ImplicitDifferentiation.jl/
MIT License
122 stars 6 forks source link

Support for sparse arrays #21

Closed gdalle closed 6 months ago

gdalle commented 2 years ago

At the moment, sparse arrays would be densified when turned to vectors. It would be nice to be agnostic to the array type ping @mfherbst

gdalle commented 1 year ago

Looks like it could be done with KrylovKit.jl, perhaps at the price of efficiency (no in-place version of the solver)

gdalle commented 1 year ago

Since we have no in-place differentiation (#48) and it does not look likely to happen soon, maybe it would make sense to switch to KrylovKit.jl for this reason? @mohamed82008

mohamed82008 commented 1 year ago

Let's think some more about getting AD.jl to work. We need API suggestions and a PR. I am not opposed to the idea in principle but just need to be convinced that this can give us significant performance gains.

gdalle commented 1 year ago

Ok, labelling this as low priority

mohamed82008 commented 1 year ago

I suggest we close this and open issues for specific array types, e.g sparse arrays when a use case arises.

gdalle commented 1 year ago

I think @thorek1 had a use case

thorek1 commented 1 year ago

indeed, and I solved it in a tailored rewrite of the ImplicitFunctions.jl way of doing it (see lines 3000-3100 here). the problem with ForwarDiff.jl and SparseArrays is that value.(SparseVectorA) and partials.(SparseVectorA) return wrong results (I didn't test sparse matrices). you need to reimplement those two and refactor some code downstream to take advantage of the sparsity.

gdalle commented 1 year ago

I don't think we're gonna go that far for sparse array support

mohamed82008 commented 1 year ago

I think we need a more in-depth analysis of where exactly sparsity gets lost in ImplicitDifferentiation with an example. I suspect if sparsity is lost, it's mostly due to upstream packages, e.g ChainRules, not caring about sparsity. See the discussion in https://discourse.julialang.org/t/zygote-jl-how-to-get-the-gradient-of-sparse-matrix/59067. As of now, this issue is not actionable because there is no MWE, it's not specific to one array type and we don't know if we need to fix ImplicitDifferentiation or an upstream package to close this issue.

gdalle commented 1 year ago

@thorek1 following up on sparsity: if the AD backends destroy it, I don't think it's our job to fix it. But I'm happy to entertain a debate

thorek1 commented 1 year ago

I don't think that's the case really. As in ForwardDiff with sparse matrices works but the way they are handled within ImplicitDifferentiation doesn't seem to work. I would lean towards improving on the way how it is handled internally to get it to work

gdalle commented 1 year ago

Here's a very very failing example:

Setup

julia> using ForwardDiff

julia> using ImplicitDifferentiation

julia> using Zygote

julia> using SparseArrays

julia> forward(x) = sqrt.(x);

julia> conditions(x, y) = abs2.(y) .- x;

julia> implicit = ImplicitFunction(forward, conditions);

Sparse vector

julia> x = sprand(5, 0.3)
5-element SparseVector{Float64, Int64} with 2 stored entries:
  [4]  =  0.778476
  [5]  =  0.501691

julia> implicit(x)
5-element SparseVector{Float64, Int64} with 2 stored entries:
  [4]  =  0.882313
  [5]  =  0.708302

julia> Zygote.jacobian(implicit, x)[1]  # NaNs where there should be zeros
┌ Warning: IterativeLinearSolver failed, result contains NaNs
└ @ ImplicitDifferentiation ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/src/linear_solver.jl:32
...
5×5 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
  ⋅    ⋅    ⋅   NaN         NaN
  ⋅    ⋅    ⋅   NaN         NaN
  ⋅    ⋅    ⋅   NaN         NaN
  ⋅    ⋅    ⋅     0.566692     ⋅ 
  ⋅    ⋅    ⋅      ⋅          0.705914

julia> ForwardDiff.jacobian(implicit, x)  # zeros everywhere
5×5 Matrix{Float64}:
 -0.0  -0.0  -0.0  -0.0  -0.0
 -0.0  -0.0  -0.0  -0.0  -0.0
 -0.0  -0.0  -0.0  -0.0  -0.0
 -0.0  -0.0  -0.0  -0.0  -0.0
 -0.0  -0.0  -0.0  -0.0  -0.0

Sparse matrix

julia> X = sprand(5, 5, 0.3)
5×5 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
 0.43017   0.100286   ⋅        0.436475   ⋅ 
  ⋅         ⋅        0.880371   ⋅         ⋅ 
 0.129674   ⋅         ⋅         ⋅         ⋅ 
  ⋅         ⋅        0.585982   ⋅        0.990759
  ⋅         ⋅        0.66023    ⋅         ⋅ 

julia> implicit(X)
5×5 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
 0.655873  0.31668   ⋅        0.660662   ⋅ 
  ⋅         ⋅       0.938281   ⋅         ⋅ 
 0.360102   ⋅        ⋅         ⋅         ⋅ 
  ⋅         ⋅       0.765495   ⋅        0.995369
  ⋅         ⋅       0.812545   ⋅         ⋅ 

julia> Zygote.jacobian(implicit, X)[1]  # way too dense
┌ Warning: IterativeLinearSolver failed, result contains NaNs
└ @ ImplicitDifferentiation ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/src/linear_solver.jl:32
...
25×25 SparseMatrixCSC{Float64, Int64} with 569 stored entries:
⎡⣻⣾⣗⣿⣿⣗⣗⣒⣿⣿⣿⣗⡇⎤
⎢⣽⣽⣿⣿⣿⣯⣯⣭⣿⣿⣿⣯⡇⎥
⎢⢿⢿⡿⣿⣿⣿⡿⠿⣿⣿⣿⡿⡇⎥
⎢⢹⢹⡏⣿⣿⡏⡟⢍⣿⣿⣿⡏⡇⎥
⎢⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡇⎥
⎢⢿⢿⡿⣿⣿⡿⡿⠿⣿⣿⣿⣿⡇⎥
⎣⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠁⎦

julia> ForwardDiff.jacobian(implicit, X)  # error
ERROR: MethodError: no method matching Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}(::UndefInitializer, ::Int64)

Closest candidates are:
  (::Type{Base.ReshapedArray{T, N, P, MI}} where {T, N, P<:AbstractArray, MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}})(::Any, ::Any, ::Any)
   @ Base reshapedarray.jl:6

Stacktrace:
  [1] Krylov.GmresSolver(m::Int64, n::Int64, memory::Int64, S::Type{Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}})
    @ Krylov ~/.julia/packages/Krylov/Iuavd/src/krylov_solvers.jl:1549
  [2] Krylov.GmresSolver(A::LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, b::Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, memory::Int64)
    @ Krylov ~/.julia/packages/Krylov/Iuavd/src/krylov_solvers.jl:1567
  [3] gmres(A::LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, b::Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}; memory::Int64, M::LinearAlgebra.UniformScaling{Bool}, N::LinearAlgebra.UniformScaling{Bool}, ldiv::Bool, restart::Bool, reorthogonalization::Bool, atol::Float64, rtol::Float64, itmax::Int64, timemax::Float64, verbose::Int64, history::Bool, callback::Krylov.var"#164#171", iostream::Core.CoreSTDOUT)
    @ Krylov ~/.julia/packages/Krylov/Iuavd/src/gmres.jl:121
  [4] gmres(A::LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, b::Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}})
    @ Krylov ~/.julia/packages/Krylov/Iuavd/src/gmres.jl:119
  [5] solve(#unused#::IterativeLinearSolver, A::LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, b::Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}})
    @ ImplicitDifferentiation ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/src/linear_solver.jl:27
  [6] (::ImplicitDifferentiationForwardDiffExt.var"#2#5"{Float64, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}, LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#5#7"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, SparseMatrixCSC{Float64, Int64}})(k::Int64)
    @ ImplicitDifferentiationForwardDiffExt ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/ext/ImplicitDifferentiationForwardDiffExt.jl:39
  [7] macro expansion
    @ ./ntuple.jl:72 [inlined]
  [8] ntuple
    @ ./ntuple.jl:69 [inlined]
  [9] (::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing})(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ ImplicitDifferentiationForwardDiffExt ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/ext/ImplicitDifferentiationForwardDiffExt.jl:35
 [10] (::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing})(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64})
    @ ImplicitDifferentiationForwardDiffExt ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/ext/ImplicitDifferentiationForwardDiffExt.jl:25
 [11] chunk_mode_jacobian(f::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, x::SparseMatrixCSC{Float64, Int64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9, SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/jacobian.jl:183
 [12] jacobian(f::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, x::SparseMatrixCSC{Float64, Int64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9, SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}}, ::Val{true})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/jacobian.jl:23
 [13] jacobian(f::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, x::SparseMatrixCSC{Float64, Int64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9, SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/jacobian.jl:19
 [14] jacobian(f::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, x::SparseMatrixCSC{Float64, Int64})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/jacobian.jl:19
 [15] top-level scope
    @ ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/test/playground.jl:22

Correct results

julia> vecsqrt(x) = sqrt.(x);

julia> Zygote.jacobian(vecsqrt, x)[1]
5×5 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
  ⋅    ⋅    ⋅    ⋅         ⋅ 
  ⋅    ⋅    ⋅    ⋅         ⋅ 
  ⋅    ⋅    ⋅    ⋅         ⋅ 
  ⋅    ⋅    ⋅   0.566692   ⋅ 
  ⋅    ⋅    ⋅    ⋅        0.705914

julia> ForwardDiff.jacobian(vecsqrt, x)
5×5 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
  ⋅    ⋅    ⋅    ⋅         ⋅ 
  ⋅    ⋅    ⋅    ⋅         ⋅ 
  ⋅    ⋅    ⋅    ⋅         ⋅ 
  ⋅    ⋅    ⋅   0.566692   ⋅ 
  ⋅    ⋅    ⋅    ⋅        0.705914

julia> Zygote.jacobian(vecsqrt, X)[1]
25×25 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
⎡⠁⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
⎢⠀⠀⠐⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⢀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠐⢄⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⎥
⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎦

julia> ForwardDiff.jacobian(vecsqrt, X)
25×25 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
⎡⠁⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
⎢⠀⠀⠐⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⢀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠐⢄⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⎥
⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎦
thorek1 commented 1 year ago

Exactly!

The workaround I used to get it to work in the sparse vector + ForwardDiff case was to extract the values and partials in a different way:


function separate_values_and_partials_from_sparsevec_dual(V::SparseVector{ℱ.Dual{Z,S,N}}; tol::AbstractFloat = eps()) where {Z,S,N}
    nrows = length(V)
    ncols = length(V.nzval[1].partials)

    rows = Int[]
    cols = Int[]

    prtls = Float64[]

    for (i,v) in enumerate(V.nzind)
        for (k,w) in enumerate(V.nzval[i].partials)
            if abs(w) > tol
                push!(rows,v)
                push!(cols,k)
                push!(prtls,w)
            end
        end
    end

    vvals = sparsevec(V.nzind,[i.value for i in V.nzval],nrows)
    ps = sparse(rows,cols,prtls,nrows,ncols)

    return vvals, ps
end
gdalle commented 1 year ago

I tried to play around to at least fix the broken results even if they're dense, but I ran into https://github.com/JuliaDiff/ForwardDiff.jl/issues/658

If you have any ideas...

thorek1 commented 1 year ago

I don't think that's relevant in the ImplicitDiff case. As in when I tested it against FiniteDifferences it worked out fine. But I might be overseeing something here (head is stuck on another problem right now).

gdalle commented 1 year ago

It is relevant because the conditions are zero at optimality, so the sparse arrays cancel each other and we get wrong derivatives when constructing the operators A and B

gdalle commented 1 year ago

It's the reason why we get zeros everywhere with ForwardDiff in the sparse vector example above

thorek1 commented 1 year ago

kk i see (now with a bit more brainpower :) ). then there might be a separate issue related to partials not being extracted properly in the implicitdiff code

gdalle commented 1 year ago

Note to self: sparse arrays throw byproduct-related pullback errors while other arrays do not See https://github.com/JuliaDiff/ChainRules.jl/issues/731

julia> x = sparse(rand(Float32, 2, 3));

julia> forward(x) = sqrt.(x), 2;

julia> conditions(x, y, z) = abs.(y) .^ z .- x;

julia> implicit = ImplicitFunction(forward, conditions);

julia> Zygote.pullback(first ∘ implicit, x)
ERROR: MethodError: no method matching zero(::Tuple{Float32, Zygote.ZBack{ChainRules.var"#power_pullback#1338"{Float32, Int64, ProjectTo{Float64, @NamedTuple{}}, ProjectTo{Float32, @NamedTuple{}}, Float32}}})
gdalle commented 1 year ago

So I think ForwardDiff is never gonna work out of the box unless we copy @thorek1's rather tedious reshaping trick, but Zygote now works and returns sparse Jacobians as of #114

thorek1 commented 1 year ago

Great stuff that zygote works. That would be the most efficient backend in the sparse part of my code anyway.

@forward diff: at least you get a chance to build a sparse pipeline with forwarddiff and implicitdiff.

For lack of less involved alternatives at the moment I am going for analytical implicit diff solutions in the sparse part but am happy to switch if possible (and fast enough).

gdalle commented 6 months ago

The approach in v0.6.0 following #135 is to

This saves many headaches and follows a lengthy Discourse thread where we found out that Enzyme does it just fine

https://discourse.julialang.org/t/how-do-you-speed-up-the-linear-sparse-solver-in-zygote/111801/24?u=gdalle