Closed gdalle closed 6 months ago
Looks like it could be done with KrylovKit.jl, perhaps at the price of efficiency (no in-place version of the solver)
Since we have no in-place differentiation (#48) and it does not look likely to happen soon, maybe it would make sense to switch to KrylovKit.jl for this reason? @mohamed82008
Let's think some more about getting AD.jl to work. We need API suggestions and a PR. I am not opposed to the idea in principle but just need to be convinced that this can give us significant performance gains.
Ok, labelling this as low priority
I suggest we close this and open issues for specific array types, e.g sparse arrays when a use case arises.
I think @thorek1 had a use case
indeed, and I solved it in a tailored rewrite of the ImplicitFunctions.jl way of doing it (see lines 3000-3100 here). the problem with ForwarDiff.jl and SparseArrays is that value.(SparseVectorA) and partials.(SparseVectorA) return wrong results (I didn't test sparse matrices). you need to reimplement those two and refactor some code downstream to take advantage of the sparsity.
I don't think we're gonna go that far for sparse array support
I think we need a more in-depth analysis of where exactly sparsity gets lost in ImplicitDifferentiation with an example. I suspect if sparsity is lost, it's mostly due to upstream packages, e.g ChainRules, not caring about sparsity. See the discussion in https://discourse.julialang.org/t/zygote-jl-how-to-get-the-gradient-of-sparse-matrix/59067. As of now, this issue is not actionable because there is no MWE, it's not specific to one array type and we don't know if we need to fix ImplicitDifferentiation or an upstream package to close this issue.
@thorek1 following up on sparsity: if the AD backends destroy it, I don't think it's our job to fix it. But I'm happy to entertain a debate
I don't think that's the case really. As in ForwardDiff with sparse matrices works but the way they are handled within ImplicitDifferentiation doesn't seem to work. I would lean towards improving on the way how it is handled internally to get it to work
Here's a very very failing example:
Setup
julia> using ForwardDiff
julia> using ImplicitDifferentiation
julia> using Zygote
julia> using SparseArrays
julia> forward(x) = sqrt.(x);
julia> conditions(x, y) = abs2.(y) .- x;
julia> implicit = ImplicitFunction(forward, conditions);
Sparse vector
julia> x = sprand(5, 0.3)
5-element SparseVector{Float64, Int64} with 2 stored entries:
[4] = 0.778476
[5] = 0.501691
julia> implicit(x)
5-element SparseVector{Float64, Int64} with 2 stored entries:
[4] = 0.882313
[5] = 0.708302
julia> Zygote.jacobian(implicit, x)[1] # NaNs where there should be zeros
┌ Warning: IterativeLinearSolver failed, result contains NaNs
└ @ ImplicitDifferentiation ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/src/linear_solver.jl:32
...
5×5 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
⋅ ⋅ ⋅ NaN NaN
⋅ ⋅ ⋅ NaN NaN
⋅ ⋅ ⋅ NaN NaN
⋅ ⋅ ⋅ 0.566692 ⋅
⋅ ⋅ ⋅ ⋅ 0.705914
julia> ForwardDiff.jacobian(implicit, x) # zeros everywhere
5×5 Matrix{Float64}:
-0.0 -0.0 -0.0 -0.0 -0.0
-0.0 -0.0 -0.0 -0.0 -0.0
-0.0 -0.0 -0.0 -0.0 -0.0
-0.0 -0.0 -0.0 -0.0 -0.0
-0.0 -0.0 -0.0 -0.0 -0.0
Sparse matrix
julia> X = sprand(5, 5, 0.3)
5×5 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
0.43017 0.100286 ⋅ 0.436475 ⋅
⋅ ⋅ 0.880371 ⋅ ⋅
0.129674 ⋅ ⋅ ⋅ ⋅
⋅ ⋅ 0.585982 ⋅ 0.990759
⋅ ⋅ 0.66023 ⋅ ⋅
julia> implicit(X)
5×5 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
0.655873 0.31668 ⋅ 0.660662 ⋅
⋅ ⋅ 0.938281 ⋅ ⋅
0.360102 ⋅ ⋅ ⋅ ⋅
⋅ ⋅ 0.765495 ⋅ 0.995369
⋅ ⋅ 0.812545 ⋅ ⋅
julia> Zygote.jacobian(implicit, X)[1] # way too dense
┌ Warning: IterativeLinearSolver failed, result contains NaNs
└ @ ImplicitDifferentiation ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/src/linear_solver.jl:32
...
25×25 SparseMatrixCSC{Float64, Int64} with 569 stored entries:
⎡⣻⣾⣗⣿⣿⣗⣗⣒⣿⣿⣿⣗⡇⎤
⎢⣽⣽⣿⣿⣿⣯⣯⣭⣿⣿⣿⣯⡇⎥
⎢⢿⢿⡿⣿⣿⣿⡿⠿⣿⣿⣿⡿⡇⎥
⎢⢹⢹⡏⣿⣿⡏⡟⢍⣿⣿⣿⡏⡇⎥
⎢⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡇⎥
⎢⢿⢿⡿⣿⣿⡿⡿⠿⣿⣿⣿⣿⡇⎥
⎣⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠁⎦
julia> ForwardDiff.jacobian(implicit, X) # error
ERROR: MethodError: no method matching Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}(::UndefInitializer, ::Int64)
Closest candidates are:
(::Type{Base.ReshapedArray{T, N, P, MI}} where {T, N, P<:AbstractArray, MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}})(::Any, ::Any, ::Any)
@ Base reshapedarray.jl:6
Stacktrace:
[1] Krylov.GmresSolver(m::Int64, n::Int64, memory::Int64, S::Type{Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}})
@ Krylov ~/.julia/packages/Krylov/Iuavd/src/krylov_solvers.jl:1549
[2] Krylov.GmresSolver(A::LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, b::Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, memory::Int64)
@ Krylov ~/.julia/packages/Krylov/Iuavd/src/krylov_solvers.jl:1567
[3] gmres(A::LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, b::Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}; memory::Int64, M::LinearAlgebra.UniformScaling{Bool}, N::LinearAlgebra.UniformScaling{Bool}, ldiv::Bool, restart::Bool, reorthogonalization::Bool, atol::Float64, rtol::Float64, itmax::Int64, timemax::Float64, verbose::Int64, history::Bool, callback::Krylov.var"#164#171", iostream::Core.CoreSTDOUT)
@ Krylov ~/.julia/packages/Krylov/Iuavd/src/gmres.jl:121
[4] gmres(A::LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, b::Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}})
@ Krylov ~/.julia/packages/Krylov/Iuavd/src/gmres.jl:119
[5] solve(#unused#::IterativeLinearSolver, A::LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, b::Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}})
@ ImplicitDifferentiation ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/src/linear_solver.jl:27
[6] (::ImplicitDifferentiationForwardDiffExt.var"#2#5"{Float64, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}, LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#5#7"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, SparseMatrixCSC{Float64, Int64}})(k::Int64)
@ ImplicitDifferentiationForwardDiffExt ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/ext/ImplicitDifferentiationForwardDiffExt.jl:39
[7] macro expansion
@ ./ntuple.jl:72 [inlined]
[8] ntuple
@ ./ntuple.jl:69 [inlined]
[9] (::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing})(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ ImplicitDifferentiationForwardDiffExt ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/ext/ImplicitDifferentiationForwardDiffExt.jl:35
[10] (::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing})(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64})
@ ImplicitDifferentiationForwardDiffExt ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/ext/ImplicitDifferentiationForwardDiffExt.jl:25
[11] chunk_mode_jacobian(f::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, x::SparseMatrixCSC{Float64, Int64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9, SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}})
@ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/jacobian.jl:183
[12] jacobian(f::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, x::SparseMatrixCSC{Float64, Int64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9, SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}}, ::Val{true})
@ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/jacobian.jl:23
[13] jacobian(f::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, x::SparseMatrixCSC{Float64, Int64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9, SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}})
@ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/jacobian.jl:19
[14] jacobian(f::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, x::SparseMatrixCSC{Float64, Int64})
@ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/jacobian.jl:19
[15] top-level scope
@ ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/test/playground.jl:22
Correct results
julia> vecsqrt(x) = sqrt.(x);
julia> Zygote.jacobian(vecsqrt, x)[1]
5×5 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
⋅ ⋅ ⋅ ⋅ ⋅
⋅ ⋅ ⋅ ⋅ ⋅
⋅ ⋅ ⋅ ⋅ ⋅
⋅ ⋅ ⋅ 0.566692 ⋅
⋅ ⋅ ⋅ ⋅ 0.705914
julia> ForwardDiff.jacobian(vecsqrt, x)
5×5 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
⋅ ⋅ ⋅ ⋅ ⋅
⋅ ⋅ ⋅ ⋅ ⋅
⋅ ⋅ ⋅ ⋅ ⋅
⋅ ⋅ ⋅ 0.566692 ⋅
⋅ ⋅ ⋅ ⋅ 0.705914
julia> Zygote.jacobian(vecsqrt, X)[1]
25×25 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
⎡⠁⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
⎢⠀⠀⠐⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⢀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠐⢄⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⎥
⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎦
julia> ForwardDiff.jacobian(vecsqrt, X)
25×25 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
⎡⠁⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
⎢⠀⠀⠐⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⢀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠐⢄⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⎥
⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎦
Exactly!
The workaround I used to get it to work in the sparse vector + ForwardDiff case was to extract the values and partials in a different way:
function separate_values_and_partials_from_sparsevec_dual(V::SparseVector{ℱ.Dual{Z,S,N}}; tol::AbstractFloat = eps()) where {Z,S,N}
nrows = length(V)
ncols = length(V.nzval[1].partials)
rows = Int[]
cols = Int[]
prtls = Float64[]
for (i,v) in enumerate(V.nzind)
for (k,w) in enumerate(V.nzval[i].partials)
if abs(w) > tol
push!(rows,v)
push!(cols,k)
push!(prtls,w)
end
end
end
vvals = sparsevec(V.nzind,[i.value for i in V.nzval],nrows)
ps = sparse(rows,cols,prtls,nrows,ncols)
return vvals, ps
end
I tried to play around to at least fix the broken results even if they're dense, but I ran into https://github.com/JuliaDiff/ForwardDiff.jl/issues/658
If you have any ideas...
I don't think that's relevant in the ImplicitDiff case. As in when I tested it against FiniteDifferences it worked out fine. But I might be overseeing something here (head is stuck on another problem right now).
It is relevant because the conditions are zero at optimality, so the sparse arrays cancel each other and we get wrong derivatives when constructing the operators A
and B
It's the reason why we get zeros everywhere with ForwardDiff in the sparse vector example above
kk i see (now with a bit more brainpower :) ). then there might be a separate issue related to partials not being extracted properly in the implicitdiff code
Note to self: sparse arrays throw byproduct-related pullback errors while other arrays do not See https://github.com/JuliaDiff/ChainRules.jl/issues/731
julia> x = sparse(rand(Float32, 2, 3));
julia> forward(x) = sqrt.(x), 2;
julia> conditions(x, y, z) = abs.(y) .^ z .- x;
julia> implicit = ImplicitFunction(forward, conditions);
julia> Zygote.pullback(first ∘ implicit, x)
ERROR: MethodError: no method matching zero(::Tuple{Float32, Zygote.ZBack{ChainRules.var"#power_pullback#1338"{Float32, Int64, ProjectTo{Float64, @NamedTuple{}}, ProjectTo{Float32, @NamedTuple{}}, Float32}}})
So I think ForwardDiff is never gonna work out of the box unless we copy @thorek1's rather tedious reshaping trick, but Zygote now works and returns sparse Jacobians as of #114
Great stuff that zygote works. That would be the most efficient backend in the sparse part of my code anyway.
@forward diff: at least you get a chance to build a sparse pipeline with forwarddiff and implicitdiff.
For lack of less involved alternatives at the moment I am going for analytical implicit diff solutions in the sparse part but am happy to switch if possible (and fast enough).
The approach in v0.6.0 following #135 is to
This saves many headaches and follows a lengthy Discourse thread where we found out that Enzyme does it just fine
At the moment, sparse arrays would be densified when turned to vectors. It would be nice to be agnostic to the array type ping @mfherbst