JuliaDiff / SparseDiffTools.jl

Fast jacobian computation through sparsity exploitation and matrix coloring
MIT License

In-place `*_jacvec!` assumes square jacobian #184

Open · danielwe opened this issue 2 years ago

danielwe commented 2 years ago
  1. `auto_jacvec!` and `num_jacvec!` only work when the output and input dimensions are the same, i.e., the Jacobian is square. The issue is this line (and the equivalent for `num_jacvec!`), which should probably use both the length and eltype of `dy`.

https://github.com/JuliaDiff/SparseDiffTools.jl/blob/2dbe4206d107d55538eb72b33aca727169130849/src/differentiation/jaches_products.jl#L10

  2. Manual cache allocation works, but it is surprisingly hard to get right; the implicit prescription in the README gets the wrong tag type for the duals (a workaround sketch follows the MWE below): https://github.com/JuliaDiff/SparseDiffTools.jl#jacobian-vector-and-hessian-vector-products

  3. The corresponding lines in `JacVec` should similarly be fixed: https://github.com/JuliaDiff/SparseDiffTools.jl/blob/2dbe4206d107d55538eb72b33aca727169130849/src/differentiation/jaches_products.jl#L201


  4. Besides, shouldn't `auto_jacvec(!)` be upstreamed to ForwardDiff? It is strange to have to pull in a sparsity-focused package to get this functionality. JuliaDiff/ForwardDiff.jl#319

MWE:

```julia
using SparseDiffTools

function f!(dx, x)
    dx[1] = x[2] + x[3]
    dx[2] = x[1]
end

x = randn(3)
v = randn(3)
dx = zeros(2)

auto_jacvec!(dx, f!, x, v)
```
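
This call errors because the default caches are built from `x` (length 3) while `dx` has length 2. Continuing from the MWE, here is a preallocation sketch that appears to work on this commit; it assumes the six-argument `auto_jacvec!(dy, f, x, v, cache1, cache2)` method and the internal `DeivVecTag`-based tag construction from the linked source, so treat it as a sketch rather than documented API:

```julia
using ForwardDiff

# Preallocate both caches with the tag type the package appears to use internally,
# sizing cache2 from the output dx rather than letting it default to similar(cache1).
T = typeof(ForwardDiff.Tag(SparseDiffTools.DeivVecTag(), eltype(x)))
cache1 = ForwardDiff.Dual{T, eltype(x), 1}.(x, ForwardDiff.Partials.(tuple.(v)))
cache2 = ForwardDiff.Dual{T, eltype(dx), 1}.(dx, ForwardDiff.Partials.(tuple.(dx)))

auto_jacvec!(dx, f!, x, v, cache1, cache2)
dx ≈ [v[2] + v[3], v[1]]  # J_f(x) * v for this f!; should be true
```
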
ChrisRackauckas commented 2 years ago

I think manual cache allocation is what needs to be done, because that's how the output vector would then be defined. Does that make this a documentation issue?

> Besides, shouldn't auto_jacvec(!) be upstreamed to ForwardDiff? Strange to have to pull in a sparse-related package to get this functionality. https://github.com/JuliaDiff/ForwardDiff.jl/issues/319

Possibly, or it should be an AbstractDifferentiation.jl interface piece, and all AD packages should then adopt that interface.

danielwe commented 2 years ago

A documentation fix would definitely make it easier to preallocate, which is useful regardless, but I think it's possible to make the allocating version work too. Just replace `cache2 = similar(cache1)` with

```julia
cache2 = Dual{typeof(ForwardDiff.Tag(DeivVecTag(), eltype(dy))), eltype(dy), 1}.(dy, ForwardDiff.Partials.(tuple.(dy)))
```
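
As far as I can tell, the seeded values and partials here don't matter, since the user function overwrites `cache2` when `auto_jacvec!` calls it; what the replacement fixes is that `cache2` now gets the length and element type of `dy` (with a matching `DeivVecTag`-based tag) instead of inheriting the shape of `cache1`.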

Here's the adaptation I landed on for my own use. I factored out the cache allocation into a function that can be used separately, making preallocation easy. If upstreamed to ForwardDiff this would of course turn into something like a ForwardDiff.JVPConfig type.

```julia
using ForwardDiff

struct MyTag end

function jvp!(f!, dy, x, v, duals=nothing)
    if duals === nothing
        duals = jvpduals(dy, x, v)
    else # jvpduals returns initialized duals, so this init is only needed when preallocated
        duals.x .= ForwardDiff.Dual{_tagtype(x), eltype(x), 1}.(
            x, ForwardDiff.Partials.(tuple.(v))
        )
    end
    f!(duals.dy, duals.x)
    dy .= ForwardDiff.partials.(duals.dy, 1)
    return dy
end

function jvpduals(dy, x, v)
    # Output duals: seeded from dy only to get the right length, eltype, and tag
    dualdy = ForwardDiff.Dual{_tagtype(dy), eltype(dy), 1}.(
        dy, ForwardDiff.Partials.(tuple.(dy))
    )
    # This correctly initializes dualx for computing J_f(x) * v
    dualx = ForwardDiff.Dual{_tagtype(x), eltype(x), 1}.(
        x, ForwardDiff.Partials.(tuple.(v))
    )
    return (; dy=dualdy, x=dualx)
end

_tagtype(x) = typeof(ForwardDiff.Tag(MyTag(), eltype(x)))
```
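
A minimal usage sketch with the non-square `f!` from the MWE above (assuming the `jvp!`/`jvpduals` definitions as written; the expected value `[v[2] + v[3], v[1]]` is just `J_f(x) * v` for that `f!`):

```julia
f!(dx, x) = (dx[1] = x[2] + x[3]; dx[2] = x[1]; nothing)

x, v, dy = randn(3), randn(3), zeros(2)

# Allocating path: the duals are created internally by jvpduals
jvp!(f!, dy, x, v)
dy ≈ [v[2] + v[3], v[1]]  # should be true

# Preallocated path: build the duals once and reuse them across calls
duals = jvpduals(dy, x, v)
jvp!(f!, dy, x, v, duals)
```
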
ChrisRackauckas commented 2 years ago

That does seem like a good fix. Mind opening a PR?

danielwe commented 2 years ago

Been procrastinating on this, since my own version is still developing along with my project (including things like an allocation-free 5-arg accumulating jvp and consistent treatment of in-place vs. immutable versions), and I notice that the landscape of linear operator types is in rapid flux at the moment, at least on the SciML end of things (stoked for https://github.com/SciML/SciMLOperators.jl :fire:). So unless an improvement on this is urgent for anyone else, I'll hold off a little and come back when it's clearer how things should look.

ChrisRackauckas commented 2 years ago

There's been a lot of motion in this ocean in the last few weeks. Indeed, this change might want to wait until the end of the month, when SciMLOperators is complete and registered.

gdalle commented 3 months ago

Sparse differentiation is now part of DifferentiationInterface, with automatic cache allocation and (tested) support for non-square Jacobians.
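
For anyone landing here, a rough sketch of the non-square example above going through DifferentiationInterface; the exact operator signatures vary between DifferentiationInterface versions, and `AutoForwardDiff` comes from ADTypes, so check the current docs before copying this:

```julia
using DifferentiationInterface
using ADTypes: AutoForwardDiff
import ForwardDiff

f!(y, x) = (y[1] = x[2] + x[3]; y[2] = x[1]; nothing)

x = randn(3)
y = zeros(2)

# Non-square (2×3) Jacobian of an in-place function; caches are handled internally.
# For sparsity exploitation, an ADTypes.AutoSparse backend can be swapped in.
J = jacobian(f!, y, AutoForwardDiff(), x)
```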