brianguenter / FastDifferentiation.jl

Fast derivative evaluation
MIT License

Some issues #13

Closed baggepinnen closed 1 year ago

baggepinnen commented 1 year ago

Hello and thanks for this interesting package! I have been exploring a bit and ran into a couple of issues, the biggest one being that I cannot get sparse_hessian below to work. Some other issues I ran into while trying are indicated with # Error below.

using FastDifferentiation

x = make_variables(:x, 2)
mu = make_variables(:mu, 2)

# x'mu          # Error
# dot(x, mu)    # Error
ex = sum(abs2, x .* mu)

h = hessian(ex, x)
# hs = sparse_hessian(ex, x) # Error

# fun = make_function(h, x, mu) # Error (it would be nice if functions could take more than one argument)
fun = make_function(h, [x; mu])
brianguenter commented 1 year ago

Found the errors. When you create a hessian the code currently requires you to list all the variables of the function.

if you change

h = sparse_hessian(ex, x)

to this

h = sparse_hessian(ex, [x;mu])

it works. The code currently requires you to list all the variables of the function when taking the hessian. I don't think this would be hard to change - is it common to want to compute second partials wrt just a subset of the variables?
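
For reference, the full workaround put together looks roughly like this (a sketch reusing the setup from the original report):

using FastDifferentiation

x = make_variables(:x, 2)
mu = make_variables(:mu, 2)
ex = sum(abs2, x .* mu)

# Workaround for now: list every variable of the function
h = sparse_hessian(ex, [x; mu])
fun = make_function(h, [x; mu])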

This doesn't work:

x'mu

because I didn't define Base.adjoint for the Node type. It works once adjoint is defined.
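
The missing method is roughly of this shape (a sketch only, assuming Node stands for a real scalar so adjoint is the identity):

# Sketch: Node models a real scalar, so adjoint reduces to the identity.
Base.adjoint(n::FastDifferentiation.Node) = n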

brianguenter commented 1 year ago

I fixed the adjoint bug and made a new release, 0.2.1. If you grab that, x'mu should work.

dot does not work, for reasons that are not yet clear. It fails when trying to iterate a Node object, but when I define Base.iterate for Node, dot crashes with a low-level GC failure. This may take some digging.
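
In the meantime the same quantity can be written without LinearAlgebra.dot, for example (workaround sketch):

# Interim alternatives to dot(x, mu) while that bug is open:
ip  = sum(x .* mu)   # broadcasted product, then sum
ip2 = x'mu           # works on v0.2.1+ now that adjoint is defined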

Still looking at the hessian issue.

brianguenter commented 1 year ago

Fixed the sparse hessian so it behaves the same as hessian when taking derivatives wrt just some of the variables of the function. I'll make another patch release, 0.2.2, with this fix.
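
With that fix, the subset case from the original report should look roughly like this (sketch; same setup as above):

# v0.2.2 (sketch): sparse second partials wrt x only,
# while make_function still receives the full variable list.
hs = sparse_hessian(ex, x)
fun = make_function(hs, [x; mu])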

baggepinnen commented 1 year ago

is it common to want to compute second partials wrt just a subset of the variables?

Yes, here are a few use cases

brianguenter commented 1 year ago

Thanks for the information about the use cases. The dense hessian already supported this - I made a new v0.2.2 which has a fix so sparse_hessian does as well. Try it and let me know if it works for you.

All of your bugs are fixed in v0.2.2 except for dot(x,y). Still working on that one, dot is doing something weird which I haven't figured out yet.

baggepinnen commented 1 year ago

Thanks for your quick fixes!

The dense hessian already supported this - I made a new v0.2.2 which has a fix so sparse_hessian does as well. Try it and let me know if it works for you.

I can't quite seem to get it to work. This is on FastDifferentiation#main:

using FastDifferentiation
x = make_variables(:x, 2)
mu = make_variables(:mu, 2)
ex = sum(abs2, x .* mu)
h = sparse_hessian(ex, x)
fun = make_function(h, x, mu)
julia> fun = make_function(h, x, mu)
ERROR: MethodError: no method matching make_function(::SparseArrays.SparseMatrixCSC{FastDifferentiation.Node, Int64}, ::Vector{FastDifferentiation.Node}, ::Vector{FastDifferentiation.Node})

Closest candidates are:
  make_function(::AbstractArray{T}, ::AbstractVector{S}; in_place) where {T<:FastDifferentiation.Node, S<:FastDifferentiation.Node}
   @ FastDifferentiation ~/.julia/packages/FastDifferentiation/Ays9i/src/CodeGeneration.jl:85

Stacktrace:
 [1] top-level scope
   @ REPL[21]:1
brianguenter commented 1 year ago

There is a newer release which has all the fixes, but TagBot is flaking out, crashing and not tagging the latest version. I'll work on it tomorrow.

baggepinnen commented 1 year ago

I used the main branch, so I was running the latest code in the repository when I tested.

brianguenter commented 1 year ago

Looked at your code again. Everything works except make_function(h, x, mu). I intentionally did not implement this (sorry for not making that clear). There is an issue with typing the function arguments.

When possible I make argument types explicit, as a form of documentation for the user. This is especially handy in VSCode, where the IntelliSense hover shows the function signature.

make_function could be:

make_function(h::SparseMatrixCSC{T}, vec::AbstractVector{S}; in_place=true) where {T<:Node,S<:Node} = "worked"
make_function(h::SparseMatrixCSC{T}, vec::AbstractVector{S}, vec2::AbstractVector{S}; in_place=true) where {T<:Node,S<:Node} = "worked"

This would work. But will there be cases when people might want to combine more than 2 vectors? If it's defined for 1 and 2 vectors users might naturally suspect they could pass any number of vectors, which would fail with just these two definitions.

Instead you'd want something more like this:

make_function(h::SparseMatrixCSC{T}, vecargs::AbstractVector{S}...; in_place=true) where {T<:Node,S<:Node} = "worked"

But this only works if the element type of all the vecargs is the same. If you make a vector like this:

julia> z = [x[1],x[2]]
2-element Vector{Node{Symbol, 0}}:
 x1
 x2

then the function call make_function(h,z,mu) fails:

julia> make_function(h,z,mu)
ERROR: MethodError: no method matching make_function(::SparseMatrixCSC{Node, Int64}, ::Vector{Node{Symbol, 0}}, ::Vector{Node})

Closest candidates are:
  make_function(::SparseMatrixCSC{T}, ::AbstractVector{S}...; in_place) where {T<:Node, S<:Node}

If you define make_function like this:

make_function(h::SparseMatrixCSC{T}, vecargs::AbstractVector{Node}...; in_place=true) where {T<:Node} = "worked"

make_function(h,z,mu) fails again:

julia> make_function(h,z,mu)
ERROR: MethodError: no method matching make_function(::SparseMatrixCSC{Node, Int64}, ::Vector{Node{Symbol, 0}}, ::Vector{Node})

Closest candidates are:
  make_function(::SparseMatrixCSC{T}, ::AbstractVector{Node}...; in_place) where T<:Node

The only definition I've found that does work is this:

make_function(h::SparseMatrixCSC{T}, vecargs::AbstractVector...; in_place=true) where {T<:Node} = "worked"

But then the user is left guessing what the element type of the vectors in vecargs should be.

Do you know how to define the argument types for a varargs argument so this will work? I want the user to be able to tell by looking at the function signature that each of the vecargs has to be AbstractVector{<:Node}.

baggepinnen commented 1 year ago

When possible I make argument types explicit, as a form of documentation to the user. This is especially handy when using VSCode where intellisense hover shows the function signature.

In Julia, typed input arguments are not intended for documentation; they are for dispatch. VSCode shows the docstring as well, so you could indicate one signature in the docstring but actually implement a less restrictive signature in the code.

But this only works if the element type of all the vecargs is the same.

This is probably okay for most use cases, but the restriction is somewhat artificial and feels unnecessary.

If you define make_function like this: ... make_function(h,z,mu) fails again:

The proper way to write this is

make_function(h::SparseMatrixCSC{T}, vecargs::AbstractVector{<:Node}...; in_place=true) where {T<:Node} = "worked"

The key point is the <: in AbstractVector{<:Node}.
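
A tiny self-contained demo of why the <: matters (hypothetical types, not FastDifferentiation's):

abstract type MyNode end
struct MyLeaf <: MyNode end

f_invariant(v::AbstractVector{MyNode}) = "matched"
f_covariant(v::AbstractVector{<:MyNode}) = "matched"

v = [MyLeaf(), MyLeaf()]   # concrete element type: Vector{MyLeaf}

# f_invariant(v)  # MethodError: Vector{MyLeaf} is not an AbstractVector{MyNode}
f_covariant(v)    # works: Vector{MyLeaf} <: AbstractVector{<:MyNode}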

But then the user is left guessing what the element type of the vectors in vecargs should be.

Not if you say what the element type is expected to be in the docstring. For example, the docstring signature can be

make_function(::SparseMatrixCSC{T}, ::Vector{Node}...; in_place)

while the actual implementation is

make_function(::SparseMatrixCSC{T}, ::AbstractVector{<:Node}...; in_place)
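
Concretely, the split could look something like this (a sketch; my_make_function is a made-up name, not the package API, and the body is just a stub):

# Illustrative only: docstring advertises Vector{Node}, method stays generic.
using FastDifferentiation: Node

"""
    my_make_function(h, vecs::Vector{Node}...; in_place=true)

Each element of `vecs` is expected to be a vector of `Node` input variables.
"""
my_make_function(h, vecs::AbstractVector{<:Node}...; in_place=true) = (h, vecs, in_place)  # stub body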
brianguenter commented 1 year ago

You can now pass in a list of vectors for input variables.

brianguenter commented 1 year ago

If you don't find any other problems, I'll close this issue in a day or two.

brianguenter commented 1 year ago

Haven't heard anything, so closing this issue.

baggepinnen commented 1 year ago

Sorry for taking a while to get back to this. On the latest released version, I tried to create a function of several arguments, but it does not appear to work:

using FastDifferentiation

x = make_variables(:x, 2)
mu = make_variables(:mu, 2)
ex = sum(abs2, x .* mu)
hs = sparse_hessian(ex, x)
fun = make_function(hs, x, mu; in_place=false)
# Test
xv = randn(size(x))
muv = randn(size(mu))
fun(xv, muv)

If we inspect the expression of the generated function, it still has only a single argument:

julia> fun
RuntimeGeneratedFunction(#=in FastDifferentiation=#, #=using FastDifferentiation=#, :((input_variables,)->begin
          #= /home/fredrikb/.julia/packages/FastDifferentiation/vh3e2/src/CodeGeneration.jl:133 =#
          begin