autodiff API for forward and reverse mode

fredo-dedup commented 10 years ago

@scidom, @mlubin, @kmsquire, @powerdistribution: here is a (probably biased) proposal for a common interface for the forward/reverse mode symbolic differentiation function:

1. It should take an expression as input rather than a function. That seems more flexible, and a wrapper using Base.uncompress_ast can still be built around it for callers that pass functions. It also doesn't require the caller to create a function (useful if the function is called recursively for higher-order derivatives).
2. It should output an expression, for similar reasons. A composite type containing the expression and additional information might be useful, though.
3. It should accept a method=:forward or method=:reverse parameter to indicate which algorithm to use (not limited to these two once new methods are implemented).
4. It should take a list of [Symbol, Value] pairs specifying which variables to differentiate with respect to, each with an initial value, so that we know whether they are scalars, vectors, etc. and can infer the types of all other variables.
5. It should take the name of the variable that holds the result in the expression. The algorithms could instead use the last evaluated value in the expression, but that may turn out to be too strong a requirement when the function is used recursively. This argument could be optional.

Name of the function: differentiate()? diff()? derive()?
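To make the proposal concrete, here is a minimal sketch of what such an entry point could look like. Everything in it is an assumption made for illustration: the name `differentiate`, the helper `d`, and the keyword layout are hypothetical and not taken from any existing package. It is written in current Julia syntax, handles only scalar `+` and two-argument `*`, and validates the `method` keyword without actually switching algorithms.

```julia
# Hypothetical sketch of the proposed interface; no name here comes from an
# existing package, and only scalar + and two-argument * are supported.

# Symbolic derivative of `ex` with respect to the symbol `x`.
d(ex::Number, x::Symbol) = 0
d(ex::Symbol, x::Symbol) = ex == x ? 1 : 0
function d(ex::Expr, x::Symbol)
    ex.head == :call || error("unsupported expression: $ex")
    op, args = ex.args[1], ex.args[2:end]
    if op === :(+)
        # Sum rule: differentiate each term.
        return Expr(:call, :(+), [d(a, x) for a in args]...)
    elseif op === :(*) && length(args) == 2
        # Product rule for two factors.
        a, b = args
        return :($(d(a, x)) * $b + $a * $(d(b, x)))
    end
    error("no derivation rule for $op")
end

# Expression in, expressions out. `out` names the result variable; the keyword
# arguments in `wrt` name the variables to differentiate against. Their initial
# values are accepted but unused in this scalar-only sketch.
function differentiate(ex::Expr, out::Symbol; method::Symbol = :reverse, wrt...)
    method in (:forward, :reverse) || error("unknown method: $method")
    (ex.head == :(=) && ex.args[1] == out) || error("expected an assignment to $out")
    rhs = ex.args[2]
    return Dict(name => d(rhs, name) for (name, _) in wrt)
end

ex = :(y = x * x + 3 * x)
dex = differentiate(ex, :y, x = 0.0)[:x]   # an unsimplified Expr for dy/dx
```

The returned expression is not simplified, but evaluating it with `x = 2.0` gives 7.0, which matches the hand-computed derivative 2x + 3. Because the result is itself an expression, it could be fed back into `differentiate` for a higher-order pass, which is the motivation given in the first point of the list.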

mlubin commented 10 years ago

Expressions are the convenient input format for reverse mode but not for forward mode. It's far from trivial to convert a function into a single expression from the output of Base.uncompress_ast, and that's not even what's wanted when using dual numbers for forward mode. This is a tricky issue.
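A small sketch may help show why dual-number forward mode wants a function rather than an expression. The `Dual` type and `derivative` helper below are hypothetical names written for illustration in current Julia syntax; they are not the package's actual implementation. The point is that the derivative is obtained simply by calling the user's function on a different number type, so no expression is ever inspected.

```julia
# Hypothetical minimal dual number: a value plus its derivative.
struct Dual
    val::Float64
    der::Float64
end

# Arithmetic rules carry the derivative along with the value.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)

# Mixed operations treat plain reals as constants (zero derivative).
Base.:+(a::Dual, b::Real) = Dual(a.val + b, a.der)
Base.:+(a::Real, b::Dual) = b + a
Base.:*(a::Dual, b::Real) = Dual(a.val * b, a.der * b)
Base.:*(a::Real, b::Dual) = b * a

# Forward mode: seed the input with derivative 1 and read off the result.
derivative(f, x::Real) = f(Dual(x, 1.0)).der

f(x) = x * x + 3 * x
dfdx = derivative(f, 2.0)
```

Here `f` is an ordinary Julia function; it only needs to be generic enough to accept a `Dual` argument.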

papamarkou commented 10 years ago

A couple of questions:

I guess it is unreasonable to use functions and expressions as input for forward and reverse mode respectively? If we want to have a function wrapper, with an option method=:forward or :reverse, then this unification is needed, am I following you @fredo-dedup? @mlubin, when you say it is far from trivial to convert a function to an expression, do you mean that it is in fact nearly impossible or it simply takes a lot of work?
@fredo-dedup could you please elaborate a bit more on the last two items of your list? I didn't understand well what you meant there.
differentiate() is the chosen name in Calculus and I like its explicitness, in comparison to diff(). derive() would be in my opinion too generic.

fredo-dedup commented 10 years ago

About item 4: that parameter indicates which of the variables appearing in the expression should appear in the gradient, since not all of them necessarily do. Example:

A = rand(4,4)

ex = :( y = sum(x * A) )

# get the expression evaluating d(ex) / dx
diff( ex, :y, x=0.)  

# get the expression evaluating d(ex) / dx (a scalar), and d(ex)/dA ( 4x4 matrix)
diff( ex, :y, x=0., A=zeros(4,4))

Giving initial values is necessary because the algorithm needs to know the types of all variables (even intermediate ones) to fetch the proper derivation rule, and this in turn requires parameter values to start from (in the example, there is no way to guess that x is a scalar rather than a matrix). But maybe there are workarounds...
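As an illustration of why the type of x matters, the derivation rule for `sum(x * A)` with respect to x is different for a scalar x and for a matrix x. The sketch below uses a hypothetical helper, `dsum_dx`, that picks the rule by dispatching on the initial value; the name and the approach are assumptions for illustration only.

```julia
# Hypothetical rule lookup for y = sum(x * A), selected by the type of x.

# Scalar x: y = x * sum(A), so dy/dx = sum(A).
dsum_dx(x::Real, A::AbstractMatrix) = sum(A)

# Matrix x (m×n) with A (n×p): dy/dx[i, j] = sum over k of A[j, k],
# which is ones(m, p) * A'.
dsum_dx(x::AbstractMatrix, A::AbstractMatrix) = ones(size(x, 1), size(A, 2)) * A'

A = [1.0 2.0; 3.0 4.0]
g_scalar = dsum_dx(0.0, A)          # a single number
g_matrix = dsum_dx(zeros(2, 2), A)  # a 2×2 matrix, same shape as x
```

Without an initial value such as `x=0.` or `x=zeros(2,2)`, nothing in the expression itself says which of the two rules applies.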

About item 5: this relies on a hunch. We could require that the last evaluated statement gives the output value of interest, and that should be fine in most cases. But what if a first derivation pass produces the values of d(ex)/dx and d(ex)/dy, and each of them appears somewhere in the middle of the output expression? How could we make a second pass (to calculate d2(ex)/dx2, d2(ex)/dx.dy, etc.) without reshuffling statements so that they sit exactly at the end? That would be clumsy, hence the idea of passing the result variable's name to the derivation function. But it could be optional.

@mlubin: I'll get myself up to speed with forward mode to better understand why it's not even a good idea to start with an expression. My initial reasoning was that since it is symbolic processing, an expression was a sufficient starting point.

mlubin commented 10 years ago

@scidom, it takes a lot of work. This is essentially the source transformation approach for AD. It's a very valuable approach, but it doesn't make any sense at all to do this if you're performing forward-mode AD.

I'm not an expert in this area, but it seems AD has vastly different inputs and outputs depending on the context, so it may be a bit overly ambitious to write a single API for all AD algorithms before we have a solid interface for reverse mode and forward mode separately.

papamarkou commented 10 years ago

@mlubin, @fredo-dedup thank you for the helpful comments and illustrations. I think that you are right, Miles: it is a premature goal to unify the API across the AD algorithms. For one thing, my assessment is that we lack the expertise in the field.

My proposed plan would be to have all the AD algorithms in this package with separate interfaces and in a functional state. As our understanding progresses, we can think about how to improve each AD mode implementation, and even later we can unify the interfaces. As for myself, I prefer to read Griewank's book, which I have already started doing, before I dive into further AD coding (apart from the "playground" naive forward AD coding I have done, which simply offers a functional tool for now).

fredo-dedup commented 10 years ago

I agree that it seems more reasonable.

Would you agree then if I define a function called backwarddiff() for my existing code in the reverse folder, and then have the Autodiff package export it? I will also add some tests in the test folder.

papamarkou commented 10 years ago

Yes, Frederic, my view is that we'd rather make available the functionality we have already coded, so that sounds like a good idea to me, go ahead. Which name do you like more, reversediff() or backwarddiff()? I prefer the former, but if the latter sounds better to you, that's fine of course.

StefanKarpinski commented 10 years ago

Since it's most commonly referred to as "reverse mode" I think that reversediff would be clearer.

papamarkou commented 10 years ago

@fredo-dedup, following the conversation in the relevant METADATA thread, and in order to agree with what the rest of the Julia developers wished for, I am going to rename this repository and register it as ForwardDiff, so you may want to put your work in a standalone ReverseDiffSource, as it will fit better there.

papamarkou commented 10 years ago

I will close this redundant issue now, since we decided on a different strategy (i.e. to split the autodiff approaches).

JuliaDiff / ForwardDiff.jl

autodiff API for forward and reverse mode #7