JuliaDiff / ForwardDiff.jl

Forward Mode Automatic Differentiation for Julia

autodiff API for forward and reverse mode #7

Closed fredo-dedup closed 10 years ago

fredo-dedup commented 10 years ago

@scidom, @mlubin, @kmsquire, @powerdistribution: here is a (probably biased) proposal for a common interface for the forward/reverse mode symbolic derivation function:

Name of the function: differentiate()? diff()? derive()?

mlubin commented 10 years ago

Expressions are a convenient input format for reverse mode but not for forward mode. It's far from trivial to convert a function into a single expression from the output of Base.uncompress_ast, and that's not even what's wanted when using dual numbers for forward mode. This is a tricky issue.
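To illustrate why forward mode does not need an expression as input, here is a minimal sketch of the dual-number idea. The `Dual` type and the `derivative` helper are hypothetical names made up for this example, not the API of any package discussed in this thread, and only the handful of operations needed for the demo are defined.

```julia
# Minimal dual number: a value together with the derivative carried alongside it.
# Illustrative only; `Dual` and `derivative` are hypothetical names.
struct Dual
    val::Float64   # function value
    der::Float64   # derivative with respect to the seeded input
end

# Each overloaded operation applies its own differentiation rule.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)
Base.:*(c::Real, a::Dual) = Dual(c * a.val, c * a.der)
Base.sin(a::Dual) = Dual(sin(a.val), cos(a.val) * a.der)

# Seed der = 1 on the input and read the derivative off the result.
derivative(f, x::Real) = f(Dual(float(x), 1.0)).der

# An ordinary Julia function; no expression or AST is ever inspected.
f(x) = x * x + 3 * sin(x)

derivative(f, 2.0)   # evaluates 2x + 3cos(x) at x = 2
```

The function `f` is called as-is with a different argument type, which is why an expression-based interface is an awkward fit for this mode.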

papamarkou commented 10 years ago

A couple of questions:

fredo-dedup commented 10 years ago

About item 4: that parameter indicates which of the variables appearing in the expression should appear in the gradient, since not all of them necessarily should. Example:

A = rand(4,4)

ex = :( y = sum(x * A) )

# get the expression evaluating d(ex) / dx
diff( ex, :y, x=0.)  

# get the expression evaluating d(ex) / dx (a scalar), and d(ex)/dA ( 4x4 matrix)
diff( ex, :y, x=0., A=zeros(4,4))

Giving initial values is necessary because the algorithm needs to know the types of all variables (even intermediate ones) to fetch the proper derivation rule, and this in turn requires parameter values to start from (there is no way to guess that x is a scalar and not a Matrix in the example). But maybe there are workarounds...
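As a rough sketch of why the types matter, the rule for d(sum(x * A))/dx differs depending on whether x is a scalar or a matrix. The function `dsum_mul_dx` below is a hypothetical name used only for illustration; it relies on Julia's dispatch to pick the rule once a sample value of x is available.

```julia
# Hypothetical rule lookup: the derivative of sum(x * A) with respect to x.
# Which method applies can only be decided once the type of x is known.

# Scalar x: sum(x * A) = x * sum(A), so the derivative is sum(A).
dsum_mul_dx(x::Real, A::AbstractMatrix) = sum(A)

# Matrix x (m×n) and A (n×p): entry (i, j) of the gradient is the j-th row sum of A.
dsum_mul_dx(x::AbstractMatrix, A::AbstractMatrix) = ones(size(x, 1), size(A, 2)) * A'

A = [1.0 2.0; 3.0 4.0]
dsum_mul_dx(0.0, A)          # scalar rule
dsum_mul_dx(zeros(2, 2), A)  # matrix rule, same source expression
```

The sample values play the same role as `x=0.` and `A=zeros(4,4)` in the proposal: they are never used numerically in the rule selection, only for their types and sizes.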

About item 5: this relies on a hunch. We could require that the last evaluated statement gives the output value of interest, and that should be fine in most cases. But what if a first derivation pass produces the values of d(ex)/dx and d(ex)/dy, and each of them appears somewhere in the output expression? How can we make a second pass (to calculate d2(ex)/dx2, d2(ex)/dx.dy, etc.) without reshuffling statements so that they are exactly at the end? That would be clumsy, hence the idea of specifying the output to the derivation function. But it could be optional.

@mlubin: I'll get myself up to speed with forward mode to better understand why it's not even a good idea to start with an expression. My initial reasoning was that since it is symbolic processing, an expression was a sufficient starting point.
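A small sketch of the situation described for item 5: after a first pass, the block holds several results, and none of the derivative statements is necessarily last. The block contents and the helper `rhs_of` are hypothetical, chosen only to show how naming the output symbol avoids reordering statements.

```julia
# What a first derivation pass might produce (made-up example):
# the value and both partial derivatives live in the same block.
first_pass = quote
    y    = x^2 * z
    dydx = 2 * x * z
    dydz = x^2
end

# Hypothetical helper: return the right-hand side assigned to `out` in a block.
function rhs_of(block::Expr, out::Symbol)
    for stmt in block.args
        # Skip line-number nodes; look for `out = ...` assignments.
        if stmt isa Expr && stmt.head == :(=) && stmt.args[1] == out
            return stmt.args[2]
        end
    end
    error("no assignment to $out found")
end

# A second pass can target dydx directly, wherever it sits in the block.
rhs_of(first_pass, :dydx)
```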

mlubin commented 10 years ago

@scidom, it takes a lot of work. This is essentially the source transformation approach for AD. It's a very valuable approach, but it doesn't make any sense at all to do this if you're performing forward-mode AD.
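For contrast with dual numbers, the source transformation approach takes an expression and returns a new expression for the derivative. The toy `deriv` function below is a hypothetical sketch that only knows rules for `+` and binary `*`; a real implementation needs far more rules plus simplification, which is part of why it takes a lot of work.

```julia
# Toy expression-to-expression derivative (illustrative only).
deriv(ex::Number, v::Symbol) = 0
deriv(ex::Symbol, v::Symbol) = ex == v ? 1 : 0

function deriv(ex::Expr, v::Symbol)
    ex.head == :call || error("unsupported expression: $ex")
    op, args = ex.args[1], ex.args[2:end]
    if op == :+
        # Sum rule: differentiate each term.
        return Expr(:call, :+, (deriv(a, v) for a in args)...)
    elseif op == :* && length(args) == 2
        # Product rule for two factors.
        a, b = args
        return :($(deriv(a, v)) * $b + $a * $(deriv(b, v)))
    end
    error("no rule for $op")
end

# The result is itself an expression (unsimplified), not a number.
dex = deriv(:(x * x + 3 * x), :x)
```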

I'm not an expert in this area, but it seems AD has vastly different inputs and outputs depending on the context, so it may be a bit overly ambitious to write a single API for all AD algorithms before we have a solid interface for reverse mode and forward mode separately.

papamarkou commented 10 years ago

@mlubin, @fredo-dedup thank you for the helpful comments and illustrations. I think that you are right, Miles: it is a premature goal to unify the API across the AD algorithms. For one thing, my assessment is that we lack the expertise in the field.

My proposed plan would be to have all the AD algorithms in this package with separate interfaces and in a functional state. As our understanding progresses, we can think about how to improve each AD mode implementation, and even later we can unify interfaces. As for myself, I prefer to read Griewank's book, which I have already started, before I dive into further AD coding (apart from the "playground" naive forward AD coding I have done, which simply offers a functional tool for now).

fredo-dedup commented 10 years ago

I agree that it seems more reasonable.

Would you agree, then, if I define a function called backwarddiff() for my existing code in the reverse folder, and then have the Autodiff package export it? I will also add some tests in the test folder.

papamarkou commented 10 years ago

Yes, Frederic, my view is that we'd rather make available the functionality we have already coded, so that sounds like a good idea to me, go ahead. Which name do you like more, reversediff() or backwarddiff()? I prefer the former, but if the latter sounds better to you, that's fine of course.

StefanKarpinski commented 10 years ago

Since it's most commonly referred to as "reverse mode" I think that reversediff would be clearer.

papamarkou commented 10 years ago

@fredo-dedup, following the conversation in the relevant METADATA thread, and in order to agree with what the rest of the Julia developers wished for, I am going to rename this repository and register it as ForwardDiff, so you may want to put your work in a standalone ReverseDiffSource, as it will fit better there.

papamarkou commented 10 years ago

I will close this redundant issue now, since we decided on a different strategy (i.e. to split the autodiff approaches).