Closed fredo-dedup closed 10 years ago
Expressions are the convenient input format for reverse mode but not for forward mode. It's far from trivial to convert a function into a single expression from the output of Base.uncompress_ast, and that's not even what's wanted when using dual numbers for forward mode. This is a tricky issue.
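To illustrate why forward mode with dual numbers wants a plain function rather than an expression, here is a minimal sketch. The `Dual` type and the `derivative` helper are hypothetical names made up for this illustration, not the API of any package discussed here, and only `+`, `*` and `sin` are covered.

```julia
# Hypothetical minimal dual number: a value together with its derivative.
struct Dual <: Number
    val::Float64
    der::Float64
end

# Each operation propagates the derivative alongside the value.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)
Base.sin(a::Dual) = Dual(sin(a.val), cos(a.val) * a.der)

# Differentiate an ordinary Julia function by calling it on a dual number
# seeded with derivative 1. No expression or AST is ever inspected.
derivative(f, x::Real) = f(Dual(x, 1.0)).der

f(x) = x * x + sin(x)
derivative(f, 2.0)   # same value as 2 * 2.0 + cos(2.0)
```

The function `f` is used as is; the differentiation happens through method dispatch on the argument type, which is why an expression input brings nothing to this mode.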
A couple of questions:
method=:forward or :reverse, then this unification is needed, am I following you @fredo-dedup?
@mlubin, when you say it is far from trivial to convert a function to an expression, do you mean that it is in fact nearly impossible, or that it simply takes a lot of work?
differentiate() is the chosen name in Calculus and I like its explicitness, in comparison to diff(). derive() would be, in my opinion, too generic.
About item 4: that parameter is meant to indicate which of the variables appearing in the expression should appear in the gradient, since not all of them may be of interest. Example:
A = rand(4,4)
ex = :( y = sum(x * A) )
# get the expression evaluating d(ex) / dx
diff( ex, :y, x=0.)
# get the expression evaluating d(ex) / dx (a scalar), and d(ex)/dA ( 4x4 matrix)
diff( ex, :y, x=0., A=zeros(4,4))
Giving initial values is necessary because the algorithm needs to know the types of all variables (even intermediate ones) to fetch the proper derivation rule, and this in turn requires parameter values to start from (there is no way to guess that x is a scalar and not a Matrix in the example). But maybe there are workarounds...
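A rough sketch of why the types matter, using the `x * A` product from the example above: the rule for the gradient with respect to the first argument differs depending on whether it is a scalar or a matrix. The function `dmul_left` is a hypothetical name for this illustration only and is not taken from the code under discussion.

```julia
# Hypothetical reverse-mode rules for the first argument of `x * A`.
# `ds` is the gradient of the final output with respect to the product x * A.

# Scalar times matrix: every entry of the product depends on x.
dmul_left(x::Real, A::AbstractMatrix, ds::AbstractMatrix) = sum(ds .* A)

# Matrix times matrix: the usual matrix-product rule.
dmul_left(X::AbstractMatrix, A::AbstractMatrix, ds::AbstractMatrix) = ds * A'

A = [1.0 2.0; 3.0 4.0]
ds = ones(2, 2)                 # gradient of y = sum(x * A) w.r.t. x * A

dmul_left(0.0, A, ds)           # scalar x: equals sum(A)
dmul_left(zeros(2, 2), A, ds)   # matrix x: each row holds the row sums of A
```

Without a starting value for x, there is nothing to tell the two methods apart, which is the reason for passing `x=0.` in the calls above.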
About item 5: this relies on a hunch. We could require that the last evaluated statement gives the output value of interest, and that should be fine in most cases. But what if a first derivation pass produces the values of d(ex)/dx and d(ex)/dy, and each of them appears somewhere in the output expression? How can we make a second pass (to calculate d2(ex)/dx2, d2(ex)/dx.dy, etc.) without reshuffling statements so that they are exactly at the end? That would be clumsy, hence the idea of specifying the output variable to the derivation function. But it could be optional.
@mlubin: I'll get myself up to speed with forward mode to better understand why it's not even a good idea to start with an expression. My initial reasoning was that since it is symbolic processing, an expression was a sufficient starting point.
@scidom, it takes a lot of work. This is essentially the source transformation approach for AD. It's a very valuable approach, but it doesn't make any sense at all to do this if you're performing forward-mode AD.
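For readers unfamiliar with the term, source transformation means producing new code from the code being differentiated. A toy sketch on Julia expressions follows; `symdiff` is a hypothetical helper written for this illustration, it handles only numbers, symbols and binary `+` and `*`, and it is far simpler than what a real reverse-mode tool would need.

```julia
# Hypothetical helper: differentiate expression `ex` with respect to symbol `v`,
# returning a new (unsimplified) expression.
symdiff(ex::Number, v::Symbol) = 0
symdiff(ex::Symbol, v::Symbol) = ex == v ? 1 : 0
function symdiff(ex::Expr, v::Symbol)
    ex.head == :call && length(ex.args) == 3 || error("unsupported expression: $ex")
    op, a, b = ex.args
    da, db = symdiff(a, v), symdiff(b, v)
    if op == :+
        return :($da + $db)              # sum rule
    elseif op == :*
        return :($da * $b + $a * $db)    # product rule
    else
        error("no rule for operator $op")
    end
end

dex = symdiff(:(x * x + 3 * x), :x)   # an expression equivalent to 2x + 3
dfun = eval(:(x -> $dex))             # turn the generated code into a function
```

Even this toy version needs a rule per operator and a full walk of the expression tree, which gives a feel for the amount of work involved once control flow, assignments and arrays enter the picture.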
I'm not an expert in this area, but it seems AD has vastly different inputs and outputs depending on the context, so it may be a bit overly ambitious to write a single API for all AD algorithms before we have a solid interface for reverse mode and forward mode separately.
@mlubin, @fredo-dedup thank you for the helpful comments and illustrations. I think that you are right Miles, it is a premature goal to unify the API across the AD algorithms. For one thing, my assessment is that we lack the expertise in the field.
My proposed plan would be to have all the AD algorithms in this package with separate interfaces and in a functional state. As our understanding progresses, then we can think how to improve each AD mode implementation and even later we can unify interfaces. As for myself, I prefer to read Griewank's book, which I already started doing, before I dive into further AD coding (apart from the "playground" naive forward AD coding I have done, which simply offers a functional tool for now).
I agree that it seems more reasonable.
Would you agree then if I define a function called backwarddiff() for my existing code in the reverse folder, and then have the Autodiff package export it? I will also add some tests in the test folder.
Yes, Frederic, my view is that we'd rather make available the functionality we have already coded, so that sounds like a good idea to me, go ahead. Which name do you like more, reversediff() or backwarddiff()? I prefer the former, but if the latter sounds better to you, that's fine of course.
Since it's most commonly referred to as "reverse mode" I think that reversediff would be clearer.
@fredo-dedup, following the conversation from the relevant METADATA thread, and in order to agree with what the rest of the Julian developers wished for, I am going to rename this repository and register it as ForwardDiff, so you may want to put your work in a standalone ReverseDiffSource, as it will fit better there.
I will close this redundant issue now, since we decided on a different strategy (i.e. to split the autodiff approaches).
@scidom, @mlubin, @kmsquire, @powerdistribution : Here I go, for a (probably biased) proposal of a common interface for the forward/reverse mode symbolic derivation function :
Base.uncompress_ast can still be built around it for inputs as functions. Also, it doesn't require a function creation on the part of the caller (useful if the function is called recursively for higher order derivation).
method=:forward or method=:reverse parameter to indicate which algo to use (and not limited to these 2 when new methods are implemented).
Name of the function: differentiate()? diff()? derive()?