JuliaDiff / ForwardDiff.jl

Forward Mode Automatic Differentiation for Julia

move forward logic from Optim to here #10

Closed mlubin closed 10 years ago

mlubin commented 10 years ago

The dual-number logic at https://github.com/JuliaOpt/Optim.jl/blob/master/src/autodiff.jl and https://github.com/EconForge/NLsolve.jl/blob/master/src/autodiff.jl should be moved here. What sort of interface should we provide? The tricky parts are memory management and letting users avoid allocating new vectors on each evaluation.
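
A minimal sketch of the calling convention this suggests (only the shape of the interface; the names are placeholders and the gradient below is hard-coded rather than computed by AD): the user gets f together with an in-place g!(x, out) that writes into a vector the caller allocates once, so repeated evaluations allocate nothing.

    # Sketch of the in-place convention only; the gradient here is hard-coded
    # (for sum(abs2, x)) just to show the shape, it is not the dual-number code.
    function make_objective()
        f(x) = sum(abs2, x)
        g!(x, out) = (out .= 2 .* x; out)   # fills the caller's preallocated vector
        return f, g!
    end

    f, g! = make_objective()
    x = [1.0, 2.0, 3.0]
    grad = similar(x)   # allocated once by the caller, reused on every call
    g!(x, grad)         # no new vectors allocated per evaluation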

papamarkou commented 10 years ago

Sure, Miles, let's move them here. I will look at your code today and think about how we should organize the interface.

papamarkou commented 10 years ago

I thought about what you suggested, Miles. Here are my thoughts. I would suggest that we merge your code as soon as possible; you can then call it from Optim with using ForwardDiff. I would also suggest that we delay unifying our interfaces a bit longer, since the ForwardDiff package is still under heavy development. Let me share the background reading I am doing, since we are working together on forward-mode, dual-number-based autodiff:

  1. I found these two wonderful blog posts on computing exact n-th derivatives (easy to implement for a function f: R -> R), rather than just the first-order derivative (see the short sketch right after this list): http://jliszka.github.io/2013/10/24/exact-numeric-nth-derivatives.html http://duaeliststudios.com/automatic-differentiation-with-dual-numbers/
  2. I am reading this article, published by some IBM researchers, which explains how to exploit matrix algebra for efficient automatic differentiation (I am focusing on the forward-mode implementation, which is within the scope of the current package): http://link.springer.com/chapter/10.1007%2F978-3-642-30023-3_7 The paper seems great to me, as it offers efficient matrix-centric autodiff algorithms (we can use BLAS here). In any case, although I haven't added much code yet, I am quietly working on this, so let's give ourselves some more time so that I can implement the above methods. Then we will have a broader view of the available routines and can design the interface better :)
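
To make the dual-number idea from the first link concrete, here is a minimal self-contained sketch (a throwaway D type and derivative function written just for illustration, not anything from DualNumbers.jl or this package). The posts go on to exact n-th derivatives; this sketch stops at the exact first derivative of f: R -> R, obtained by seeding the dual part with 1.

    # a + b*eps with eps^2 = 0; the dual part carries the derivative exactly.
    struct D <: Number
        re::Float64   # value
        du::Float64   # derivative part
    end

    Base.:+(x::D, y::D) = D(x.re + y.re, x.du + y.du)
    Base.:*(x::D, y::D) = D(x.re * y.re, x.re * y.du + x.du * y.re)  # product rule
    Base.:+(x::D, c::Real) = D(x.re + c, x.du)
    Base.:+(c::Real, x::D) = x + c
    Base.:*(c::Real, x::D) = D(c * x.re, c * x.du)
    Base.:*(x::D, c::Real) = c * x
    Base.sin(x::D) = D(sin(x.re), cos(x.re) * x.du)                  # chain rule

    derivative(f, x::Real) = f(D(x, 1.0)).du   # seed du = 1, read the result's du

    h(x) = 3x * x + sin(x)
    derivative(h, 2.0)   # exactly 6*2 + cos(2), no truncation error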

I will tidy up the tests for the existing code (related to issue #9) over the weekend, and you can add your functions now without worrying too much about unifying the API at this stage. What do you think?

mlubin commented 10 years ago

I can see these three approaches coexisting in ForwardDiff:

  1. using the DualNumbers package directly
  2. higher-order derivatives (using one of the polynomial packages?)
  3. matrix-based forward AD

All three are useful in different contexts.

papamarkou commented 10 years ago

I agree with you, Miles. We can have all three coexisting in ForwardDiff. Do you have push access to ForwardDiff? If not, I can add you as a collaborator when I get back to my computer. Would it be a good plan to add your autodiff functions here with an optional keyword argument dtype::Symbol=:dual? The other values for the dtype option would be :matrix and :polynomial (and possibly :classic too, if we keep the existing old-fashioned approach as a fourth option for comparison purposes).
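
For concreteness, a front end along these lines could look roughly like the sketch below. Only the dtype keyword is the actual proposal; every backend function name here is hypothetical and just stands in for whatever the per-approach entry points end up being called.

    # Hypothetical dispatcher; dual_fad_gradient etc. are placeholder names.
    function autodiff(f, x0::Vector; dtype::Symbol = :dual)
        if dtype == :dual
            return dual_fad_gradient(f, x0)         # dual-number based (src/dual_fad)
        elseif dtype == :matrix
            return matrix_fad_gradient(f, x0)       # matrix-based forward AD
        elseif dtype == :polynomial
            return polynomial_fad_gradient(f, x0)   # higher-order / polynomial approach
        elseif dtype == :classic
            return classic_fad_gradient(f, x0)      # existing type-based implementation
        else
            error("unknown dtype: $dtype")
        end
    end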

papamarkou commented 10 years ago

Miles, I reorganized the file hierarchy based on our discussion here, allowing four forward AD approaches to coexist (the three you mentioned plus the existing one in the package, which is based on types written specifically for forward AD). Each approach goes in a separate folder in src. Your work on forward AD using dual numbers resides in src/dual_fad. I kept your function names as autodiff. If later on we see that there is overlap with the other approaches, we can revisit whether your function names need to change.

You will notice that your code is in two files: univariate_range.jl, which holds functions with a univariate range (f: R^n -> R) and is your code taken from Optim.jl/src/autodiff.jl, and multivariate_range.jl, which holds functions with a multivariate range (f: R^n -> R^m) and is your code taken from NLsolve.jl/src/autodiff.jl.

Your code has been copied intact apart from one change I had to make: the second autodiff function in each of these files returns f, g! and f!, g! respectively, instead of DifferentiableFunction(f, g!) and DifferentiableMultivariateFunction(f!, g!). The reason for this choice is to avoid making Optim and NLsolve package dependencies (which calling DifferentiableFunction and DifferentiableMultivariateFunction would require). It is a one-liner to add to Optim and NLsolve a wrapper around the output of autodiff.
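
For example, on the Optim side the wrapper could be roughly the following one-liner (the exact autodiff argument list is assumed here, not copied from the real code; NLsolve would do the analogous thing with DifferentiableMultivariateFunction and its f!, g! pair).

    # Assumed shape: ForwardDiff.autodiff returns the (f, g!) pair described above,
    # and Optim rebuilds its DifferentiableFunction from it.
    optim_autodiff(f, x0) = Optim.DifferentiableFunction(ForwardDiff.autodiff(f, x0)...)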

I will close this issue now, since it seems resolved, but if you want to make any changes please do not hesitate to do so.