mlpack / ensmallen

A header-only C++ library for numerical optimization --
http://ensmallen.org

feature request: automatic differentiation #307

Closed: emogenet closed this issue 3 years ago

emogenet commented 3 years ago

The main reason I can't use ensmallen right now is the lack of automatic differentiation for my objective function.

My function is C1, but actually implementing differentiation for it by hand is a giant pain, especially because it calls into relatively complex OpenCV resampling code.

The only quick workaround for me is to implement numerical gradient computation, which, for example, Ceres provides natively (along with other methods, such as automatic template-based differentiation).

I suspect I'm not the only one in this situation, and it would be really nice if ensmallen provided, at the very least, an automated way to numerically compute the gradient (and if it already does, great, but I couldn't find it).
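
For what it's worth, here is a minimal sketch of the kind of wrapper I have in mind, written against ensmallen's documented `Evaluate()`/`Gradient()` function interface. The class name `NumericalGradientFunction` and the step size `eps` are just illustrative choices, not anything that exists in the library:

```cpp
#include <armadillo>
#include <functional>
#include <utility>

// Wraps a black-box objective f(x) and adds a central-difference
// Gradient(), matching the Evaluate()/Gradient() interface that
// ensmallen's differentiable optimizers expect.
class NumericalGradientFunction
{
 public:
  NumericalGradientFunction(std::function<double(const arma::mat&)> f,
                            const double eps = 1e-6) :
      f(std::move(f)), eps(eps) { }

  double Evaluate(const arma::mat& x) { return f(x); }

  // g_i ~ (f(x + eps * e_i) - f(x - eps * e_i)) / (2 * eps).  This costs
  // two objective evaluations per parameter, so it only makes sense for
  // reasonably low-dimensional problems.
  void Gradient(const arma::mat& x, arma::mat& g)
  {
    g.set_size(x.n_rows, x.n_cols);
    arma::mat xTmp = x;
    for (arma::uword i = 0; i < x.n_elem; ++i)
    {
      const double orig = xTmp(i);
      xTmp(i) = orig + eps;
      const double fPlus = f(xTmp);
      xTmp(i) = orig - eps;
      const double fMinus = f(xTmp);
      xTmp(i) = orig;
      g(i) = (fPlus - fMinus) / (2.0 * eps);
    }
  }

 private:
  std::function<double(const arma::mat&)> f;
  double eps;
};
```

An instance of this could then go straight into any differentiable optimizer, e.g. `ens::L_BFGS opt; opt.Optimize(fn, coordinates);`.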

rcurtin commented 3 years ago

It would be really awesome to have automatic differentiation support, for exactly the reasons you mentioned! But, unfortunately, implementing that would be a huge amount of effort and scope creep for this project. If someone ever came along and wanted to implement an AD framework, that would be great, and I think we should include it (or perhaps package it as its own project), but as far as I know nobody is working on such a thing.

However, there might be an existing C++ autodiff package that can work with Armadillo matrices (and thus with ensmallen objective functions). Going off of this site, it seems like FunG and FunCy might work with Armadillo matrices? I don't see any exact examples, but maybe you could play with it and make it work, or open a GitHub issue there and see if the maintainer can provide an Armadillo example?

There may also be other AD libraries on autodiff.org that fit the bill---I am not sure. Personally I haven't [yet] needed AD for anything I've done with ensmallen.

Theoretically, if you have an objective function o(const arma::mat&) and an AD toolkit that can produce the Jacobian of o() with respect to the input matrix, then you should be able to package that into a class with the interface ensmallen requires, and use the AD-generated Jacobian as the Gradient() implementation.
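
To make the shape of that concrete, here is a rough sketch. The `ad::Gradient()` call below is a purely hypothetical placeholder for whatever entry point your chosen AD toolkit actually provides (it is declared but deliberately never defined here), so treat this as pseudocode with the right interface rather than something you can link:

```cpp
#include <armadillo>

// The actual objective; C1, but painful to differentiate by hand.
double o(const arma::mat& x);

namespace ad
{
  // Hypothetical placeholder for the AD toolkit's entry point: returns
  // the gradient (i.e. the Jacobian of a scalar-valued function) of f
  // at x.  A real toolkit would supply its own equivalent.
  arma::mat Gradient(double (*f)(const arma::mat&), const arma::mat& x);
}

// Exposes the Evaluate()/Gradient() interface that ensmallen's
// differentiable optimizers expect.
class AutoDiffObjective
{
 public:
  double Evaluate(const arma::mat& x) { return o(x); }

  void Gradient(const arma::mat& x, arma::mat& g)
  {
    // o() is scalar-valued, so its 1 x n Jacobian is just the gradient.
    g = ad::Gradient(o, x);
  }
};
```

A differentiable optimizer like ens::L_BFGS or ens::GradientDescent would then accept an AutoDiffObjective directly.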

I hope this helps! Sorry we don't have a plug-and-play solution here.

jonpsy commented 3 years ago

Great initiative! Perhaps we could take inspiration from the new Flashlight library from FB-AI? Since they're the progenitors of this approach, I'd expect them to nail it in C++. Further, I recall there were some attempts at this in the Shogun library as well. In that blog post, Gil said:

> Nowadays, functionalities such as auto differentiation are frequently a requirement when working with kernels, see for example GPFlow. However, in Shogun we still rely on manual implementations of gradient calculations, which is both error prone and time consuming.

I think the same is true for mlpack as well. That said, it means we would have to build an entire backend using computation graphs, treat everything as a tensor, and work our way up from there; a toy illustration of the idea is sketched below. Let me know what you guys think.
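
To show what I mean by a graph backend, here is a toy, scalar-only sketch, nowhere near a real tensor engine; all names are illustrative only. Every operation creates a node holding its forward value and a backward closure that pushes adjoints to its parents, and a reverse pass accumulates the gradients:

```cpp
#include <functional>
#include <memory>
#include <vector>

// One node in the computation graph: a value from the forward pass, an
// adjoint (grad) filled in by the reverse pass, and a closure that
// pushes this node's adjoint to its parents.
struct Node
{
  double value = 0.0;
  double grad = 0.0;
  std::vector<std::shared_ptr<Node>> parents;
  std::function<void(Node&)> backward;
};

using NodePtr = std::shared_ptr<Node>;

NodePtr Constant(double v)
{
  NodePtr n = std::make_shared<Node>();
  n->value = v;
  return n;
}

NodePtr Mul(NodePtr a, NodePtr b)
{
  NodePtr n = std::make_shared<Node>();
  n->value = a->value * b->value;
  n->parents = { a, b };
  n->backward = [a, b](Node& self)
  {
    a->grad += self.grad * b->value;  // d(a*b)/da = b
    b->grad += self.grad * a->value;  // d(a*b)/db = a
  };
  return n;
}

NodePtr Add(NodePtr a, NodePtr b)
{
  NodePtr n = std::make_shared<Node>();
  n->value = a->value + b->value;
  n->parents = { a, b };
  n->backward = [a, b](Node& self)
  {
    a->grad += self.grad;  // d(a+b)/da = 1
    b->grad += self.grad;  // d(a+b)/db = 1
  };
  return n;
}

// Naive reverse pass.  Caveat: this depth-first walk re-visits shared
// nodes; a real backend would topologically sort the graph and run each
// node's backward exactly once, after all its adjoints have arrived.
void Backward(NodePtr root)
{
  root->grad = 1.0;
  std::function<void(const NodePtr&)> visit = [&](const NodePtr& n)
  {
    if (n->backward)
      n->backward(*n);
    for (const NodePtr& p : n->parents)
      visit(p);
  };
  visit(root);
}

int main()
{
  // f(x, y) = x * y + x; expect df/dx = y + 1 = 5 and df/dy = x = 3.
  NodePtr x = Constant(3.0);
  NodePtr y = Constant(4.0);
  NodePtr f = Add(Mul(x, y), x);
  Backward(f);
  // Now x->grad == 5.0 and y->grad == 3.0.
}
```

A real backend would of course operate on tensors rather than scalars, but the node/adjoint structure is the core of the idea.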

mlpack-bot[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! :+1: