autodiff / autodiff

automatic differentiation made easier for C++
https://autodiff.github.io
MIT License
1.58k stars 168 forks source link

Forward mode Hessian #120

Open worc4021 opened 4 years ago

worc4021 commented 4 years ago

Hello,

I saw that there is a documented hessian driver for the reverse mode. And as I checked through the forward eigen header I found the counterpart. However, I cannot get it to work. My example is trivial:

#include <iostream>

using namespace std;

#include <Eigen/Core>
using namespace Eigen;

#include <autodiff/forward.hpp>
#include <autodiff/forward/eigen.hpp>

using namespace autodiff;

dual fun(const VectorXdual& u) {
    return sin(2*u[0]+exp(u[1]));
}

int main() {

    VectorXdual x(2);
    x[0] = 1.;
    x[1] = 1.;

    dual fx;
    VectorXd dfx;
    MatrixXd ddfx = hessian(fun, wrt(x), at(x), fx, dfx);

    return 0;
}

But this won't compile, I get the following issues with gcc 9:

gcc-9 -fpermissive HessianExample1.cpp -c -I/Users/Manuel/Documents/Development/AutoDiff/autodiff -I/usr/local/Cellar/eigen/3.3.7/include/eigen3 -std=c++17
In file included from HessianExample1.cpp:9:
/Users/Manuel/Documents/Development/AutoDiff/autodiff/autodiff/forward/eigen.hpp: In instantiation of 'Eigen::MatrixXd autodiff::forward::hessian(const Function&, Wrt&&, Args&&, Result&, Gradient&) [with Function = autodiff::forward::Dual<double, double>(const Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1>&); Wrt = std::tuple<Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1>&>; Args = std::tuple<Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1>&>; Result = autodiff::forward::Dual<double, double>; Gradient = Eigen::Matrix<double, -1, 1>; Eigen::MatrixXd = Eigen::Matrix<double, -1, -1>]':
HessianExample1.cpp:25:56:   required from here
/Users/Manuel/Documents/Development/AutoDiff/autodiff/autodiff/forward/eigen.hpp:318:52: error: request for member 'grad' in 'u.grad', which is of non-class type 'double'
  318 |                     H(jj, ii) = H(ii, jj) = u.grad.grad;
      |                                             ~~~~~~~^~~~
/Users/Manuel/Documents/Development/AutoDiff/autodiff/autodiff/forward/eigen.hpp:113:6: warning: 'void autodiff::forward::detail::forEach(Tuple&&, Callable&&) [with Tuple = std::tuple<Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1>&>&; Callable = autodiff::forward::hessian(const Function&, Wrt&&, Args&&, Result&, Gradient&) [with Function = autodiff::forward::Dual<double, double>(const Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1>&); Wrt = std::tuple<Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1>&>; Args = std::tuple<Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1>&>; Result = autodiff::forward::Dual<double, double>; Gradient = Eigen::Matrix<double, -1, 1>; Eigen::MatrixXd = Eigen::Matrix<double, -1, -1>]::<lambda(auto:6&&)>]', declared using local type 'autodiff::forward::hessian(const Function&, Wrt&&, Args&&, Result&, Gradient&) [with Function = autodiff::forward::Dual<double, double>(const Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1>&); Wrt = std::tuple<Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1>&>; Args = std::tuple<Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1>&>; Result = autodiff::forward::Dual<double, double>; Gradient = Eigen::Matrix<double, -1, 1>; Eigen::MatrixXd = Eigen::Matrix<double, -1, -1>]::<lambda(auto:6&&)>', is used but never defined [-fpermissive]
  113 | void forEach(Tuple&& tuple, Callable&& callable)
      |      ^~~~~~~
make: *** [HessianExample1.o] Error 1

And only one of them with the macos clang compiler:

c++ HessianExample1.cpp -c -I/Users/Manuel/Documents/Development/AutoDiff/autodiff -I/usr/local/Cellar/eigen/3.3.7/include/eigen3 -std=c++17
In file included from HessianExample1.cpp:9:
/Users/Manuel/Documents/Development/AutoDiff/autodiff/autodiff/forward/eigen.hpp:318:51: error: member reference base type 'double' is not a structure or union
                    H(jj, ii) = H(ii, jj) = u.grad.grad;
                                            ~~~~~~^~~~~
HessianExample1.cpp:25:21: note: in instantiation of function template specialization 'autodiff::forward::hessian<autodiff::forward::Dual<double, double> (const
      Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1> &), std::__1::tuple<Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1> &>,
      std::__1::tuple<Eigen::Matrix<autodiff::forward::Dual<double, double>, -1, 1, 0, -1, 1> &>, autodiff::forward::Dual<double, double>, Eigen::Matrix<double, -1, 1, 0,
      -1, 1> >' requested here
    MatrixXd ddfx = hessian(fun, wrt(x), at(x), fx, dfx);
                    ^
1 error generated.
make: *** [HessianExample1.o] Error 1

I've seen that the hessian implementation looks completely different and more elegant in PR69/100. So perhaps not worth addressing for now. But was this working at some points?

Thank you, Manuel

allanleal commented 4 years ago

Hi Manuel,

Yes, PR #69 has indeed a revised implementation for Hessian using forward mode.

The issue in your code is that you need to use at least a second-order dual number. You are using a first-order one, so only first-order derivatives are possible.

I'll come back soon with the required changes in your example.

allanleal commented 4 years ago

Below is an example:

// C++ includes
#include <iostream>
using namespace std;

// Eigen includes
#include <Eigen/Core>
using namespace Eigen;

// autodiff include
#include <autodiff/forward.hpp>
#include <autodiff/forward/eigen.hpp>
using namespace autodiff;

// Define a 2nd order dual type using HigherOrderDual<N> construct.
using dual2nd = HigherOrderDual<2>;

// The scalar function for which the Hessian is needed
dual2nd f(const VectorXdual2nd& x, const VectorXdual2nd& y)
{
    return sqrt(x.cwiseProduct(x).sum() + y.cwiseProduct(y).sum());
}

int main()
{
    VectorXdual2nd x(3); // the input vector x with 3 variables
    x << 1, 2, 3; // x = [1, 2, 3]

    VectorXdual2nd y(2); // the input vector y with 2 variables
    y << 4, 5; // x = [1, 2, 3]

    dual2nd u; // the output scalar u = f(x) evaluated together with Hessian below
    VectorXdual g; // gradient of f(x) evaluated together with Hessian below

    MatrixXd H = hessian(f, wrtpack(x, y), at(x, y), u, g); // evaluate the function value u and its Hessian matrix H

    cout << "u = " << u << endl; // print the evaluated output u
    cout << "Hessian = \n" << H << endl; // print the evaluated Hessian matrix H
    cout << "g =\n" << g << endl; // print the evaluated gradient vector g = [du/dx, du/dy]
}
allanleal commented 4 years ago

After PR #69 is merged, a few API breaks will happen (sorry!). But these should be minimal (such as the use of wrt instead of wrtpack).