auto-differentiation / xad

Comprehensive automatic differentiation in C++
https://auto-differentiation.github.io
GNU Affero General Public License v3.0

Eigen Support #90

Open auto-differentiation-dev opened 9 months ago

auto-differentiation-dev commented 9 months ago

XAD should support Eigen data types and related linear algebra operations, calculating values and derivatives efficiently.

Ideally, simply using an XAD type within Eigen vectors or matrices should work out of the box and be efficient. Given that both Eigen and XAD are using expression templates, it may require some traits and template specialisations to make this work seamlessly - and efficiently.
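
For reference, Eigen's documented route for custom scalar types is a NumTraits specialisation (plus ScalarBinaryOpTraits where mixed scalar types appear in one expression). Below is a rough sketch of what that might look like for the adjoint active type (assumed to be xad::AReal<double>); the cost constants are placeholder assumptions, and this alone does not resolve the expression-template interaction mentioned above.

#include <XAD/XAD.hpp>
#include <Eigen/Core>

namespace Eigen {
// Sketch only: the usual custom-scalar NumTraits pattern applied to
// xad::AReal<double> (the active type of xad::adj<double>).
// AddCost/MulCost values are illustrative assumptions, not measurements.
template <>
struct NumTraits<xad::AReal<double>> : NumTraits<double> {
    typedef xad::AReal<double> Real;
    typedef xad::AReal<double> NonInteger;
    typedef xad::AReal<double> Nested;
    enum {
        IsComplex = 0,
        IsInteger = 0,
        IsSigned = 1,
        RequireInitialization = 1,  // active types need their constructors run
        ReadCost = 1,
        AddCost = 3,
        MulCost = 3
    };
};
}  // namespace Eigen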

aia29 commented 9 months ago

For now, I find a combination of std::vector<AD> and Eigen::Map to be the most efficient way of coupling XAD and Eigen. For full Eigen support, functions like registerInputs, registerOutput, value and derivative would have to accept Eigen matrix and mapped-matrix types. Below is an example of using XAD + Eigen that I came up with:

#include <XAD/XAD.hpp>
#include <Eigen/Dense>
#include <iostream>

template <typename T>
T norm(const std::vector<T> &x) {
    T r = 0.0;
    for (const T &v : x) {
        r = r + v * v;
    }
    r = sqrt(r);
    return r;
}

template <typename T>
T normEigen(const std::vector<T> &x) {
    using namespace Eigen;
    typedef Matrix<T, 1, Dynamic> VectorType;
    typedef Map<const VectorType> MapConstVectorType;

    MapConstVectorType xmap(x.data(), x.size());

    T r = xmap.norm();
    return r;
}

int main() {
    // types for first-order adjoints in double precision
    using mode = xad::adj<double>;
    using Adouble = mode::active_type;
    using Tape = mode::tape_type;

    // variables 
    std::vector<Adouble> x = {1.0, 1.5, 1.3, 1.2};
    Adouble y;

    // start taping
    Tape tape;
    tape.registerInputs(x);

    // Manual norm
    tape.newRecording();

    y = norm(x);

    tape.registerOutput(y);
    derivative(y) = 1.0;     // seed output adjoint
    tape.computeAdjoints();  // roll-back tape

    std::cout << "Using manual: \n"
              << "y      = " << value(y) << "\n"
              << "dy/dx0 = " << derivative(x[0]) << "\n"
              << "dy/dx1 = " << derivative(x[1]) << "\n"
              << "dy/dx2 = " << derivative(x[2]) << "\n"
              << "dy/dx3 = " << derivative(x[3]) << "\n\n";

    // Eigen norm
    tape.newRecording();

    y = normEigen(x);

    tape.registerOutput(y);
    derivative(y) = 1.0;     // seed output adjoint
    tape.computeAdjoints();  // roll-back tape

    std::cout << "Using Eigen: \n"
              << "y      = " << value(y) << "\n"
              << "dy/dx0 = " << derivative(x[0]) << "\n"
              << "dy/dx1 = " << derivative(x[1]) << "\n"
              << "dy/dx2 = " << derivative(x[2]) << "\n"
              << "dy/dx3 = " << derivative(x[3]) << "\n";
}
auto-differentiation-dev commented 9 months ago

That's interesting, thanks for sharing this.

Note that registerInputs and registerOutputs have overloads taking an iterator pair, which should work well enough with Eigen vector types using their STL begin and end members. For Eigen matrices, using matrix.reshaped().begin() and matrix.reshaped().end() should also work fine, since reshaped does not copy the elements in general. Based on this, it should be easy enough to add simple overloads for the input/output registration functions to the Tape.
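
To make that concrete, a free helper along the following lines could forward an Eigen matrix to the existing iterator-pair overload (purely a sketch, not part of XAD; the name registerInputsEigen is made up). Note that the reshaped view is bound to a named variable so its iterators stay valid while the tape walks them:

#include <XAD/XAD.hpp>
#include <Eigen/Core>

// Hypothetical helper: register every element of an Eigen matrix as an input
// via the existing iterator-pair overload of the tape.
template <typename TapeT, typename Derived>
void registerInputsEigen(TapeT &tape, Eigen::MatrixBase<Derived> &m) {
    auto view = m.derived().reshaped();  // a view, no copy of the elements
    tape.registerInputs(view.begin(), view.end());
}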

For the value and derivative functions, that is going to be more difficult, as they need to return full matrices that hold references to the original data (not copies). Using a Map with strides, it should be easy enough for the values. For derivatives in adjoint mode it is trickier, and it might need other tricks / proxies to return an Eigen matrix of references into the XAD derivatives vector.
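
For comparison, a copy-returning extraction of the values is already easy with unaryExpr; the sketch below illustrates it (not a proposed API, and assuming the free function xad::value). The hard part described above is the reference-returning variant, so that the derivative of each element can be read and written through an Eigen matrix.

#include <XAD/XAD.hpp>
#include <Eigen/Core>

// Sketch: copy the values of a matrix of XAD active types into a plain
// Eigen matrix of doubles. A reference-returning version would need the
// Map/stride or proxy machinery discussed above.
template <typename Derived>
Eigen::Matrix<double, Derived::RowsAtCompileTime, Derived::ColsAtCompileTime>
valuesOf(const Eigen::MatrixBase<Derived> &m) {
    return m.unaryExpr(
        [](const typename Derived::Scalar &v) -> double { return xad::value(v); });
}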

Did you try more complex matrix/vector operations than norm on the mapped type in your experiments, to make sure the expression templates create no issues?

aia29 commented 9 months ago

As you said, registerInputs and registerOutputs work "out of the box", while value and derivative required a workaround:

#include <XAD/XAD.hpp>
#include <Eigen/Dense>
#include <iostream>

int main() {
    // types for first-order adjoints in double precision
    using mode = xad::adj<double>;
    using Adouble = mode::active_type;
    using Tape = mode::tape_type;

    // variables
    typedef Eigen::Matrix<Adouble, Eigen::Dynamic, Eigen::Dynamic> MatrixType;
    MatrixType x(2, 2);
    x(0, 0) = 1.0;
    x(0, 1) = 1.5;
    x(1, 0) = 1.3;
    x(1, 1) = 1.2;
    MatrixType y;

    // start taping
    Tape tape;
    auto xr = x.reshaped();  // bind the reshaped view so its iterators stay valid
    tape.registerInputs(xr.begin(), xr.end());

    // Matrix inverse
    tape.newRecording();

    y = x.inverse();

    for(Adouble &yy : y.reshaped()) {
        tape.registerOutput(yy);
        derivative(yy) = 1.0;     // seed output adjoint
    }
    tape.computeAdjoints();  // roll-back tape

    for(Adouble &xx : x.reshaped()) {
        std::cout << "x      = " << value(xx) << "\n";
    }

    for(Adouble &yy : y.reshaped()) {
        std::cout << "y      = " << value(yy) << "\n";
    }

    std::cout << "Derivative of y = inv(x), summed over all outputs: \n";
    for(Adouble &xx : x.reshaped()) {
        std::cout << "dy/dx  = " << derivative(xx) << "\n";
    }
}

I was able to do norms, matmuls and inverse operations without much effort. I have not checked the computational efficiency of computing the derivative of the inverse operation, but I did a numerical check against PyTorch and got the same results, which is a good sign.
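
For completeness, here is a minimal sketch of the matmul case, following the same registration and seeding pattern as the inverse example above (a scalar sum is used as the output so that a single adjoint seed suffices):

#include <XAD/XAD.hpp>
#include <Eigen/Dense>
#include <iostream>

int main() {
    // types for first-order adjoints in double precision
    using mode = xad::adj<double>;
    using Adouble = mode::active_type;
    using Tape = mode::tape_type;
    typedef Eigen::Matrix<Adouble, Eigen::Dynamic, Eigen::Dynamic> MatrixType;

    MatrixType a(2, 2), b(2, 2);
    a << 1.0, 2.0, 3.0, 4.0;
    b << 0.5, 1.5, 2.5, 3.5;

    // start taping
    Tape tape;
    auto ar = a.reshaped();  // keep the view alive while registering
    tape.registerInputs(ar.begin(), ar.end());

    tape.newRecording();

    MatrixType c = a * b;    // Eigen matrix product on active types
    Adouble s = c.sum();     // scalar output

    tape.registerOutput(s);
    derivative(s) = 1.0;     // seed output adjoint
    tape.computeAdjoints();  // roll-back tape

    std::cout << "sum(a*b)              = " << value(s) << "\n"
              << "d sum(a*b) / d a(0,0) = " << derivative(a(0, 0)) << "\n";
}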