joaoleal / CppADCodeGen

Source Code Generation for Automatic Differentiation using Operator Overloading
Other
171 stars 38 forks source link

Higher order derivatives? #6

Closed aespielberg closed 6 years ago

aespielberg commented 6 years ago

Is there any support in CppADCodeGen for third and higher order derivatives?

joaoleal commented 6 years ago

You can generate higher order derivatives. Unfortunately the drivers to help with the compilation and execution of the generated source-code do not support higher order derivatives yet.

Here is an example of how you can generate the source-code for a third-order derivative.

/* --------------------------------------------------------------------------
 *  CppADCodeGen: C++ Algorithmic Differentiation with Source Code Generation:
 *    Copyright (C) 2018 Joao Leal
 *
 *  CppADCodeGen is distributed under multiple licenses:
 *
 *   - Eclipse Public License Version 1.0 (EPL1), and
 *   - GNU General Public License Version 3 (GPL3).
 *
 *  EPL1 terms and conditions can be found in the file "epl-v10.txt", while
 *  terms and conditions for the GPL3 can be found in the file "gpl3.txt".
 * ----------------------------------------------------------------------------
 * Author: Joao Leal
 */
#include <iosfwd>
#include <vector>
#include <cppad/cg.hpp>

using namespace CppAD;
using namespace CppAD::cg;

int main(void) {
    // use a special object for source code generation
    using CGD = CG<double>;
    using ADCG = AD<CGD>;

    /***************************************************************************
     *                               the model
     **************************************************************************/

    // independent variable vector
    size_t n = 2;
    std::vector<ADCG> x(n);
    x[0] = 2.;
    x[1] = 3.;
    Independent(x);

    // dependent variable vector
    size_t m = 1;
    std::vector<ADCG> y(m);

    // the model
    y[0] = x[0] * x[0] * x[0] * x[1] * x[1];

    ADFun<CGD> fun(x, y); // the model tape

    /***************************************************************************
     *                        Generate the C source code
     **************************************************************************/

    /**
     * start the special steps to create the source code generation for a
     * third order derivative
     */
    CodeHandler<double> handler;

    std::vector<CGD> x0(n);
    handler.makeVariables(x0);

    // zero order forward mode using notation in forward_zero
    // use the template parameter Vector for the vector type
    std::vector<CGD> y0 = fun.Forward(0, x0);

    // first order forward mode using notation in forward_one
    // X(t)           = x0 + x1 * t
    // Y(t) = F[X(t)] = y0 + y1 * t + o(t)
    std::vector<CGD> x1 {1.0, 0.0};
    std::vector<CGD> y1 = fun.Forward(1, x1); // partial F w.r.t. x_0

    // second order forward mode using notation in forward_order
    // X(t) =           x0 + x1 * t + x2 * t^2
    // Y(t) = F[X(t)] = y0 + y1 * t + y2 * t^2 + o(t^3)
    std::vector<CGD> x2 {0.0, 0.0};
    std::vector<CGD> y2 = fun.Forward(2, x2);

    // third order forward mode using notation in forward_order
    // X(t) =           x0 + x1 * t + x2 * t^2 + x3 * t^3
    // Y(t) = F[X(t)] = y0 + y1 * t + y2 * t^2 + y3 * t^3 + o(t^4)
    std::vector<CGD> x3 {0.0, 0.0};
    std::vector<CGD> y3 = fun.Forward(3, x3);

    std::vector<CGD> dy3dx0 {(3 * 2) * y3[0]};

    LanguageC<double> langC("double");
    LangCDefaultVariableNameGenerator<double> nameGen;
    langC.setGenerateFunction("third_order");

    std::ostringstream code;
    handler.generateCode(code, langC, dy3dx0, nameGen);
    std::cout << code.str();

    /***************************************************************************
     *                        Compile the source code
     **************************************************************************/
    GccCompiler<double> compiler;
    compiler.setSourcesFolder("sources_thirdorder");
    compiler.setSaveToDiskFirst(true);

    try {
        const std::map<std::string, std::string> modelSources{{"third_order.c", code.str()}};

        compiler.compileSources(modelSources, true);
        compiler.buildDynamic("thirdorder.so");

    } catch (...) {
        compiler.cleanup();
        throw;
    }
}

This will generate a dynamic library with a function for the following source-code (some source omitted for clarity):

void third_order(double const *const * in,
                 double*const * out,
                 struct LangCAtomicFun atomicFun) {
   //independent variables
   const double* x = in[0];

   //dependent variables
   double* y = out[0];

   // auxiliary variables

   y[0] = 6. * x[1] * x[1];
}

Note that using this strategy for source-code generation will not scale well for complex models since there is no coloring being done. Coloring would help to reduce the number of times you need to call the forward/reverse mode function(s).

aespielberg commented 6 years ago

Thank you for this example! Can you briefly explain what is meant by coloring here?

joaoleal commented 6 years ago

There are several papers/books that will explain it better than me. For instance, see: https://www.cs.purdue.edu/homes/agebreme/publications//sirev.pdf

What you really need to know is that coloring algorithms can be employed to reduce the number of times you have to call fun.Forward(o, xo) or fun.Reverse(o, vo) to compute derivatives by exploiting the sparsity of the system. These functions are used (indirectly) to determine the expressions for the derivatives in CppADCodeGen.

However, the number of times fun.Forward(o, xo) or fun.Reverse(o, vo) are called does not impact the generated source-code or the runtimes of the compiled library. It only impacts the time required to generate that source.

joaoleal commented 6 years ago

There is a workaround to the problem of not being able to use the CppADCodeGen drivers: compute the expressions for the first order derivatives (Jacobian) and use that as the model.

Here is an example:

/* --------------------------------------------------------------------------
 *  CppADCodeGen: C++ Algorithmic Differentiation with Source Code Generation:
 *    Copyright (C) 2018 Joao Leal
 *
 *  CppADCodeGen is distributed under multiple licenses:
 *
 *   - Eclipse Public License Version 1.0 (EPL1), and
 *   - GNU General Public License Version 3 (GPL3).
 *
 *  EPL1 terms and conditions can be found in the file "epl-v10.txt", while
 *  terms and conditions for the GPL3 can be found in the file "gpl3.txt".
 * ----------------------------------------------------------------------------
 * Author: Joao Leal
 */
#include <vector>
#include <cppad/cg.hpp>

using namespace CppAD;
using namespace CppAD::cg;

int main(void) {
    // classes for source code generation
    using CGD = CG<double>;
    using ADCG = AD<CGD>;

    /***************************************************************************
     *                             The original model
     **************************************************************************/

    // independent variable vector
    size_t n = 2;
    std::vector<ADCG> x(n);
    x[0] = 2.;
    x[1] = 3.;
    Independent(x);

    // dependent variable vector
    size_t m = 1;
    std::vector<ADCG> y(m);

    // the model
    y[0] = x[0] * x[0] * x[0] * x[1] * x[1];

    ADFun<CGD> fun(x, y); // the model tape

    /***************************************************************************
     *                    Create a new model for the Jacobian
     **************************************************************************/

    CodeHandler<double> handler;

    // independent variable vector
    std::vector<CGD> x0(n);
    handler.makeVariables(x0);

    // compute the expressions for the Jacobian using the original model
    std::vector<CGD> jac = fun.SparseJacobian(x0);

    // the independent variables for the new model
    std::vector<ADCG> xNew(n);
    CppAD::Independent(xNew);

    // change the type from CGD (jac) to ADCG (jacNew)
    Evaluator<double, CGD, ADCG> evaluator(handler);
    std::vector<ADCG> jacNew = evaluator.evaluate(xNew, jac);

    ADFun<CGD> funJac(xNew, jacNew); // the new model tape

    /***************************************************************************
     *                       Create the dynamic library
     *                  (generates and compiles source code)
     **************************************************************************/
    // generates source code
    ModelCSourceGen<double> cgen(funJac, "model_jac");
    cgen.setCreateJacobian(true);
    cgen.setCreateHessian(true);
    ModelLibraryCSourceGen<double> libcgen(cgen);

    // compile source code
    DynamicModelLibraryProcessor<double> p(libcgen);

    GccCompiler<double> compiler;
    compiler.setSaveToDiskFirst(true); // << not really required
    auto dynamicLib = p.createDynamicLibrary(compiler);

    /***************************************************************************
     *                       Use the dynamic library
     **************************************************************************/

    auto modelJac = dynamicLib->model("model_jac");

    std::vector<double> xv {2.5, 3.5};
    std::vector<double> vv {1, 0};

    //second order derivatives of the Jacobian
    std::vector<double> third_order = modelJac->Hessian(xv, vv);

    // print out the result
    std::cout << third_order[0] << " " << third_order[1] << std::endl;
}

The output is:

73.5 105
aespielberg commented 6 years ago

Thank you!!