Closed aespielberg closed 6 years ago
You can generate higher order derivatives. Unfortunately the drivers to help with the compilation and execution of the generated source-code do not support higher order derivatives yet.
Here is an example of how you can generate the source-code for a third-order derivative.
/* --------------------------------------------------------------------------
* CppADCodeGen: C++ Algorithmic Differentiation with Source Code Generation:
* Copyright (C) 2018 Joao Leal
*
* CppADCodeGen is distributed under multiple licenses:
*
* - Eclipse Public License Version 1.0 (EPL1), and
* - GNU General Public License Version 3 (GPL3).
*
* EPL1 terms and conditions can be found in the file "epl-v10.txt", while
* terms and conditions for the GPL3 can be found in the file "gpl3.txt".
* ----------------------------------------------------------------------------
* Author: Joao Leal
*/
#include <iosfwd>
#include <vector>
#include <cppad/cg.hpp>
using namespace CppAD;
using namespace CppAD::cg;
int main(void) {
// use a special object for source code generation
using CGD = CG<double>;
using ADCG = AD<CGD>;
/***************************************************************************
* the model
**************************************************************************/
// independent variable vector
size_t n = 2;
std::vector<ADCG> x(n);
x[0] = 2.;
x[1] = 3.;
Independent(x);
// dependent variable vector
size_t m = 1;
std::vector<ADCG> y(m);
// the model
y[0] = x[0] * x[0] * x[0] * x[1] * x[1];
ADFun<CGD> fun(x, y); // the model tape
/***************************************************************************
* Generate the C source code
**************************************************************************/
/**
* start the special steps to create the source code generation for a
* third order derivative
*/
CodeHandler<double> handler;
std::vector<CGD> x0(n);
handler.makeVariables(x0);
// zero order forward mode using notation in forward_zero
// use the template parameter Vector for the vector type
std::vector<CGD> y0 = fun.Forward(0, x0);
// first order forward mode using notation in forward_one
// X(t) = x0 + x1 * t
// Y(t) = F[X(t)] = y0 + y1 * t + o(t)
std::vector<CGD> x1 {1.0, 0.0};
std::vector<CGD> y1 = fun.Forward(1, x1); // partial F w.r.t. x_0
// second order forward mode using notation in forward_order
// X(t) = x0 + x1 * t + x2 * t^2
// Y(t) = F[X(t)] = y0 + y1 * t + y2 * t^2 + o(t^3)
std::vector<CGD> x2 {0.0, 0.0};
std::vector<CGD> y2 = fun.Forward(2, x2);
// third order forward mode using notation in forward_order
// X(t) = x0 + x1 * t + x2 * t^2 + x3 * t^3
// Y(t) = F[X(t)] = y0 + y1 * t + y2 * t^2 + y3 * t^3 + o(t^4)
std::vector<CGD> x3 {0.0, 0.0};
std::vector<CGD> y3 = fun.Forward(3, x3);
std::vector<CGD> dy3dx0 {(3 * 2) * y3[0]};
LanguageC<double> langC("double");
LangCDefaultVariableNameGenerator<double> nameGen;
langC.setGenerateFunction("third_order");
std::ostringstream code;
handler.generateCode(code, langC, dy3dx0, nameGen);
std::cout << code.str();
/***************************************************************************
* Compile the source code
**************************************************************************/
GccCompiler<double> compiler;
compiler.setSourcesFolder("sources_thirdorder");
compiler.setSaveToDiskFirst(true);
try {
const std::map<std::string, std::string> modelSources{{"third_order.c", code.str()}};
compiler.compileSources(modelSources, true);
compiler.buildDynamic("thirdorder.so");
} catch (...) {
compiler.cleanup();
throw;
}
}
This will generate a dynamic library with a function for the following source-code (some source omitted for clarity):
void third_order(double const *const * in,
double*const * out,
struct LangCAtomicFun atomicFun) {
//independent variables
const double* x = in[0];
//dependent variables
double* y = out[0];
// auxiliary variables
y[0] = 6. * x[1] * x[1];
}
Note that using this strategy for source-code generation will not scale well for complex models since there is no coloring being done. Coloring would help to reduce the number of times you need to call the forward/reverse mode function(s).
Thank you for this example! Can you briefly explain what is meant by coloring here?
There are several papers/books that will explain it better than me. For instance, see: https://www.cs.purdue.edu/homes/agebreme/publications//sirev.pdf
What you really need to know is that coloring algorithms can be employed to reduce the number of times you have to call fun.Forward(o, xo)
or fun.Reverse(o, vo)
to compute derivatives by exploiting the sparsity of the system.
These functions are used (indirectly) to determine the expressions for the derivatives in CppADCodeGen.
However, the number of times fun.Forward(o, xo)
or fun.Reverse(o, vo)
are called does not impact the generated source-code or the runtimes of the compiled library.
It only impacts the time required to generate that source.
There is a workaround to the problem of not being able to use the CppADCodeGen drivers: compute the expressions for the first order derivatives (Jacobian) and use that as the model.
Here is an example:
/* --------------------------------------------------------------------------
* CppADCodeGen: C++ Algorithmic Differentiation with Source Code Generation:
* Copyright (C) 2018 Joao Leal
*
* CppADCodeGen is distributed under multiple licenses:
*
* - Eclipse Public License Version 1.0 (EPL1), and
* - GNU General Public License Version 3 (GPL3).
*
* EPL1 terms and conditions can be found in the file "epl-v10.txt", while
* terms and conditions for the GPL3 can be found in the file "gpl3.txt".
* ----------------------------------------------------------------------------
* Author: Joao Leal
*/
#include <vector>
#include <cppad/cg.hpp>
using namespace CppAD;
using namespace CppAD::cg;
int main(void) {
// classes for source code generation
using CGD = CG<double>;
using ADCG = AD<CGD>;
/***************************************************************************
* The original model
**************************************************************************/
// independent variable vector
size_t n = 2;
std::vector<ADCG> x(n);
x[0] = 2.;
x[1] = 3.;
Independent(x);
// dependent variable vector
size_t m = 1;
std::vector<ADCG> y(m);
// the model
y[0] = x[0] * x[0] * x[0] * x[1] * x[1];
ADFun<CGD> fun(x, y); // the model tape
/***************************************************************************
* Create a new model for the Jacobian
**************************************************************************/
CodeHandler<double> handler;
// independent variable vector
std::vector<CGD> x0(n);
handler.makeVariables(x0);
// compute the expressions for the Jacobian using the original model
std::vector<CGD> jac = fun.SparseJacobian(x0);
// the independent variables for the new model
std::vector<ADCG> xNew(n);
CppAD::Independent(xNew);
// change the type from CGD (jac) to ADCG (jacNew)
Evaluator<double, CGD, ADCG> evaluator(handler);
std::vector<ADCG> jacNew = evaluator.evaluate(xNew, jac);
ADFun<CGD> funJac(xNew, jacNew); // the new model tape
/***************************************************************************
* Create the dynamic library
* (generates and compiles source code)
**************************************************************************/
// generates source code
ModelCSourceGen<double> cgen(funJac, "model_jac");
cgen.setCreateJacobian(true);
cgen.setCreateHessian(true);
ModelLibraryCSourceGen<double> libcgen(cgen);
// compile source code
DynamicModelLibraryProcessor<double> p(libcgen);
GccCompiler<double> compiler;
compiler.setSaveToDiskFirst(true); // << not really required
auto dynamicLib = p.createDynamicLibrary(compiler);
/***************************************************************************
* Use the dynamic library
**************************************************************************/
auto modelJac = dynamicLib->model("model_jac");
std::vector<double> xv {2.5, 3.5};
std::vector<double> vv {1, 0};
//second order derivatives of the Jacobian
std::vector<double> third_order = modelJac->Hessian(xv, vv);
// print out the result
std::cout << third_order[0] << " " << third_order[1] << std::endl;
}
The output is:
73.5 105
Thank you!!
Is there any support in CppADCodeGen for third and higher order derivatives?