Compilation speed for large derivatives

Hi João,

first of all, thank you for making such a great project open source! I am actively using CppADCodeGen in my research projects and it performs great!

I am mostly working with Rigid Body Dynamics, which are (fairly) large differential equations. This results in generated code for the derivatives that is easily 45k lines long. As suggested, I am using Clang for compiling them and it makes it bearable. However, I have seen some code in CppADCodeGen that looks like you can break down the derivative code into smaller chunks, i.e. several functions?

So I was wondering whether you are dealing with equally large or larger derivatives? If so, do you have any experience in further reducing compilation time? If you have any leads, I would highly appreciate it! In return, I could contribute a little example to CppADCodeGen.

Thanks again, Michael

Hello Michael,

Glad to hear that you are using CppADCodeGen.

The source code is automatically broken down into multiple functions so that they are smaller and require less memory/time to compile. The maximum number of assignments per function/file can be customized using ModelCSourceGen::setMaxAssignmentsPerFunc(). This feature has more impact with GCC than with Clang. Unfortunately, this will not get you very far in reducing the compilation time.

By setting

        ModelCSourceGen<double> cSourceInner(fun, "myName");

        // better to only choose one:
        cSourceInner.setCreateForwardOne(true); 
        cSourceInner.setCreateReverseOne(true);

        // only if you need Hessians
        cSourceInner.setCreateReverseTwo(true);

the Jacobian and Hessian evaluations are also broken into multiple functions, with a small runtime performance penalty. This is sometimes needed if you use your model as an atomic function inside other models.

You can try to first use a model compiled with a lower optimization level:

  ClangCompiler<double> clang;
  clang.addCompileFlag("-O0");

while another version is compiled in another thread/process with more optimizations.

Alternatively, you can also use CppAD while the CppADCodeGen model is compiling. You can either define your model using a template type for the active variables, so that it can be traced by both CppAD and CppADCodeGen, or you can use an Evaluator. Evaluators allow you to convert a model defined with one active variable type into another type. See test/cppad/cg/evaluator/CppADCGEvaluatorTest.hpp.
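As a rough illustration of the template option, the sketch below writes a model once as a function template over the scalar type. The function `model` and the toy type `toy::Dual` are made up for this example: `toy::Dual` only stands in for an active type such as a CppAD or CppADCodeGen scalar and is not part of either library.

```cpp
#include <cmath>

namespace toy {

// Hypothetical stand-in for an active AD type (NOT part of CppAD):
// it carries a value and one directional derivative.
struct Dual {
    double val;
    double der;
};

inline Dual operator+(const Dual& a, const Dual& b) {
    return {a.val + b.val, a.der + b.der};
}

inline Dual operator*(const Dual& a, const Dual& b) {
    // product rule
    return {a.val * b.val, a.der * b.val + a.val * b.der};
}

inline Dual sin(const Dual& a) {
    // chain rule
    return {std::sin(a.val), std::cos(a.val) * a.der};
}

} // namespace toy

// The model is written once and is generic in the scalar type, so the
// same source can be evaluated with double or traced with an active type.
template <class Scalar>
Scalar model(const Scalar& x, const Scalar& y) {
    using std::sin; // argument-dependent lookup selects toy::sin for toy::Dual
    return x * y + sin(x);
}
```

Calling `model(2.0, 3.0)` evaluates with plain doubles, while calling it with two `toy::Dual` arguments propagates a derivative through the same code path. With the real libraries, the same template would be instantiated with their active types instead.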

In case you have repeated equation patterns, i.e., equations with the same structure of mathematical operations but using different independent variables, you can try to provide this information and CppADCodeGen will check whether those equations do indeed have the same structure. If equation patterns are found, CppADCodeGen produces less code, which can take a lot less time to compile. Try using

ModelCSourceGen::setRelatedDependents(relatedDepCandidates);

In some situations, the equation pattern checking can also take some time, so there is a tradeoff. See example/patterns.cpp.
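A minimal sketch of how such candidate groups might be assembled, assuming the model stacks numSteps copies of the same eqPerStep equations (for example, one block per time step). The helper name makeRelatedDepCandidates and the block layout are assumptions made for illustration; each set lists the indices of dependent variables that are expected to share one equation structure.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: dependents are assumed to be ordered as
// [step 0: eq 0..k-1, step 1: eq 0..k-1, ...] with k = eqPerStep.
// Group j collects the j-th equation of every step.
std::vector<std::set<std::size_t>> makeRelatedDepCandidates(std::size_t numSteps,
                                                            std::size_t eqPerStep) {
    std::vector<std::set<std::size_t>> candidates(eqPerStep);
    for (std::size_t step = 0; step < numSteps; ++step) {
        for (std::size_t j = 0; j < eqPerStep; ++j) {
            candidates[j].insert(step * eqPerStep + j);
        }
    }
    return candidates;
}
```

The returned vector would then play the role of relatedDepCandidates in the call above, assuming the method accepts a vector of index sets. CppADCodeGen still verifies the structure itself, so wrong candidates should only cost detection time.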

Alternatively, if you have smaller models that are used multiple times inside a larger one, you can create several models. Create the smaller models as usual and then use them as atomic functions inside the larger one. I currently don't have an example with this feature but you can take a look at test/cppad/cg/models/dynamic_atomic_cstr.cpp and test/cppad/cg/CppADCGDynamicAtomicTest.hpp. Of particular interest should be the method CppADCGDynamicAtomicTest::prepareAtomicLibModelBridge().

You can combine ModelCSourceGen::setRelatedDependents() with atomics.

If you have some variables which are actually parameters (their value can change but no derivative information is required), you can define custom sparsity patterns that do not compute information for these variables. You can use ModelCSourceGen::setCustomSparseJacobianElements() and ModelCSourceGen::setCustomSparseHessianElements().
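As a sketch of the idea, the helper below builds the row and column indices of the Jacobian entries to keep, skipping every column that belongs to a parameter. The names SparseElements and jacobianWithoutParameters are hypothetical, and the Jacobian is assumed to be otherwise dense; a real model would normally intersect this with its actual sparsity pattern.

```cpp
#include <cstddef>
#include <vector>

// Row/column indices of the Jacobian entries that should be generated.
struct SparseElements {
    std::vector<std::size_t> rows;
    std::vector<std::size_t> cols;
};

// Hypothetical helper: keeps every entry (i, j) of a Jacobian with m rows
// except those whose column j is flagged as a parameter.
// Assumes a dense pattern for the remaining columns.
SparseElements jacobianWithoutParameters(std::size_t m,
                                         const std::vector<bool>& isParameter) {
    SparseElements e;
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < isParameter.size(); ++j) {
            if (!isParameter[j]) {
                e.rows.push_back(i);
                e.cols.push_back(j);
            }
        }
    }
    return e;
}
```

The two index vectors would then be passed to ModelCSourceGen::setCustomSparseJacobianElements(), assuming the overload that takes separate row and column vectors. The same approach applies to the Hessian elements.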

I hope this helps! If you have any suggestions/ideas/questions feel free to contact me.

Best regards, João

Hi João,

thank you for the fast and detailed reply!

So far, I am mostly using a three-step approach (generate derivative code -> compile -> run) rather than JIT compilation.

Some replies to your suggestions:

Since I do not necessarily need to generate derivatives at runtime, using the Evaluator is something I am leaving for the future.
So far, I was always generating the code using the CodeHandler with LanguageC. As far as I can see, there is no setMaxAssignmentsPerFunc() in those classes. I will try to switch to ModelCSourceGen with SaveFilesModelLibraryProcessor to make use of setMaxAssignmentsPerFunc().
I was already using the sparseJacobian functionality with the CodeHandler to handle parameters or entries of the Jacobian I am not interested in. Needless to say, this helps a lot.
Since I am first generating the derivatives I am already playing around with compile flags to get the best compilation speed.
I will definitely check out ModelCSourceGen::setRelatedDependents(). This is especially interesting since the time to generate the source code is really low compared to compilation speed. And I do not mind if it increases a bit.
I will also take a look at atomic functions. This is actually a very nice feature that might come in handy in the future!

One thing that puzzles me: As far as I understand, I can use both the CodeHandler + LanguageC as well as ModelCSourceGen + SaveFilesModelLibraryProcessor for source code generation. Is there any benefit of using one over the other? It seems the ModelCSourceGen generates a bit more self-contained code, i.e. full functions and has advanced features like multithreading support. On the other hand the CodeHandler generates quite "bare" code, allowing for better integration with existing code? Is this the general idea or am I missing something?

I will post some feedback once I have tried your suggested changes.

Thank you again!! Michael

ModelLibraryCSourceGen, ModelCSourceGen, and DynamicModelLibraryProcessor are useful classes to generate source code for the model, Jacobian, and Hessian and compile it into a library. ModelCSourceGen uses CodeHandler internally to generate its source code. The classes used to generate/load the libraries require functions that are system dependent while CodeHandler does not. Currently, there is only support for Linux (the files in cppad/cg/model/system/ have the system dependent functions). See the example example/dynamic_linux.cpp.

User-defined functions can be added using ModelLibraryCSourceGen::addCustomFunctionSource(). After compilation, these functions can be accessed with FunctorModelLibrary::loadFunction(). In the example dynamic_linux.cpp, this allows writing:

const char* (*userFn)(); // this declaration depends on the function signature defined by the user
*(void **) (&userFn) = dynamicLib->loadFunction("myCustomFunctionName");
const char* text = (*userFn)();
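The cast above can be wrapped in a small typed helper. The sketch below is self-contained and uses a made-up FakeLibrary class in place of the real dynamic library object; only the idea of a lookup by name that returns an untyped pointer is taken from the snippet above. Converting between void* and function pointers is only conditionally supported in standard C++, but it is the usual idiom on POSIX systems.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a loaded model library: maps symbol names to
// untyped addresses, in the spirit of the loadFunction() call above.
class FakeLibrary {
public:
    void add(const std::string& name, void* symbol) { symbols_[name] = symbol; }

    // Returns nullptr when the symbol is unknown.
    void* loadFunction(const std::string& name) const {
        auto it = symbols_.find(name);
        return it == symbols_.end() ? nullptr : it->second;
    }

private:
    std::map<std::string, void*> symbols_;
};

// Typed lookup: Fn is a function type such as `const char*()`.
// The caller is responsible for Fn matching the real signature.
template <class Fn>
Fn* loadAs(const FakeLibrary& lib, const std::string& name) {
    return reinterpret_cast<Fn*>(lib.loadFunction(name));
}

// Example user function to register.
const char* greet() { return "hello"; }
```

After registering `greet` under a name, `loadAs<const char*()>(lib, name)` yields a callable pointer, and a null result signals a missing symbol before anything is called.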

Best regards, João

Great, I just had a look at the internals of ModelCSourceGen and how it is making use of the CodeHandler and LanguageC. This helps a lot and explains the relation between the classes.

In fact, looking at ModelCSourceGen, there is a whole new dimension I have not exploited yet :)! Like generating forwardZero or re-using forwardOne for Jacobian computations...

Thanks again! Michael

joaoleal / CppADCodeGen

Compilation speed for large derivatives #3