openpharma / mmrm

Mixed Models for Repeated Measures (MMRM) in R.
https://openpharma.github.io/mmrm/

Explore simpler AD alternatives #148

Closed kkmann closed 1 year ago

kkmann commented 1 year ago

I wonder whether the choice to use TMB to enable AD for the likelihood is ideal, given that we do not (as far as I am aware) use many of the advanced TMB features (sparse matrices, OpenMP, latent variable integration). This is very much a brainstorming/discussion issue to collect feedback and ideas.

If all we need is fast AD via C++, we could also consider more lightweight implementations. For instance, boost/math has an AD interface as well. Boost is available via the BH package and can then be used from Rcpp. Both are extremely well-tested and used by many projects.

Here is the cpp code:

// file test.cpp
// [[Rcpp::depends(BH)]]

#include <Rcpp.h>
#include <math.h>
#include <boost/math/differentiation/autodiff.hpp>

using namespace Rcpp;

template <typename T>
T f(T const& x) { // define a function f = sin
  return sin(x);
}

// [[Rcpp::export]]
double f_prime(double x) { // compute the first derivative, here cos
  constexpr unsigned Order = 1; // highest order derivative to be calculated
  auto const xx = boost::math::differentiation::make_fvar<double, Order>(x);
  auto const y = f(xx);
  return y.derivative(1);
}

And here is the R code (requires both the Rcpp and BH packages to be installed):

# file test.R
if (!require(Rcpp))
  install.packages("Rcpp")
if (!require(BH))
  install.packages("BH") # install boost headers R package

# compile C++ code
Rcpp::sourceCpp("test.cpp")
# this provides f_prime(); f(x) = sin(x) itself is not exported.
# since f = sin, f_prime should equal cos.

x <- seq(0, 2*pi, length.out = 25)
fp1 <- sapply(x, f_prime)
fp2 <- sapply(x, cos)

isTRUE(all.equal(fp1, fp2)) # compare with tolerance rather than exact ==

ToDo:

danielinteractive commented 1 year ago

Thanks @kkmann for thinking about this! This would be a lot of work. Also, it would be good to first have another abstraction layer (something like TMB) that makes it easier to use the Boost AD framework from R packages for this purpose (optimizing an objective function and computing its gradient and Hessian via AD), since none of that is specific to the mmrm package.

kkmann commented 1 year ago

I am not so sure why an abstraction layer is needed. We just want to define functions (with efficient matrix operations) that we can get gradients (and ideally Hessians) of.

The one limitation I see with the Boost AD approach is that it relies on templated code, since it uses operator overloading. However, TMB uses CppAD (COIN-OR autodiff), so I would have to check to what degree the likelihood functions are already templated. If so, changing the AD framework could be quite simple (see above). I fear, though, that it might be necessary to switch to a templated linear algebra library like boost/ublas.

Speed could be another issue but I doubt that would be a major problem.

Now that I am thinking about AD in C++ I am starting to appreciate the black magic Julia offers x)...

I double-checked the reverse imports of TMB; it is not exactly widely used yet.

To make this a bit more actionable, I wonder whether there is a (C++) function that could serve as a benchmark/PoC to see whether changing the AD framework is feasible at all. If so, we could put some code in design/. I would definitely prioritize polishing the user experience and adding features at the moment, but it would be good to know whether a different AD system is feasible in case there are problems with the upstream TMB project.