stan-dev / math

The Stan Math Library is a C++ template library for automatic differentiation of any order using forward, reverse, and mixed modes. It includes a range of built-in functions for probabilistic modeling, linear algebra, and equation solving.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License
744 stars 187 forks source link

constant standard vector and Eigen matrix initializers #1676

Open bob-carpenter opened 4 years ago

bob-carpenter commented 4 years ago

Description

For initialization in the language and for general use, we need functions that will initialize constant containers without allocating new vari and without placing those vari on the stack to have their chain() methods called. What we need is something like this with an explicit template parameter,

template <typename T>
std::vector<T> constant_vector(size_t N, double value);

that would be implemented someting like this:

template <typename T>
inline std::vector<T> init_std_vector(size_t n) {
  return std::vector<T>(n, std::numeric_limits<double>::quiet_Nan());
}

template <>
inline std::vector<var> init_std_vector<var>(size_t n) {
  std::vector<var> x;
  x.reserve(n);
  vari* dummy_vi = new vari(std::numeric_limits<double>::quiet_Nan(), false);
  for (size_t i = 0; i < n; ++i)
    x.emplace_back(dummy_vi);
  return x;
}

That creates a std::vector<var> with only a single vari that's not on the autodiff stack.

We want to do the same thing for Eigen::Matrix<T, R, C>.

I'm not sure how this will interact with the existing functions that have recently been added. Those do not have template parameters and default to primitive types (double and int).

I also think we can get rid of the no-chain stack as I don't think it's being used for anything, but that's a different PR.

This came up in an issue in stanc3

Example

Current Version:

v3.1.0

wds15 commented 4 years ago

BTW, the no chain stack is used in the ODE integrators, for example. The other use case is when we instantiate constants which themselves are used as roots for the AD tree (like in the gradient functional). There we need the no chain stack in order to avoid unnecessary chain calls being made.