Open spinkney opened 3 years ago
I weirdly enjoy doing this kind of tedious doc work, so I can take a look. I'm thinking it would be best to have a dagger or something to notate functions with analytic derivatives, with the default being autodiff?
All of our functions other than RNGs with real-valued returns support reverse-mode autodiff for each of their non-data-qualified arguments. Not all of them support forward-mode (e.g., the solvers). There are three possible cases for each autodiff style.
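For readers less familiar with the distinction: forward mode propagates a tangent alongside the value, while reverse mode records operations and sweeps back through them. Here is a minimal forward-mode sketch using a toy dual number; this is purely illustrative and is not Stan Math's implementation (the real forward-mode type is `stan::math::fvar<T>`):

```cpp
#include <cassert>
#include <cmath>

// Toy dual number: carries a value and a tangent (derivative w.r.t.
// whichever input was seeded with tan = 1). Illustrative only.
struct Dual {
  double val;  // function value
  double tan;  // tangent, d(value)/d(seeded input)
};

// Product rule: d(ab) = a' b + a b'
Dual operator*(Dual a, Dual b) {
  return {a.val * b.val, a.tan * b.val + a.val * b.tan};
}

// Chain rule through sin: d(sin u) = cos(u) u'
Dual sin(Dual a) {
  return {std::sin(a.val), std::cos(a.val) * a.tan};
}
```

Seeding `x = {2.0, 1.0}` and evaluating `sin(x * x)` yields the value `sin(4)` together with the derivative `4 * cos(4)`, i.e. `2x cos(x^2)` at `x = 2`, in a single forward pass.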
I believe what @spinkney is asking is whether there is a reverse-mode specialization for a function (or the function it delegates to, which makes looking for things tricky).
The reverse vs. forward is also important for figuring out where we'll be able to use nested Laplace approximations as those require higher-order autodiff. Also, we'll only be able to use autodiff for Hessians if there is forward-mode support.
Two questions:
For functions
For distributions this is a bit more difficult. Though if `partials` or `operands_and_partials` is used, then it's almost certainly calculating an analytical derivative. Maybe we can add tags in these files to make it easier?
Is it safe to assume that if one signature of a function has analytic [fwd|rev] derivatives that the vectorized versions/overloads will?
> Is it safe to assume that if one signature of a function has analytic [fwd|rev] derivatives that the vectorized versions/overloads will?
I believe so.
@andrjohns has done a lot of the vectorization work and @SteveBronder may know an easy way to identify analytical derivatives.
> Is it safe to assume that if one signature of a function has analytic [fwd|rev] derivatives that the vectorized versions/overloads will?
It's possible to write a specialization for `foo(var, double)` but not for `foo(var, var)`, but I think in cases where we wrote one, we wrote them all.
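To make the mixed-signature point concrete, here's a toy sketch (not Stan Math's actual `var`, which is a pointer into the autodiff tape) of why a `foo(var, double)` specialization is worth writing separately from `foo(var, var)`:

```cpp
#include <cassert>

// Toy "var": a value plus a gradient w.r.t. one seeded input.
// Illustrative only; Stan Math's var works through a tape of varis.
struct Var {
  double val;
  double grad;
};

// multiply(var, var): both arguments may carry gradients, so both
// partials must be propagated (product rule).
Var multiply(Var a, Var b) {
  return {a.val * b.val, a.grad * b.val + a.val * b.grad};
}

// multiply(var, double): the second argument is constant data, so its
// partial can be dropped entirely -- the work a mixed specialization saves.
Var multiply(Var a, double b) {
  return {a.val * b, a.grad * b};
}
```

The mixed overload does strictly less arithmetic and stores nothing for the data argument; in the real library the savings also include tape allocations.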
The bigger deal is that it's possible to further reduce computation for vectorizations of operations, each of which has an analytic derivative. We compact into a single `vari` that calls all the `chain()` methods and thus avoid unnecessary virtual function calls. So not all analytic derivatives will be equally efficient.
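A toy sketch of the compaction idea (this is not Stan Math's actual `stan::math::vari`; real tape nodes also handle memory arenas and nested autodiff):

```cpp
#include <cassert>
#include <vector>

// Toy reverse-mode tape node with a virtual chain() for the backward sweep.
struct vari {
  double adj = 0;
  virtual void chain() {}
  virtual ~vari() = default;
};

// Naive vectorization of sum(x_i^2): one node -- and one virtual
// chain() dispatch during the reverse sweep -- per element.
struct square_vari : vari {
  vari* input;
  double x;
  square_vari(vari* in, double x) : input(in), x(x) {}
  void chain() override { input->adj += 2 * x * adj; }
};

// Compacted vectorization: a single node whose chain() propagates the
// analytic partial 2*x_i to every input, so the tape pays one virtual
// dispatch instead of N.
struct sum_of_squares_vari : vari {
  std::vector<vari*> inputs;
  std::vector<double> xs;
  void chain() override {
    for (std::size_t i = 0; i < xs.size(); ++i)
      inputs[i]->adj += 2 * xs[i] * adj;
  }
};
```

Both variants compute the same adjoints; the compacted form just does it with one `chain()` call and better memory locality, which is the efficiency gap described above.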
Also, we've done things like move from reverse-mode autodiffing our ODE solvers to using coupled systems (still not quite analytic derivatives, since we use the solver) to adjoint methods (which is a proper reverse-mode specialization).
In any case, this is just basic guidance for users so we don't want to make this too fine-grained.
analytic forward derivatives (more efficient) to analytic adjoint derivatives (most efficient).
Happy to help with this still, but I'm not 100% sure I'm familiar enough with the C++ to do this alone, so I've unassigned myself.
I'm loving the provenance in the new documentation. One thing I think would be helpful is knowing which functions have analytic derivatives and which rely on autodiff. It helps from a performance perspective, and it lets devs know where some potentially easy performance improvements can be found.
I wonder if just looking in the derivative folders is enough? Not sure it captures all the derivatives for overloaded functions, but it's a nice start.