Open bob-carpenter opened 1 year ago
The doc for `grad_hessian` requires that

> The functor must implement
> `fvar<fvar<var> > operator()(const Eigen::Matrix<fvar<fvar<var> >, Eigen::Dynamic, 1>&)`

which the model_base `log_prob`s currently do not. We'd need to seek upstream changes similar to what we did for the opt-in fvar Hessians in https://github.com/stan-dev/stan/pull/3144
For the others, they don't document their requirements nearly as nicely, but as far as I can tell they only require `fvar<var>` (i.e. no deeper nesting of `fvar`s), which should be doable at the moment on the C++ level.
I'm not sure how nicely optional symbols play with the various language interfaces. We currently don't have any functions which exist conditionally, so I'd be curious how that plays out.
There are two versions of Hessian.

1. `stan/math/mix/functor/hessian.hpp` requires `fvar<var>` and does N evals for an N-dimensional density.
2. `stan/math/fwd/functor/hessian.hpp` requires `fvar<fvar<double>>` and does N^2 evals of an N-dimensional density.

We want to use approach (1). They implement the same interface, so it's just a matter of changing the include.
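To make the N^2 count for the forward-only version concrete, here is a minimal, self-contained sketch of a nested forward-mode Hessian. `Dual<T>` is a hypothetical stand-in for `fvar<T>`, `f` is a toy function, and `hessian_forward` is an illustrative name; none of this is Stan's actual code or the signature of `stan::math::hessian`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for fvar<T>: a value plus one tangent.
template <typename T>
struct Dual {
  T val;  // function value
  T tan;  // derivative along the seeded direction
};

template <typename T>
Dual<T> operator+(const Dual<T>& a, const Dual<T>& b) {
  return {a.val + b.val, a.tan + b.tan};
}

template <typename T>
Dual<T> operator*(const Dual<T>& a, const Dual<T>& b) {
  return {a.val * b.val, a.tan * b.val + a.val * b.tan};  // product rule
}

using D1 = Dual<double>;
using D2 = Dual<D1>;  // analogous in spirit to fvar<fvar<double>>

// Toy function: f(x) = x0^2 * x1 + x1^2
template <typename T>
T f(const std::vector<T>& x) {
  return x[0] * x[0] * x[1] + x[1] * x[1];
}

// Returns the row-major N x N Hessian; one evaluation of f per (i, j) pair.
std::vector<double> hessian_forward(const std::vector<double>& x, int& evals) {
  const std::size_t n = x.size();
  std::vector<double> H(n * n);
  evals = 0;
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      std::vector<D2> xd(n);
      for (std::size_t k = 0; k < n; ++k) {
        // Inner tangent is seeded along x_i, outer tangent along x_j.
        D1 v{x[k], k == i ? 1.0 : 0.0};
        D1 t{k == j ? 1.0 : 0.0, 0.0};
        xd[k] = D2{v, t};
      }
      H[i * n + j] = f(xd).tan.tan;  // d^2 f / (dx_i dx_j)
      ++evals;
    }
  }
  return H;
}
```

For a 2-dimensional input this evaluates `f` four times. The `fvar<var>` version avoids the inner loop because a single reverse sweep over the forward-mode tangent fills a whole row of the Hessian.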
`grad_hessian` as currently implemented doesn't seem to actually call either version of `hessian` directly, so it will probably be a bit more invasive of a change than just an include. Still an upstream change, just not the one I was originally imagining (and probably an easier one than `fvar<fvar<var>>` for `log_prob`).
It seems to me like third-order autodiff should require that extra level of nesting? But I may be imagining this incorrectly.
Is there any reason it's not calling one of the predefined versions? The finite diff version has the same functional pattern. It should be able to take the same model functor we create for other uses.
> third-order autodiff should require that extra level of nesting?

Third-order autodiff in Stan requires `fvar<fvar<var>>` (N^2 reverse mode evals of second derivatives) or `fvar<fvar<fvar<double>>>` (N^3 forward-mode evals of second derivatives).
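As a rough sketch of where those counts come from (the notation is assumed here, not taken from the Stan docs): write $f$ for the log density over $N$ parameters. The gradient of the Hessian is the third-order tensor

$$
T_{ijk} = \frac{\partial}{\partial x_k}\left(\frac{\partial^2 f}{\partial x_i\,\partial x_j}\right), \qquad i, j, k \in \{1, \dots, N\}.
$$

With `fvar<fvar<var>>`, the two forward layers are seeded along $x_i$ and $x_j$, and one reverse sweep then yields $T_{ijk}$ for every $k$ at once, so on the order of $N^2$ sweeps are needed. With `fvar<fvar<fvar<double>>>`, each of the three indices needs its own seed direction, hence on the order of $N^3$ forward passes.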
We must be talking past each other:

- BridgeStan currently can call either the finite diff hessian or the "version 1" (`fvar<var>`) function you mention above. No issues here, I only mentioned it to say that supporting the `fvar<var>` version required upstream changes.
- My original comment was that `grad_hessian` cannot be exposed without further changes to upstream Stan functionality due to the requirement for `fvar<fvar<var>>`, which sounds like it is correct.
Yes, I misread "grad Hessian" as "Hessian". Maybe I'm misunderstanding what you mean by "upstream", but the `grad_hessian` functor with `fvar<fvar<var>>` should work for any Stan program for which the `hessian` functor with `fvar<var>` works.
The issue isn't that fewer models would work or anything, it's that at the moment we can only call overloads which exist in the `model_base_crtp` class, and `log_prob` does not yet have an overload for `fvar<fvar<var>>`. It would need to be added behind an `#ifdef` (this is what https://github.com/stan-dev/stan/pull/3144 did, just for `fvar<var>`) in stan-dev/stan.
> we can only call overloads which exist in the `model_base_crtp`

You'd think I'd remember this given that I coded it the first time around. The templated version is defined on the actual run-time class, but I keep forgetting that the class instance is assigned to a variable typed as the base class, which only knows the virtual functions (hence no templates). It'd be easy enough to add these the same way as last time. But you want to be careful because each layer is going to greatly extend the compile time, so we only want to turn these features on when they're going to be used.
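The dispatch problem described above can be sketched without any Stan headers. Everything in this example is hypothetical (`ModelBase`, `ModelBaseCrtp`, `MyModel`, `log_prob_impl`, the `DEMO_ENABLE_FVAR` macro, and the empty `Var` / `FvarVar` tag types). It only mirrors the shape of the design: a templated function on the concrete model is reachable through the base class only via explicitly declared virtual overloads, and an extra overload is compiled in only when a macro is defined.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical tag types standing in for autodiff scalars.
struct Var {};      // plays the role of a reverse-mode scalar
struct FvarVar {};  // plays the role of a forward-over-reverse scalar

inline const char* scalar_name(double) { return "double"; }
inline const char* scalar_name(Var) { return "var"; }
inline const char* scalar_name(FvarVar) { return "fvar<var>"; }

// Callers hold the model through this type, so only these virtuals are visible.
struct ModelBase {
  virtual ~ModelBase() = default;
  virtual const char* log_prob(double x) const = 0;
  virtual const char* log_prob(Var x) const = 0;
#ifdef DEMO_ENABLE_FVAR
  // Opt-in overload: exists only when compiled with -DDEMO_ENABLE_FVAR.
  virtual const char* log_prob(FvarVar x) const = 0;
#endif
};

// CRTP layer: forwards each virtual overload to the model's template.
template <typename M>
struct ModelBaseCrtp : ModelBase {
  const char* log_prob(double x) const override { return self().log_prob_impl(x); }
  const char* log_prob(Var x) const override { return self().log_prob_impl(x); }
#ifdef DEMO_ENABLE_FVAR
  const char* log_prob(FvarVar x) const override { return self().log_prob_impl(x); }
#endif

 private:
  const M& self() const { return static_cast<const M&>(*this); }
};

// The concrete model accepts any scalar type, but the template by itself
// is invisible to code that only sees a ModelBase reference.
struct MyModel : ModelBaseCrtp<MyModel> {
  template <typename T>
  const char* log_prob_impl(T x) const { return scalar_name(x); }
};

// Dispatch through the base class, as an interface layer would.
inline const char* call_through_base(const ModelBase& m) {
  return m.log_prob(Var{});
}
```

Without the macro, `m.log_prob(FvarVar{})` on a `ModelBase` reference does not compile even though `MyModel::log_prob_impl<FvarVar>` would instantiate fine. Guarding the overload also keeps the extra template instantiations out of builds that never use them.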
There are a bunch of autodiff functors that are implemented in Stan but not exposed yet in BridgeStan. The two most basic are already done. Most of them other than directional derivatives require forward-mode autodiff. Please feel free to add more requests to the list.
- `grad_hessian` (in `stan/math/mix`; requires forward mode)
- `gradient_dot_vector` (in `stan/math/mix`; most efficient in forward, can do backward)
- `hessian_times_vector` (in `stan/math/mix`; requires forward mode)
- `grad_tr_mat_times_hessian` (in `stan/math/mix`; requires forward mode)

There's no inverse Hessian vector product in Stan. I'm not sure of the best way to implement that. I think there are a lot of approaches because the direct way is so expensive (`vector / Hessian` in Stan).
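One of the possible approaches, sketched here purely as an illustration, is to solve `H x = v` with conjugate gradient, which only needs Hessian-vector products and never forms the Hessian. This assumes the matrix is symmetric positive definite (for example, a negated log-density Hessian near a mode). The function name `inverse_hessian_times_vector` and the `hvp` callback are hypothetical; in practice the callback would wrap something like `hessian_times_vector`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

inline double dot(const Vec& a, const Vec& b) {
  double s = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
  return s;
}

// Solves H x = v by conjugate gradient, touching H only through hvp(p) = H p.
// Assumes H is symmetric positive definite; runs at most v.size() iterations.
Vec inverse_hessian_times_vector(const std::function<Vec(const Vec&)>& hvp,
                                 const Vec& v, double tol = 1e-10) {
  const std::size_t n = v.size();
  Vec x(n, 0.0);  // initial guess x = 0
  Vec r = v;      // residual v - H x
  Vec p = v;      // search direction
  double rr = dot(r, r);
  for (std::size_t it = 0; it < n && std::sqrt(rr) > tol; ++it) {
    const Vec Hp = hvp(p);
    const double alpha = rr / dot(p, Hp);
    for (std::size_t i = 0; i < n; ++i) {
      x[i] += alpha * p[i];
      r[i] -= alpha * Hp[i];
    }
    const double rr_new = dot(r, r);
    const double beta = rr_new / rr;
    for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + beta * p[i];
    rr = rr_new;
  }
  return x;
}
```

Each iteration costs one Hessian-vector product, so the total cost is at most N such products in exact arithmetic, compared with building and factoring the full N x N Hessian.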