transition method results in NaNs for softabs metric
Description:
I might be using the library in the wrong way, but I'll try to explain everything. First, I use Stan as a C++ library, and my models have a simple interface: one method that calculates -log(prob) and one method that calculates -log(prob) together with the gradient (hand-coded first derivatives in the model hierarchy).
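Roughly, the interface looks like this (a minimal sketch; my_model, neg_log_prob, and neg_log_prob_grad are illustrative names, not Stan's):
#include <Eigen/Dense>

// Sketch of the model interface I mean (names are mine, not Stan's).
struct my_model {
  // Returns -log(prob) at q.
  double neg_log_prob(const Eigen::VectorXd& q) const;
  // Returns -log(prob) at q and writes the hand-coded gradient into grad.
  double neg_log_prob_grad(const Eigen::VectorXd& q, Eigen::VectorXd& grad) const;
};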
I actually got Stan fully working with my models (thanks to templates), and the diagonal metric works fairly well (for some cases the stepsize is too high, which is why I'm trying softabs).
In the transition method (in base_[nuts/hmc/...]) the order is:
this->seed(init_sample.cont_params());
this->hamiltonian_.sample_p(this->z_, this->rand_int_);
this->hamiltonian_.init(this->z_, logger);
In general this looks wrong, since for the softabs metric sample_p needs the Hamiltonian to be fully initialized. Why is hamiltonian_.init called after the momentum is sampled?
Something like this fixes it (global_empty_logger is just a static instance of stan::callbacks::logger):
mcmc.seed(qinit);
mcmc.init_hamiltonian(global_empty_logger);
This forces complete Hamiltonian initialization: softabs has a member Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigen_deco, which is constructed as eigen_deco(n), but before any eigen_deco.eigenvectors() call it has to be computed via z.eigen_deco.compute(z.hessian); (z.hessian is initialized properly for the default case, the identity matrix, but the decomposition itself is never computed).
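A standalone illustration of that Eigen behaviour (my own toy example, not Stan code): constructing the solver with a size only preallocates storage; until compute() is called, eigenvectors()/eigenvalues() trip an internal assertion, or, with assertions disabled, return uninitialized memory, which is exactly the kind of thing that can end in NaNs.
#include <Eigen/Dense>
#include <iostream>

int main() {
  const int n = 3;
  Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigen_deco(n);  // preallocates only
  // Calling eigen_deco.eigenvectors() here would fail: the solver is
  // not initialized until compute() has run.
  Eigen::MatrixXd hessian = Eigen::MatrixXd::Identity(n, n);
  eigen_deco.compute(hessian);  // now the decomposition is valid
  std::cout << eigen_deco.eigenvalues().transpose() << std::endl;
  return 0;
}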
So there are two solutions:
1) place the decomposition into the softabs_point constructor, or
2) swap
this->hamiltonian_.sample_p(this->z_, this->rand_int_);
this->hamiltonian_.init(this->z_, logger);
The second one is awkward anyway, because this code is duplicated across base_[nuts/hmc/...] (and may need to be refactored). A sketch of option 1 follows.
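For what it's worth, here is a minimal standalone sketch of solution 1 (illustrative only; Stan's actual softabs_point has more members than this):
#include <Eigen/Dense>

// Illustrative sketch, not Stan's actual softabs_point: compute the
// decomposition in the constructor so it always matches the default hessian.
struct softabs_point_sketch {
  Eigen::MatrixXd hessian;
  Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigen_deco;

  explicit softabs_point_sketch(int n)
      : hessian(Eigen::MatrixXd::Identity(n, n)), eigen_deco(n) {
    eigen_deco.compute(hessian);  // decomposition of the default (identity) hessian
  }
};

int main() {
  softabs_point_sketch z(3);
  return z.eigen_deco.info() == Eigen::Success ? 0 : 1;
}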
Off topic:
If you have any suggestions for epsilon hacks, feel free to share them, because I've actually hit the problem of a huge epsilon where the samplers always overstep a local minimum and the samples come out biased.
Currently, the dense metric messes everything up for a fairly simple model (I cannot reveal any code or the model structure).
The diagonal metric with adaptive stepsize more or less works for hmc/nuts, and works noticeably better for xhmc.
Nearly all parameters are left at their defaults for hmc/nuts/xhmc. On top of that, hmc is tuned a bit: before every transition call there is an extra mcmc_.set_nominal_stepsize_and_L(mcmc_.get_nominal_stepsize(), pars.steps) call, because a constant travel "distance" T looks bad for all but very simple models.
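In sketch form, that sampling loop looks like this (a fragment, not a complete program: mcmc_, pars, s, and logger come from my own wrapper code; only set_nominal_stepsize_and_L, get_nominal_stepsize, and transition are Stan calls):
// Re-derive L from the current nominal stepsize before every transition
// instead of keeping the travel "distance" T fixed.
for (int i = 0; i < pars.num_samples; ++i) {
  mcmc_.set_nominal_stepsize_and_L(mcmc_.get_nominal_stepsize(), pars.steps);
  s = mcmc_.transition(s, logger);
}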
Data for this model is:
den = np.sqrt(1 + a0**2 + 2*b0**2)
a = a0/den
b = b0/den
y = a*x + b*(x**2 - 1) + e/den
where
a0, b0 are floats
x, e are n samples from N(0,1)
(I have C++/Python wrappers for easier testing.)
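For completeness, the same data-generating process as a small C++ sketch (simulate_y, the parameter names, and the values in main are mine):
#include <Eigen/Dense>
#include <cmath>
#include <random>

// Generates y = a*x + b*(x^2 - 1) + e/den with x, e ~ N(0, 1),
// where a = a0/den, b = b0/den, den = sqrt(1 + a0^2 + 2*b0^2).
Eigen::VectorXd simulate_y(double a0, double b0, int n, std::mt19937& rng) {
  std::normal_distribution<double> N01(0.0, 1.0);
  const double den = std::sqrt(1.0 + a0 * a0 + 2.0 * b0 * b0);
  const double a = a0 / den;
  const double b = b0 / den;
  Eigen::VectorXd y(n);
  for (int i = 0; i < n; ++i) {
    const double x = N01(rng);
    const double e = N01(rng);
    y(i) = a * x + b * (x * x - 1.0) + e / den;
  }
  return y;
}

int main() {
  std::mt19937 rng(12345);
  Eigen::VectorXd y = simulate_y(0.5, 0.25, 1000, rng);
  return y.size() == 1000 ? 0 : 1;
}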
Current Version:
v2.17.1