transition method bugged for softabs #178

alashworth commented 5 years ago

Issue by positivcheg94 Wednesday Jan 03, 2018 at 11:38 GMT Originally opened as https://github.com/stan-dev/stan/issues/2454


Summary:

The transition method results in NaNs for the softabs metric.

Description:

I might be using the library in the wrong way, but I'll try to explain everything. First, I use Stan as a C++ library, and my models have a simple interface: a method that calculates -log(prob), and a method that calculates -log(prob) together with its gradient (hand-coded first derivatives in the model hierarchy).
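For context, a minimal sketch of what such a model interface might look like; the class and method names below are purely illustrative, not taken from the original post or from Stan:

```cpp
#include <Eigen/Dense>

// Hypothetical model interface (illustrative names only): each model exposes
// -log(prob) and, separately, -log(prob) together with a hand-coded gradient.
class MyModel {
public:
  explicit MyModel(int n) : n_(n) {}

  int dims() const { return n_; }

  // Returns -log p(q) for the parameter vector q.
  double neg_log_prob(const Eigen::VectorXd& q) const {
    return 0.5 * q.squaredNorm();  // placeholder: standard normal target
  }

  // Returns -log p(q) and fills grad with the hand-coded gradient.
  double neg_log_prob_grad(const Eigen::VectorXd& q,
                           Eigen::VectorXd& grad) const {
    grad = q;                      // d/dq of 0.5 * q'q
    return 0.5 * q.squaredNorm();
  }

private:
  int n_;
};
```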

I actually got Stan fully working with my models (thanks to templates), and the diagonal metric works reasonably well (for some cases the stepsize is too high, so I'm trying softabs).

```cpp
this->seed(init_sample.cont_params());
this->hamiltonian_.sample_p(this->z_, this->rand_int_);
this->hamiltonian_.init(this->z_, logger);
```

I think this looks wrong in general, because for the softabs metric sample_p actually needs the Hamiltonian to be fully initialized. Why is hamiltonian_.init called after sampling p?

Something like this fixes it (global_empty_logger is just a static instance of stan::callbacks::logger):

```cpp
mcmc_.seed(q_init);
mcmc_.init_hamiltonian(global_empty_logger);
```

This forces complete Hamiltonian initialization: softabs has a member Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigen_deco, which is constructed as eigen_deco(n), but before any eigen_deco.eigenvectors() call it has to be initialized via z.eigen_deco.compute(z.hessian); (z.hessian is initialized properly for the default case, the identity matrix, but the decomposition itself is never computed).
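For illustration, a minimal standalone Eigen sketch (not Stan code) of the point being made: constructing the solver with a size only preallocates storage, and the decomposition is only valid after compute() has been called:

```cpp
#include <Eigen/Dense>
#include <iostream>

int main() {
  const int n = 3;

  // Constructing with a size only preallocates; no decomposition is done yet.
  Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigen_deco(n);

  // Calling eigen_deco.eigenvectors() at this point would return data from a
  // decomposition that was never computed (a debug build trips Eigen's
  // internal assertion; a release build just reads uninitialized values).

  Eigen::MatrixXd hessian = Eigen::MatrixXd::Identity(n, n);
  eigen_deco.compute(hessian);  // now the decomposition actually exists

  std::cout << eigen_deco.eigenvalues().transpose() << std::endl;
  std::cout << eigen_deco.eigenvectors() << std::endl;
  return 0;
}
```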

So there are two solutions: 1) put the decomposition into the softabs_point constructor, or 2) swap the order of this->hamiltonian_.sample_p(this->z_, this->rand_int_); and this->hamiltonian_.init(this->z_, logger);

The second one is actually awkward, because this code is duplicated across base_[nuts/hmc/...] (and may need to be refactored).
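A rough sketch of what option 1 might look like; this only illustrates the idea, assuming a softabs_point-like struct with hessian and eigen_deco members, and is not a patch against the actual Stan sources:

```cpp
#include <Eigen/Dense>

// Illustrative stand-in for a softabs_point-style struct (option 1 above):
// compute the eigendecomposition of the default Hessian in the constructor,
// so eigen_deco is valid even before init() has been called.
struct softabs_point_like {
  explicit softabs_point_like(int n)
      : q(Eigen::VectorXd::Zero(n)),
        hessian(Eigen::MatrixXd::Identity(n, n)),
        eigen_deco(n) {
    eigen_deco.compute(hessian);  // decomposition of the default identity metric
  }

  Eigen::VectorXd q;
  Eigen::MatrixXd hessian;
  Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigen_deco;
};
```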

Off topic:

If you have any suggestions for epsilon hacks, feel free to share them: I have run into the problem where a huge epsilon makes the samplers constantly overstep the local minimum, so the samples are biased. Currently the dense metric messes everything up for a fairly simple model (I cannot reveal any code or the model structure). The diagonal metric with an adaptive stepsize more or less works for hmc/nuts and works noticeably better for xhmc. Nearly all parameters are left at their defaults for hmc/nuts/xhmc. In addition, hmc is tuned a bit: before every transition call there is an extra mcmc_.set_nominal_stepsize_and_L(mcmc_.get_nominal_stepsize(), pars.steps) call, because a constant travel "distance" T looks bad for all but very simple models.
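To make that per-transition retuning concrete, here is a minimal sketch; the Sampler struct is a stand-in for the poster's wrapper around Stan's static HMC (it is not Stan's base_static_hmc), and the steps constant replaces pars.steps from the post:

```cpp
// Hypothetical stand-in for the poster's sampler wrapper; the method names
// mirror Stan's static-HMC tuning calls, but this is illustrative code only.
struct Sampler {
  double stepsize = 0.1;  // current (possibly adapted) leapfrog stepsize
  int L = 10;             // number of leapfrog steps per transition
  double T = 1.0;         // total integration time, T = stepsize * L

  double get_nominal_stepsize() const { return stepsize; }

  void set_nominal_stepsize_and_L(double eps, int steps) {
    stepsize = eps;
    L = steps;
    T = eps * steps;
  }

  void transition() { /* one HMC transition would run here */ }
};

int main() {
  Sampler mcmc_;
  const int n_iterations = 1000;
  const int steps = 20;  // stand-in for pars.steps

  for (int i = 0; i < n_iterations; ++i) {
    // Re-derive T from the current adapted stepsize so that the number of
    // leapfrog steps stays fixed, instead of keeping the travel "distance"
    // T constant across transitions.
    mcmc_.set_nominal_stepsize_and_L(mcmc_.get_nominal_stepsize(), steps);
    mcmc_.transition();
  }
  return 0;
}
```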

Data for this model is:

```python
den = np.sqrt(1 + a0**2 + 2*b0**2)
a = a0/den
b = b0/den
y = a*x + b*(x**2 - 1) + e/den
```

where a0, b0 are floats and x, e are n samples from N(0,1). (I have C++/Python wrappers for easier testing.)
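For completeness, a small C++ sketch of that data-generating process; n, a0, and b0 are arbitrary illustrative values:

```cpp
#include <cmath>
#include <random>
#include <vector>

int main() {
  const int n = 1000;
  const double a0 = 0.7, b0 = 0.3;   // illustrative values
  const double den = std::sqrt(1.0 + a0 * a0 + 2.0 * b0 * b0);
  const double a = a0 / den;
  const double b = b0 / den;

  std::mt19937 rng(12345);
  std::normal_distribution<double> std_normal(0.0, 1.0);

  std::vector<double> x(n), y(n);
  for (int i = 0; i < n; ++i) {
    x[i] = std_normal(rng);
    const double e = std_normal(rng);
    // y = a*x + b*(x^2 - 1) + e/den; the 1/den factor keeps Var(y) = 1.
    y[i] = a * x[i] + b * (x[i] * x[i] - 1.0) + e / den;
  }
  return 0;
}
```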

Current Version:

v2.17.1