Open steven-murray opened 8 years ago
Thanks for reporting. It doesn't look like anyone has responded. Did the comparison with the other L-BFGS implementation use the same transformations implied by the constrained parameters?
I marked this as both a feature and a bug, at least until someone investigates further.
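To make the question about transformations concrete, here is a minimal sketch of the logit-style mapping Stan applies to a parameter declared with lower and upper bounds, together with the chain-rule factor an optimizer sees on the unconstrained scale. The function names and the constant slope of 100 are assumptions chosen for illustration only; they are not taken from the model in this issue.

```python
import math


def inv_logit(y):
    """Numerically stable logistic function."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)
    return e / (1.0 + e)


def constrain(y, lower, upper):
    """Map an unconstrained value y into the open interval (lower, upper)."""
    return lower + (upper - lower) * inv_logit(y)


def dconstrain_dy(y, lower, upper):
    """Derivative of constrain() with respect to y."""
    s = inv_logit(y)
    return (upper - lower) * s * (1.0 - s)


def unconstrained_grad(df_dx, y, lower, upper):
    """Chain rule: the gradient seen by an optimizer that works on y."""
    return df_dx * dconstrain_dy(y, lower, upper)


if __name__ == "__main__":
    lower, upper = 0.4, 1.2  # the bounds on beta quoted in this issue
    slope = 100.0            # hypothetical df/dx, i.e. x is far from an optimum
    for y in (0.0, 5.0, 10.0, 20.0):
        x = constrain(y, lower, upper)
        g = unconstrained_grad(slope, y, lower, upper)
        print(f"y={y:5.1f}  x={x:.10f}  grad_y={g:.3e}")
```

As y grows, x approaches the upper bound and the gradient with respect to y shrinks toward zero even though the slope in x stays large. That is one way a gradient-norm test could fire at a boundary; whether it is what happens in this model has not been checked.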
For the other L-BFGS implementation, I believe no transformation is performed. It seems to first evaluate the p0 you give it, then explicitly evaluate the edges, and then work its way in (sorry, I'm not well acquainted with the exact procedure).
I would love to be more helpful here (constructing a more minimal working example and doing a proper comparison), but I don't have the time at the moment, as is frustratingly often the case. I'm more than happy to trial things for you on my end, though.
You can try removing the bounds from the Stan program and initializing at a value in support and see if that works better. Maybe not if the problem was boundary values.
It'd be interesting to see if the Stan model would be optimizable by the other program or if it's truly a problem with our L-BFGS implementation not being robust enough.
(In reply to Steven Murray's comment of Jul 6, 2016, 9:45 PM, above.)
Hi
I'm using pystan version 2.9.0. I have a model based on a truncated incomplete gamma distribution.
When running optimizing(), the results depend very heavily on which seed I choose, and it seems to have something to do with the boundary values. For one seed (121010), the optimization terminates after 2 iterations with all four parameters at their boundaries. At first I was getting "Convergence detected: gradient norm is below tolerance". When I set `tol_grad=0`, I got "relative gradient magnitude is below tolerance", and when I then set `tol_rel_grad=0`, I got "absolute change in objective function was below tolerance". I fear that setting anything more to zero will leave nothing to terminate the iterations.
When I set the algorithm to "Newton" it works (it doesn't seem to hit the boundaries). Setting it to "BFGS" gives the same outcome as LBFGS.
When using seed of 1234 (without any of those options specified), the result comes out nicely.
I've verified that the lp__ values are correct at several of the iterations (modulo constant).
The LBFGS method seems to work every time when using the function from scipy.
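For anyone who wants to reproduce the comparison, here is a rough sketch of the two setups side by side using `scipy.optimize.minimize`: L-BFGS-B acting directly on the bounded parameters, and the same optimizer without bounds acting on logit-transformed parameters. The objective is a toy quadratic standing in for the real negative log-likelihood, and its optimum (`TARGET`, the expected values rounded) is an assumption; only the bounds come from this issue. The start at y = 20 is a hypothetical way to mimic an iterate that has already been pushed next to a boundary.

```python
import numpy as np
from scipy.optimize import minimize

# Bounds from this issue, in the order logHs, alpha, beta, lnA.
LOWER = np.array([13.5, -1.99, 0.4, -48.0])
UPPER = np.array([15.5, -1.7, 1.2, -37.0])
# Assumed optimum of the toy objective (expected values, rounded).
TARGET = np.array([14.29, -1.87, 0.70, -42.59])


def objective(x):
    """Toy stand-in for the negative log-likelihood: a scaled quadratic bowl."""
    d = (x - TARGET) / (UPPER - LOWER)
    return float(np.sum(d ** 2))


def gradient(x):
    """Analytic gradient of objective() with respect to x."""
    return 2.0 * (x - TARGET) / (UPPER - LOWER) ** 2


def constrain(y):
    """Logit-style map from unconstrained y to the box (LOWER, UPPER)."""
    return LOWER + (UPPER - LOWER) / (1.0 + np.exp(-y))


def fit_bounded(x0):
    """L-BFGS-B working directly on the constrained parameters."""
    return minimize(objective, x0, jac=gradient, method="L-BFGS-B",
                    bounds=list(zip(LOWER, UPPER)))


def fit_transformed(y0):
    """Same optimizer, no bounds, on the transformed parameters."""
    def f(y):
        return objective(constrain(y))

    def g(y):
        s = 1.0 / (1.0 + np.exp(-y))
        # Chain rule through constrain().
        return gradient(constrain(y)) * (UPPER - LOWER) * s * (1.0 - s)

    res = minimize(f, y0, jac=g, method="L-BFGS-B")
    return constrain(res.x), res


if __name__ == "__main__":
    print("bounded, central start:    ", fit_bounded((LOWER + UPPER) / 2).x)
    print("transformed, central start:", fit_transformed(np.zeros(4))[0])
    print("transformed, start at y=20:", fit_transformed(np.full(4, 20.0))[0])
```

On this toy problem both setups recover the optimum from a central start, while the transformed fit started at y = 20 stays next to the upper bounds. This is only an illustration on a quadratic, not a diagnosis of the model in this issue.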
I can list my stan code:
I am using data with ~2e5 variates (samples.txt), drawn from a distribution with expected values:
'logHs': 14.294849118059398, 'alpha': -1.868803779752156, 'beta': 0.6976327799172357, 'lnA': -42.58691124225525
I set the bounds as:
logHs: (13.5, 15.5)
alpha: (-1.99, -1.7)
beta: (0.4, 1.2)
lnA: (-48, -37)
and V is 45794539.9853329.
I was previously setting the bounds to be wider on each parameter, but contracting the support has not solved the problem.
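Since the outcome depends on the seed, it may help to see where random starting points land. By default Stan draws initial values uniformly from (-2, 2) on the unconstrained scale and then maps them into the declared bounds. The sketch below mimics that with Python's own random number generator, so the actual numbers will not match what Stan produces for seeds 1234 or 121010; `random_inits` and `radius` are hypothetical names used only here.

```python
import math
import random

# Bounds as stated in this issue.
BOUNDS = {
    "logHs": (13.5, 15.5),
    "alpha": (-1.99, -1.7),
    "beta": (0.4, 1.2),
    "lnA": (-48.0, -37.0),
}


def inv_logit(y):
    """Logistic function; safe for the small |y| used here."""
    return 1.0 / (1.0 + math.exp(-y))


def random_inits(seed, bounds=BOUNDS, radius=2.0):
    """Draw y ~ Uniform(-radius, radius) per parameter and map it into its bounds."""
    rng = random.Random(seed)  # Python's RNG, not Stan's
    inits = {}
    for name, (lo, hi) in bounds.items():
        y = rng.uniform(-radius, radius)
        inits[name] = lo + (hi - lo) * inv_logit(y)
    return inits


if __name__ == "__main__":
    for seed in (1234, 121010):
        print(seed, random_inits(seed))
```

With a radius of 2, every starting value falls between roughly 12% and 88% of the way across its interval, so the starting points themselves are never at a boundary. Passing explicit initial values instead of relying on the seed would take this source of variation out of the comparison.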
I'm hoping I've provided enough information. Unfortunately I haven't tested if this happens for simpler problems -- so I don't know if it is problem-specific, or a more general problem within Stan. Since it doesn't always happen for my own problem, I would be hesitant to try to randomly find it in a simpler problem.