Closed rc closed 11 years ago
The error is caused by stats.vonmises._cdf() returning NaNs when b is too large (the value below comes from the first BFGS iteration):
x = [-3.14159265 -3.10668607 -3.07177948 ..., 3.07177948 3.10668607
3.14159265]
b = 2143.11596994
loc = -1.7748961779
ipdb> stats.vonmises._cdf(x-loc, b)
array([ nan, nan, nan, ..., nan, nan, nan])
The largest value of b that does not give NaN is 709:
ipdb> stats.vonmises._cdf(x-loc, 709)
array([ 4.52609671e-248, 1.40117001e-237, 3.53853284e-227, ...,
1.00000000e+000, 1.00000000e+000, 1.00000000e+000])
ipdb> stats.vonmises._cdf(x-loc, 710)
array([ nan, nan, nan, ..., nan, nan, nan])
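The 709/710 boundary matches the point where exp() overflows in double precision. The sketch below only illustrates that arithmetic with the standard library; that `_cdf()` evaluates something like exp(b) internally is an assumption, not something confirmed in this thread. The helper `exp_or_inf` is hypothetical.

```python
import math
import sys

# Largest argument exp() can take before the result exceeds the float64 range.
exp_limit = math.log(sys.float_info.max)  # roughly 709.78

def exp_or_inf(value):
    """Hypothetical helper: exp(value), with overflow mapped to inf
    (mimicking what array arithmetic does instead of raising)."""
    try:
        return math.exp(value)
    except OverflowError:
        return math.inf

below = exp_or_inf(709.0)  # still a finite float
above = exp_or_inf(710.0)  # overflows to inf
ratio = above / above      # inf / inf evaluates to nan

print(exp_limit, below, above, ratio)
```

If an intermediate term overflows like this and is then divided by (or subtracted from) another overflowing term, every element of the result would come out as nan, which is consistent with the output above.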
Can you check what the _size is in the traceback? Sounds like integer overflow.
What scipy version are you using? Did you upgrade scipy in the last few months?
I don't remember any recent changes in statsmodels master that would affect the optimization for this.
Yes, it looks like that. I am using scipy from git to have basinhopping() available. The traceback above is caused by NaN params coming from an earlier call to VonMisesMixtureBinned.fit(). Then:
starting parameters: [ 2. 0. 3. 0. 0.]
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: nan
Iterations: 1
Function evaluations: 42
Gradient evaluations: 42
~/software/usr/local/lib/python/dist-packages/statsmodels/base/model.py:343: Warning: Inverting hessian failed, no bse or cov_params available
warn(warndoc, Warning)
Estimated distributions (2 components)
dist0: shape=2196596.7532, loc=1.4885, prob= nan
dist1: shape=484521.1196, loc=2.1939, prob=0.0000
> ./aorta/dist_mixtures/mixture_von_mises.py(377)rvs_mix()
376 rvs = []
--> 377 for ii in range(k_dist):
378 try:
ipdb> sizes
array([-9223372036854735711, 0])
ipdb> params
array([ 2.19659675e+06, 1.48846026e+00, 4.84521120e+05,
2.19386410e+00, 5.71582442e+05])
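The huge negative entry in `sizes` is what you would expect when a nan weight gets converted to an integer sample count. A minimal sketch of a guard, assuming the sizes are derived as weight times sample count; `component_sizes` is a hypothetical helper, not the code in mixture_von_mises.py:

```python
import math

def component_sizes(probs, n_samples):
    """Hypothetical helper: split n_samples across mixture components,
    refusing weights that are nan or inf instead of casting them to int."""
    if not all(math.isfinite(p) for p in probs):
        raise ValueError("mixture weights contain nan or inf: %r" % (probs,))
    return [int(round(p * n_samples)) for p in probs]

print(component_sizes([0.25, 0.75], 1000))

try:
    component_sizes([math.nan, 0.0], 1000)
except ValueError as err:
    print("rejected:", err)
```

Failing early like this would point at the bad fit result rather than at the sampling loop.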
BTW, I have hacked VonMisesMixtureBinned.loglikeobs() (see current master) as a workaround. It seems to work OK...
More info: the basin-hopping solver automatically rejects steps that produce NaNs, so the above hack is not needed with it. So I think this is not a statsmodels issue, but a matter of some scipy optimization solvers not detecting NaNs.
... so let us close this.
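For anyone hitting the same thing with a solver that does not detect NaNs, one generic way to shield it is to wrap the objective so non-finite values become a large finite penalty. This is only a sketch of the idea, not the actual hack in loglikeobs(); `guard_nonfinite`, the penalty value, and the toy objective are all hypothetical.

```python
import math

def guard_nonfinite(objective, penalty=1e300):
    """Hypothetical wrapper: return `penalty` whenever the objective
    evaluates to nan or inf, so the solver sees a very bad but finite value."""
    def wrapped(params):
        value = objective(params)
        if not math.isfinite(value):
            return penalty
        return value
    return wrapped

def toy_negative_loglike(params):
    """Toy stand-in for a negative log-likelihood that breaks down
    once the shape parameter gets too large."""
    shape = params[0]
    if shape > 709:
        return math.nan
    return (shape - 2.0) ** 2

safe = guard_nonfinite(toy_negative_loglike)
print(safe([2.0]), safe([2143.0]))
```

A constant penalty gives gradient-based solvers a flat region, so this is a stopgap; bounding the shape parameter would be the cleaner fix where the solver supports bounds.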