sjgiorgi closed 3 years ago
Hmm, good question.
What are the versions of scipy & pysal?
Is there any missing data in df?
df[['my_var_z', 'perc_republican_vote_2016_z']].isnull().sum()
If there is, we don't handle it. That said, I'd expect missing values to affect the other components of the regression as well, so do check.
Are there any extra warnings or errors thrown (like the typical RuntimeWarning generated by numpy for divide by zero)?
If you use the lm tests directly, does this behavior occur/are any warnings raised?
from pysal.model.spreg.diagnostics_sp import LMtests
LMtests(m1, subset)
PySAL==1.13.0 and scipy==0.19.1
No missing data.
No warnings for this method. When I run mi = pysal.Moran(df['my_var'], subset), I get:
('WARNING: ', '51580', ' is an island (no neighbors)')
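For reference, an island is a unit whose row in the weights matrix is all zeros. Below is a minimal sketch of how such units could be listed from a plain adjacency matrix so they can be inspected or dropped before running diagnostics. It uses only numpy, not the PySAL weights API; `find_islands`, the toy matrix, and the ids are hypothetical, and nothing here shows that the island is what causes the nan values.

```python
import numpy as np

def find_islands(adjacency, ids):
    """Return the ids whose row in the adjacency matrix has no nonzero entry."""
    adjacency = np.asarray(adjacency)
    has_neighbor = (adjacency != 0).any(axis=1)
    return [ids[i] for i in np.flatnonzero(~has_neighbor)]

# Toy 4-unit example: the last unit is not connected to anything.
adjacency = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
])
ids = ["a", "b", "c", "d"]  # hypothetical ids, not real county codes
islands = find_islands(adjacency, ids)
print(islands)
```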
Additionally, I get errors when I run GM_Lag, though I don't suspect it's connected:
m3 = pysal.spreg.GM_Lag(df[['perc_republican_president_2016_z']].values, df[['group_norm_z']].values,
                        w=subset, spat_diag=True)
---------------------------------------------------------------------------
LinAlgError Traceback (most recent call last)
<ipython-input-30-e10554ba14df> in <module>()
----> 1 m3 = pysal.spreg.GM_Lag(df[['perc_republican_president_2016_z']].values, df[['group_norm_z']].values, w=subset, spat_diag=True)
/data/anaconda2/envs/dlatk/lib/python3.5/site-packages/pysal/spreg/twosls_sp.py in __init__(self, y, x, yend, q, w, w_lags, lag_q, robust, gwk, sig2n_k, spat_diag, vm, name_y, name_x, name_yend, name_q, name_w, name_gwk, name_ds)
476 self, y=y, x=x_constant, w=w.sparse, yend=yend2, q=q2,
477 w_lags=w_lags, robust=robust, gwk=gwk,
--> 478 lag_q=lag_q, sig2n_k=sig2n_k)
479 self.rho = self.betas[-1]
480 self.predy_e, self.e_pred, warn = sp_att(w, self.y, self.predy,
/data/anaconda2/envs/dlatk/lib/python3.5/site-packages/pysal/spreg/twosls_sp.py in __init__(self, y, x, yend, q, w, w_lags, lag_q, robust, gwk, sig2n_k)
175
176 TSLS.BaseTSLS.__init__(self, y=y, x=x, yend=yend, q=q,
--> 177 robust=robust, gwk=gwk, sig2n_k=sig2n_k)
178
179
/data/anaconda2/envs/dlatk/lib/python3.5/site-packages/pysal/spreg/twosls.py in __init__(self, y, x, yend, q, h, robust, gwk, sig2n_k)
157 self.k = z.shape[1]
158 hth = spdot(h.T, h)
--> 159 hthi = la.inv(hth)
160 zth = spdot(z.T, h)
161 hty = spdot(h.T, y)
/data/anaconda2/envs/dlatk/lib/python3.5/site-packages/numpy/linalg/linalg.py in inv(a)
511 signature = 'D->D' if isComplexType(t) else 'd->d'
512 extobj = get_linalg_error_extobj(_raise_linalgerror_singular)
--> 513 ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
514 return wrap(ainv.astype(result_t, copy=False))
515
/data/anaconda2/envs/dlatk/lib/python3.5/site-packages/numpy/linalg/linalg.py in _raise_linalgerror_singular(err, flag)
88
89 def _raise_linalgerror_singular(err, flag):
---> 90 raise LinAlgError("Singular matrix")
91
92 def _raise_linalgerror_nonposdef(err, flag):
LinAlgError: Singular matrix
from pysal.spreg.diagnostics_sp import LMtests # <-- slightly different import than what you suggested
a = LMtests(m1, subset)
print(a.lme, a.lml, a.rlme, a.rlml, a.sarma)
>>> ((nan, nan), (nan, nan), (nan, nan), (nan, nan), (nan, nan))
Moving over to spreg.
This is really hard to diagnose without being able to replicate it. The fact that @sjgiorgi can run Moran's I doesn't help much, since Moran's I is computed on the single variable, while the LM tests are computed on the regression residuals.
The GM_Lag error may be more informative. Usually, this LinAlgError: Singular matrix error in GM_Lag occurs when there is a variable that mimics the weights matrix, so X and WX are perfectly collinear.
@sjgiorgi could you please try running OLS and GM_Lag using your y variable with a different X, and also a different y with your X?
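One rough way to check the collinearity idea directly is to look at the rank of the instrument matrix. The sketch below assumes the matrix h inverted in the traceback is built as [constant, X, WX] and uses a small dense numpy array in place of the real weights; `instrument_rank_report` and the toy data are hypothetical, not part of spreg.

```python
import numpy as np

def instrument_rank_report(x, w):
    """Return (rank, n_columns) of H = [1, X, WX]; rank < n_columns means H'H is singular."""
    n = x.shape[0]
    wx = w @ x  # spatial lag of X
    h = np.hstack([np.ones((n, 1)), x, wx])
    return int(np.linalg.matrix_rank(h)), h.shape[1]

# Row-standardized weights for 4 units arranged in a ring (toy example).
w = np.array([
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
])

# This X alternates around the ring, so WX = -X: perfectly collinear.
x_bad = np.array([[1.0], [-1.0], [1.0], [-1.0]])
# This X has no such structure, so H keeps full column rank.
x_ok = np.array([[1.0], [2.0], [4.0], [8.0]])

bad_report = instrument_rank_report(x_bad, w)
ok_report = instrument_rank_report(x_ok, w)
print(bad_report, ok_report)
```

If the rank comes back lower than the number of columns on the real data, that would point to X and WX being collinear, and swapping in a different X should make the error go away.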
Closed due to no response.
I'm trying to run a simple spatial regression (following the tutorial here) but I'm getting nan values for the Spatial dependence diagnostics. I'm not sure why. No warnings are being printed. Any idea why I am getting nan values?
I can successfully calculate Moran's I, and everything else in the OLS call works, so in some sense my data is set up correctly.
Code:
I unfortunately cannot share the variable my_var, so this can't be reproduced. I tried reproducing it on some open-source county-level data (from County Health Rankings), and everything works fine.