Closed mayer79 closed 1 month ago
Thanks for the issue and for the MRE!
Glum and Statsmodels produce the same standard errors, except for a correction for the degrees of freedom in the model. Glum uses n / (n - k - 1), where k is the number of features in the model, whereas Statsmodels doesn't correct homoscedastic standard errors at all. To verify this:
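import glum
import numpy as np
import statsmodels.api as sm

n = 100
rng = np.random.default_rng(0)
X = rng.standard_normal((n, 5))
y = (X[:, 0] > 0) + rng.uniform(size=n)

# Glum's degrees-of-freedom correction: n / (n - k - 1)
dof_corr = len(y) / (len(y) - X.shape[1] - 1)

# Non-robust standard errors from glum (unpenalized fit)
glum_se = (
    glum.GeneralizedLinearRegressor(alpha=0)
    .fit(X, y)
    .std_errors(X=X, y=y, robust=False)
)

# Non-robust standard errors from statsmodels
sm_se = sm.GLM(y, sm.add_constant(X)).fit(cov_type="nonrobust").bse

# Equal once the statsmodels standard errors are scaled by sqrt(dof_corr)
np.testing.assert_array_almost_equal(glum_se, sm_se * np.sqrt(dof_corr))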
The standard errors are equal up to fifteen digits. :)
You are faster than the wind, thanks! I have also added the results from R's glm().
If I want to match the result of statsmodels, I'd thus need to multiply the VC matrix by 1/dof_corr. Nice!
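For a sense of scale, here is a minimal sketch of how large this correction is for the example above. It uses only n = 100 and k = 5 from the snippet and plain Python arithmetic; it does not call glum or statsmodels, and the variance value is a hypothetical number chosen for illustration.

```python
import math

n, k = 100, 5  # sample size and number of features from the example above

# Degrees-of-freedom factor applied to the covariance matrix.
dof_corr = n / (n - k - 1)

# Standard errors scale with the square root of the covariance factor.
se_ratio = math.sqrt(dof_corr)

print(f"covariance factor:     {dof_corr:.4f}")
print(f"standard-error factor: {se_ratio:.4f}")

# Dividing a variance by dof_corr divides its standard error by sqrt(dof_corr).
var_corrected = 0.002  # hypothetical variance, for illustration only
se_corrected = math.sqrt(var_corrected)
se_rescaled = math.sqrt(var_corrected / dof_corr)
assert math.isclose(se_rescaled, se_corrected / se_ratio)
```

At this sample size the factor moves the standard errors by only about 3%.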
@lbittarello I am probably making a very stupid mistake, but why are these standard errors still different from those of statsmodels?
VC = model.covariance_matrix(X=X, y=y, store_covariance_matrix=True)
np.sqrt(VC.diagonal() / dof_corr)
# array([0.0434066, 0.0405607 , 0.03731663, 0.04683188, 0.03192106, 0.03436726])
Glum computes robust standard errors by default. You should get the same SEs with:
VC = model.covariance_matrix(X=X, y=y, store_covariance_matrix=True, robust=False)
np.sqrt(VC.diagonal() / dof_corr)
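To illustrate why robust and non-robust standard errors can disagree, here is a rough standalone sketch for a single-feature linear regression in plain Python. The simulated data, the variable names, and the HC0-style sandwich formula are all assumptions made for this illustration; which sandwich variant glum applies is not stated in this thread.

```python
import math
import random

rng = random.Random(0)
n = 2000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Noise whose spread grows with |x|, i.e. heteroscedastic errors.
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5 + abs(xi)) for xi in x]

# Ordinary least squares fit with an intercept.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar
resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

# Model-based (non-robust) variance of the slope: sigma^2 / Sxx,
# with sigma^2 = sum(residuals^2) / (n - p - 1) and p = 1.
sigma2 = sum(e ** 2 for e in resid) / (n - 2)
se_nonrobust = math.sqrt(sigma2 / sxx)

# Sandwich (HC0-style) variance of the slope:
# sum((x - x_bar)^2 * e^2) / Sxx^2.
meat = sum(((xi - x_bar) ** 2) * e ** 2 for xi, e in zip(x, resid))
se_robust = math.sqrt(meat / sxx ** 2)

print(f"non-robust SE of slope: {se_nonrobust:.4f}")
print(f"robust SE of slope:     {se_robust:.4f}")
```

When the noise variance depends on x, as it does in this simulated data, the two estimates differ by far more than a degrees-of-freedom factor would explain.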
Ahaaaa, thanks again!
I'm closing this issue as resolved, but feel free to reopen if you have more questions!
The variance-covariance estimates (and the standard errors derived from their diagonal) can differ by quite some margin (10-20%) from those of statsmodels; see the example below. Is this as expected?
If I am not wrong, glum uses n/(n-p-1) times the inverse of the Fisher information matrix. Not sure what statsmodels is doing, but it seems to be identical to R's glm(), which uses "dispersion * (X'X)^-1", where the inverse is calculated from the R part of the QR decomposition, and dispersion depends on the family. For the normal family, it is sum(residuals^2) / (n-p-1).
Glum
Statsmodels
R
Versions: