Open rcragun opened 1 year ago
If you specify `clusters` for `lh_robust`, the confidence intervals (CIs) and p-values in `$lh` are inconsistent with those in `$lm_robust`.
The problem can be seen by using a hypothesis that one coefficient equals 0.
Simple example data:
    library(estimatr)
    nSize = 12
    dat = data.frame(
      x = rnorm(nSize),
      e = rnorm(nSize),
      # Irrelevant clusters for errors
      eg = sample(2, nSize, replace=T)
    )
    dat$z = dat$x + dat$e
CIs match when not correcting for error correlation:
    > lh_robust(z~x, data=dat, se_type='HC2', linear_hypothesis='x=0')
    $lm_robust
                 Estimate Std. Error    t value   Pr(>|t|)    CI Lower CI Upper DF
    (Intercept) -0.137880  0.2850087 -0.4837747 0.63896458 -0.77291900 0.497159 10
    x            0.620707  0.3135789  1.9794287 0.07594477 -0.07799024 1.319404 10

    $lh
        Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
    x=0   0.6207     0.3136   1.979  0.07594 -0.07799    1.319 10
CIs and p-values don't match once clusters are specified to correct for error correlation:
    > lh_robust(z~x, data=dat, clusters=eg, se_type='stata', linear_hypothesis='x=0')
    $lm_robust
                 Estimate Std. Error    t value  Pr(>|t|)  CI Lower CI Upper DF
    (Intercept) -0.137880  0.4824367 -0.2857991 0.8227790 -6.267819 5.992059  1
    x            0.620707  0.4538092  1.3677710 0.4019017 -5.145485 6.386899  1

    $lh
        Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
    x=0   0.6207     0.4538   1.368   0.2013  -0.3904    1.632 10
Using other `se_type` values does not change this.
The problem may be due to a difference in the degrees of freedom used: in the clustered output, `$lm_robust` reports DF = 1 while `$lh` reports DF = 10. I am unsure whether this is the same issue as https://github.com/DeclareDesign/estimatr/issues/289.
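One way to check the degrees-of-freedom explanation without touching estimatr internals is to recompute the interval and p-value by hand from the estimate and standard error printed in the clustered output. The sketch below uses only base R; the helper name `t_inference` is made up for illustration, and the inputs are copied from the printed `x` row rather than taken from a fitted object.

```r
# Estimate and clustered standard error for x, copied from the output above.
est <- 0.620707
se  <- 0.4538092

# Hypothetical helper: t-based CI and two-sided p-value for a given DF.
t_inference <- function(est, se, df, level = 0.95) {
  crit <- qt(1 - (1 - level) / 2, df)   # critical value of the t distribution
  c(ci_lower = est - crit * se,
    ci_upper = est + crit * se,
    p_value  = 2 * pt(-abs(est / se), df))
}

res_df1  <- t_inference(est, se, df = 1)   # DF reported by $lm_robust
res_df10 <- t_inference(est, se, df = 10)  # DF reported by $lh

print(round(rbind(`DF = 1` = res_df1, `DF = 10` = res_df10), 4))
```

With DF = 1 the hand computation reproduces the `$lm_robust` row, and with DF = 10 it reproduces the `$lh` row, which suggests the two parts of the result differ only in the degrees of freedom applied to the same estimate and standard error.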
    > sessionInfo()
    R version 4.1.2 (2021-11-01)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 10 x64 (build 22621)

    Matrix products: default

    locale:
    [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] estimatr_1.0.0

    loaded via a namespace (and not attached):
     [1] httr_1.4.5      compiler_4.1.2  R6_2.5.1        cli_3.6.0       generics_0.1.3  tools_4.1.2
     [7] abind_1.4-5     rstudioapi_0.14 car_3.1-2       Rcpp_1.0.9      carData_3.0-5   mvtnorm_1.1-3
    [13] texreg_1.38.6   Formula_1.2-5   rlang_1.1.0