sgaure / lfe

Source code repository for the R package lfe on CRAN.
53 stars 18 forks source link

felm() heteroskedasticity-consistent covariance estimator #46

Open IsadoraBM opened 3 years ago

IsadoraBM commented 3 years ago

Following Torres-Reyna p. 24 plm() allows for three types of heteroskedasticity-consistent covariance estimators. Does lfe have an Arellano subtype or what is a comparable substitute?

MatthieuStigler commented 3 years ago

Interesting question! I see only one type of HC robust errors with felm! It seems default methods are different between felm and plm, and somehow surprisingly, I could not find the plm equivalent of felm's single robust method!? The closest seems to be the white1

library(lfe)
#> Loading required package: Matrix
library(plm)
#> 
#> Attaching package: 'plm'
#> The following object is masked from 'package:lfe':
#> 
#>     sargan
library(tidyverse)

data("Produc", package = "plm")
reg_plm <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
          data = Produc, index = c("state","year"))
reg_felm <- lfe::felm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp|state, data = Produc)

## Check coefs same?
all.equal(coef(reg_plm), coef(reg_felm))
#> [1] TRUE

## Check SE same without robust?
all.equal(sqrt(diag(vcov(reg_plm))),
          coef(summary(reg_felm, robust=FALSE))[,2])
#> [1] TRUE

## Compare default robust SE
sqrt(diag(vcovHC(reg_plm)))
#>  log(pcap)    log(pc)   log(emp)      unemp 
#> 0.06032622 0.06174249 0.08166523 0.00249584
coef(summary(reg_felm, robust=TRUE))[,2]
#>   log(pcap)     log(pc)    log(emp)       unemp 
#> 0.032293539 0.031525025 0.041179824 0.001129398

## Which robust method is felm using, compared to 
expand_grid(method= c("arellano", "white1", "white2"),
            type = c("HC0", "sss", "HC1", "HC2", "HC3", "HC4")) %>% 
  mutate(first_se_plm = map2_dbl(method, type, ~sqrt(vcovHC(reg_plm, method=.x, type=.y)[2,2])),
         diff_felm = abs(first_se_plm-0.031525025)) %>% 
  arrange(diff_felm)
#> # A tibble: 18 x 4
#>    method   type  first_se_plm diff_felm
#>    <chr>    <chr>        <dbl>     <dbl>
#>  1 white1   HC4         0.0312  0.000356
#>  2 white1   HC3         0.0309  0.000613
#>  3 white1   HC2         0.0307  0.000818
#>  4 white1   sss         0.0306  0.000946
#>  5 white1   HC1         0.0306  0.000946
#>  6 white1   HC0         0.0305  0.00102 
#>  7 white2   HC4         0.0268  0.00472 
#>  8 white2   HC3         0.0267  0.00479 
#>  9 white2   HC2         0.0266  0.00491 
#> 10 white2   sss         0.0266  0.00497 
#> 11 white2   HC1         0.0266  0.00497 
#> 12 white2   HC0         0.0265  0.00503 
#> 13 arellano HC0         0.0617  0.0302  
#> 14 arellano HC1         0.0619  0.0304  
#> 15 arellano HC2         0.0621  0.0306  
#> 16 arellano HC3         0.0624  0.0309  
#> 17 arellano sss         0.0625  0.0310  
#> 18 arellano HC4         0.0628  0.0312

Created on 2021-05-08 by the reprex package (v2.0.0)

tappek commented 3 years ago

Unfortunately, documentation in ?summary.felm is not very specific about the standard errors. I assume, one would need to read the code for full clarity.

?summary.felm:

The The standard errors are adjusted for the reduced degrees of freedom coming from the dummies which are implicitly present. They are also small-sample corrected.

If the robust parameter is FALSE, the returned object will contain ordinary standard errors. If the robust parameter is TRUE, clustered standard errors are reported if a cluster was specified in the call to felm; if not, heteroskedastic robust standard errors are reported.

MatthieuStigler commented 3 years ago

Yes, as Simen has been quite silent for a while, I am afraid if you want to find out precisely the correction used, you will need to read the code. Comparing the output with fixest might be an alternative approach.

If you succeed in this and have a suggestion for the documentation, let me know I will be happy to integrate that.