lrberge / fixest

Fixed-effects estimations
https://lrberge.github.io/fixest/

Suggestion: make `fixef.rm = 'both'` the default #496

Open statzhero opened 1 month ago

statzhero commented 1 month ago

I saw another issue on singletons, which suggests replicating the recursive singleton deletion of Stata's reghdfe.
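To make "recursive" concrete, here is a minimal base-R sketch of the idea. It is only an illustration: `drop_singletons()` is a hypothetical helper I made up, not part of fixest, and it is not how reghdfe or fixest implement the deletion internally. The point is that with more than one fixed effect, removing a singleton in one dimension can create a new singleton in another, so the check has to be repeated until nothing changes.

```r
# Hypothetical helper (not part of fixest or reghdfe): repeatedly drop rows
# whose level occurs only once in any of the given fixed-effect columns.
drop_singletons <- function(data, fe_vars) {
  repeat {
    keep <- rep(TRUE, nrow(data))
    for (v in fe_vars) {
      counts <- table(data[[v]])
      # Look up each row's level count; keep rows whose level occurs > 1 time
      keep <- keep & (as.vector(counts[as.character(data[[v]])]) > 1)
    }
    # Stop once a full pass removes nothing
    if (all(keep)) return(data)
    data <- data[keep, , drop = FALSE]
  }
}

# Made-up toy data with two fixed-effect dimensions
toy <- data.frame(
  firm   = c("A", "A", "A", "A", "B", "B"),
  worker = c(1, 1, 2, 2, 2, 3)
)

res <- drop_singletons(toy, c("firm", "worker"))
nrow(res)  # 4
```

In the first pass only the last row is dropped (worker 3 appears once). That leaves firm B with a single observation, so the second pass drops the fifth row as well, even though it was not a singleton in the original data.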

As I understand it, there are good reasons to remove singletons by default. For example, see the justification for why reghdfe removes them by default: https://scorreia.com/research/singletons.pdf

Below is an example of what can happen to the standard errors and R2. (The difference in the standard errors is small in this example, but the overall R2 drops from 0.787 to 0.579, while the within R2 is unchanged.)

```r
library(dplyr)
library(fixest)

set.seed(56)

n <- 1000

# Create a data frame with multiple columns
df <- tibble(
  id = c(rep(1:500, each = 1), rep(501:600, each = 5)),
  x1 = rnorm(n),
  error = rnorm(n, mean = 0, sd = 1)
)

beta_1 <- 1

df <- df |>
  mutate(y = beta_1 * x1 + error)

est <- feols(y ~ x1 | id, df)
est_sing <- feols(y ~ x1 | id, df, fixef.rm = 'both')
#NOTE: 500 fixed-effect singleton was removed (500 observations, breakup: 500).

etable(est, est_sing)
#                               est           est_sing
#Dependent Var.:                  y                  y
#                                                     
#x1              0.9065*** (0.0489) 0.9065*** (0.0491)
#Fixed-Effects:  ------------------ ------------------
#id                             Yes                Yes
#_______________ __________________ __________________
#S.E.: Clustered             by: id             by: id
#Observations                 1,000                500
#R2                         0.78738            0.57945
#Within R2                  0.47242            0.47242
#---
#Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
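As a sanity check on the NOTE above, the singleton count can be reproduced without fixest. This is a rough base-R sketch that rebuilds only the `id` column from the example; the names `n_per_id`, `singleton_ids`, and `is_singleton` are just illustrative.

```r
# Same id layout as in the example: 500 ids with one observation each,
# followed by 100 ids with five observations each.
id <- c(rep(1:500, each = 1), rep(501:600, each = 5))

# Number of observations per id
n_per_id <- table(id)

# ids observed exactly once are singletons
singleton_ids <- names(n_per_id)[n_per_id == 1]
is_singleton <- as.character(id) %in% singleton_ids

sum(is_singleton)   # 500 observations would be dropped
sum(!is_singleton)  # 500 observations remain
```

With a single fixed effect, one pass like this is enough, since dropping a singleton id cannot turn another id into a singleton.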