sgaure / lfe

Source code repository for the R package lfe on CRAN.

Variables with zero weights still count as observations #38

Open roussanoff opened 3 years ago

roussanoff commented 3 years ago

felm() includes observations with zero weights when calculating the number of observations. This leads to different degrees of freedom and, hence, calculated statistics. Here is an example:

library(lfe)

cars <- mtcars
cars$carb[cars$carb == 2] <- 0  # need to have some weights that are zero

# Same weighted regression, fitted with felm() and with lm()
reg_lfe <- felm(mpg ~ cyl,
                weights = cars$carb,
                data = cars)
reg_stats <- lm(mpg ~ cyl,
                weights = carb,
                data = cars)

nobs(reg_lfe)
nobs(reg_stats)

reg_lfe has 32 observations, and reg_stats has 22, because lm() does not count the 10 rows with zero weight. They also have different F-statistics.

I was using felm() and comparing the reported F-stat from the first stage regression to the one computed manually using lm() and linearHypothesis(). I was getting similar, but not identical results. I realized that the reason was that I had a few observations with zero weights, which changed the F-statistic in felm(), so I think it's an important issue.
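
A possible workaround, sketched below, is to drop the zero-weight rows before fitting, so that every row passed to the model carries a positive weight. The name cars_pos is only illustrative. The sketch checks, with base R alone, that lm() gives the same coefficients on the subset as on the full data. It then passes the same subset to felm() if lfe is installed. That felm() then reports the same degrees of freedom and F-statistic as lm() is an assumption here, not something verified in this issue.

```r
# Same setup as in the example above
cars <- mtcars
cars$carb[cars$carb == 2] <- 0

# Keep only rows with a positive weight (cars_pos is an illustrative name)
cars_pos <- cars[cars$carb > 0, ]

# lm() ignores zero-weight rows when counting observations,
# so the full-data fit and the subset fit should agree
fit_full <- lm(mpg ~ cyl, weights = carb, data = cars)
fit_pos  <- lm(mpg ~ cyl, weights = carb, data = cars_pos)
stopifnot(nobs(fit_full) == nrow(cars_pos))
stopifnot(isTRUE(all.equal(coef(fit_full), coef(fit_pos))))

# Fit felm() on the subset, only if lfe is available
if (requireNamespace("lfe", quietly = TRUE)) {
  fit_lfe <- lfe::felm(mpg ~ cyl, weights = cars_pos$carb, data = cars_pos)
  print(nobs(fit_lfe))
}
```

Subsetting up front does not depend on how any particular fitting function treats zero weights, at the cost of the fitted object no longer covering the dropped rows.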