lrberge / fixest

Fixed-effects estimations
https://lrberge.github.io/fixest/
361 stars 59 forks source link

Feature Request: fenegbin with weights argument #443

Closed zwright2482 closed 9 months ago

zwright2482 commented 9 months ago

Fixest is a tremendous package--thank you for the effort to put it together!

From what I can tell, weights (e.g., from PSM or similar) cannot currently be included in negative binomial regression using "fenegbin". Is there a reason that this is not currently doable in the package?

I have been able to include weights into other regressions within fixest (e.g., feols, fepois, etc.); however, when I have tried to include weights in a model using fenegbin, I get the following error message:

Warning message: fenegbin(fml = ...: weight is not a valid argument for function fenegbin.

For a more specific example, I have successfully used (i.e., yielding no warning message or errors) weights with fepois below:

pois.mod = fepois(y~x1+x2+x3+x4+x5+x6+x7 | fe1+ fe2, data=data, weight=~weights

The error message for fenegbin mentioned above comes when I try the following:

negbin.mod = fenegbin(y~x1+x2+x3+x4+x5+x6+x7 | fe1+ fe2, data=data, weight=~weights

Thank you in advance!

lrberge commented 9 months ago

Hi, and thanks for the words! :-)

Simple answer: no it's not implemented and will not be implemented.

Why. fenegbin is based on likelihood maximization and not GLM. When I conceived the code I didn't think about the weights. The code is uber yucky and adding weights to it is non trivial.

Possible workaround?. You can tweak the function MASS::glm.nb to make it work.

The following example works without fixed-effects:

library(fixest)
base = setNames(iris, c("y", "x1", "x2", "x3", "species"))

negbin_fitter = function(x, y, weights, family, etastart, ...){
    res = feglm.fit(X = x, y = y, weights = weights, family = family, etastart = etastart)
    res$df.residual = degrees_freedom_iid(m, "resid")
    res
}

MASS::glm.nb(y ~ x1, base, method = "negbin_fitter")
#> Call:  MASS::glm.nb(formula = y ~ x1, data = base, method = "negbin_fitter", 
#>     init.theta = 891512.9146, link = log)
#> 
#> Coefficients:
#> (Intercept)           x1
#>     1.88234     -0.03833
#> 
#> Degrees of Freedom: Total (i.e. Null);  148 Residual
#> Error in signif(x$null.deviance, digits) : 
#>   non-numeric argument to mathematical function
#> In addition: Warning messages:
#> 1: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace >  :
#>   iteration limit reached
#> 2: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace >  :
#>   iteration limit reached

MASS::glm.nb(y ~ x1, base)
#> Call:  MASS::glm.nb(formula = y ~ x1, data = base, init.theta = 891457.2513,
#>     link = log)
#> 
#> Coefficients:
#> (Intercept)           x1
#>     1.88234     -0.03833
#> 
#> Degrees of Freedom: 149 Total (i.e. Null);  148 Residual
#> Null Deviance:      17.36
#> Residual Deviance: 17.12        AIC: 566.5
#> There were 50 or more warnings (use warnings() to see the first 50)

To make it work with fixed-effects, glm.nb needs to be changed because by construction it cannot pass FEs (see here).

Note that these solution may entail more work to get the SE right. This tweaked glm.nb will be (by construction) much slower than fenegbin.