Open JackLandry opened 1 year ago
Apologies in advance for the terse answer (I'm not back to working on the package yet). TL;DR: your comment makes sense, and I'll look into it. In the meantime, here's something that should work.
library(fixest)
base = setNames(iris, c("y", "x1", "x2", "w", "species"))
# make the sample 'dirty' to show it works
base$x1[4:8] = NA
est = feols(y ~ x1, base, weights = ~w)
est_no_w = feols(y ~ x1, base)
# function to compute the weighted mean
dep_mean_weighted = function(x){
    # NOTA: I may create a `depvar` function which is more intuitive
    # (I don't know why it does not yet exist in the `stats` package [or maybe
    # it does but I don't know the function name!])
    y = model.matrix(x, type = "lhs")
    if(!"weights" %in% names(x)){
        return(mean(y))
    }
    # NOTA: I'll add an argument to weights.fixest governing which sample to return
    # to avoid subsetting it + the default will be to have unitary weights
    w = weights(x)[obs(x)]
    sum(w * y) / sum(w)
}
# extra function of interest:
obs_weighted = function(x){
    if(!"weights" %in% names(x)){
        return(nobs(x))
    }
    sum(weights(x)[obs(x)])
}
# registering it
fitstat_register("wmy", dep_mean_weighted, "Mean DV (weighted)")
fitstat_register("wobs", obs_weighted, "Observations (weighted)")
# summoning them
etable(est, est_no_w, fitstat = ~. + wobs + my + wmy)
#>                                        est          est_no_w
#> Dependent Var.:                          y                 y
#>
#> Constant                 4.643*** (0.4787) 6.399*** (0.4856)
#> x1                      0.5547*** (0.1610)  -0.1722 (0.1580)
#> _______________________ __________________ _________________
#> S.E. type                              IID               IID
#> Observations                           145               145
#> R2                                 0.07668           0.00824
#> Adj. R2                            0.07022           0.00131
#> Observations (weighted)             178.60               145
#> Dep. Var. mean                      5.8752            5.8752
#> Mean DV (weighted)                  6.2804            5.8752
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Works beautifully, thank you so much!
Currently, using the argument fitstat = c("my") in etable gives the unweighted dependent variable mean, even if the estimated regression is using weights. I think in most situations, users would want a weighted dependent variable mean if they are running regressions with weights. I don't think there is any option to do this, so it would be great if one could be added. (Really amazingly flexible package otherwise).
I realize I could hypothetically add the weighted mean as an option using fitstat_register, but from a quick look it seems at the very least not straightforward (and maybe not even possible) to compute the weighted mean from a fixest estimation. Is there a way to add the weighted mean option using fitstat_register? And if not, is the best path forward in the near term to do some manual work with the extra_lines argument?
(Edited to add second paragraph, prematurely posted trying to add a new line)
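For anyone who wants to sanity-check the numbers by hand, here is a minimal base-R sketch of the quantity being asked for. It does not use fixest at all. It assumes the same renamed `iris` data as in the reply above, and that the estimation sample is simply the rows with no missing `y`, `x1`, or `w`. The names `keep`, `unweighted_mean`, `weighted_mean`, and `weighted_obs` are only illustrative.

```r
# Base-R sketch of the weighted dependent-variable mean (no fixest needed).
# Assumption: same renamed iris data as in the reply (y, x1, x2, w, species).
base = setNames(iris, c("y", "x1", "x2", "w", "species"))
base$x1[4:8] = NA

# Assumption: the estimation sample is the set of rows with no missing
# value in the dependent variable, the regressor, or the weights.
keep = complete.cases(base[, c("y", "x1", "w")])
y = base$y[keep]
w = base$w[keep]

unweighted_mean = mean(y)
weighted_mean = sum(w * y) / sum(w)   # same quantity as weighted.mean(y, w)
weighted_obs = sum(w)                 # sum of the weights over the sample

print(c(unweighted = unweighted_mean,
        weighted = weighted_mean,
        sum_w = weighted_obs))
```

If the sample assumption holds, these values should line up with the "Dep. Var. mean", "Mean DV (weighted)", and "Observations (weighted)" rows of the etable output above (5.8752, 6.2804, and 178.60 respectively).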