Closed s3alfisc closed 1 month ago
Relatedly: with hetero errors, changing the cluster.adj
from TRUE
to FALSE
leads to a change in standard errors, which I believe should not be the case? Likely related to this line adj = ifelse(ssc$cluster.adj, n/(n - 1), 1)
in fixest:::vcov_hetero_internal
?
fit1 = feols(
proposition_vote ~ treatment,
weights = ~weights,
data = voters,
vcov = "hetero",
ssc = ssc(adj = TRUE, cluster.adj = TRUE)
)
fit2 = feols(
proposition_vote ~ treatment,
weights = ~weights,
data = voters,
vcov = "hetero",
ssc = ssc(adj = TRUE, cluster.adj = FALSE)
)
etable(fit1, fit2)
# Dependent Var.: proposition_vote proposition_vote
#
# Constant 0.8746*** (0.0308) 0.8746*** (0.0308)
# treatment 0.1131*** (0.0317) 0.1131*** (0.0316)
# _______________ __________________ __________________
# S.E. type Heteroskedas.-rob. Heteroskedas.-rob.
# Observations 300 300
# R2 0.04762 0.04762
# Adj. R2 0.04443 0.04443
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
se(fit1)
# > se(fit1)
# (Intercept) treatment
# 0.03084960 0.03167428
se(fit2)
# > se(fit2)
# (Intercept) treatment
# 0.03079814 0.03162145
We also have that
vcov(fit1) / vcov(fit2)
N = nobs(fit1)
N / (N-1)
Here are the dof corrections in comparison to sandwich
:
library(fwildclusterboot)
library(fixest)
data(voters)
head(voters)
voters = na.omit(voters)
fit = feols(
proposition_vote ~ treatment + income,
#weights = ~weights,
data = voters,
vcov = "hetero",
ssc = ssc(adj = FALSE)
)
N = nobs(fit)
k = length(coef(fit))
adj1 = N / (N-1)
adj2 = (N-1) / (N-k)
adj3 = N / (N-k)
sandwich::vcovHC(fit, type = "HC1") / sandwich::vcovHC(fit, type = "HC0")
# [1] 1.010101
adj1
# [1] 1.003344
adj2
# [1] 1.006734
N / (N-k)
# [1] 1.010101
(vcov(fit, vcov = "hetero", ssc = ssc(adj = FALSE, cluster.adj = FALSE)) ) / sandwich::vcovHC(fit, type = "HC0")
# FALSE, FALSE -> no adjustment
(vcov(fit, vcov = "hetero", ssc = ssc(adj = TRUE, cluster.adj = FALSE)) ) / sandwich::vcovHC(fit, type = "HC0")
adj2
(vcov(fit, vcov = "hetero", ssc = ssc(adj = FALSE, cluster.adj = TRUE)) ) / sandwich::vcovHC(fit, type = "HC0")
adj1
(vcov(fit, vcov = "hetero", ssc = ssc(adj = TRUE, cluster.adj = TRUE)) ) / sandwich::vcovHC(fit, type = "HC0")
adj3
I went back to our mail exchange from a very long time ago and it turns out that you've already answered my question @lrberge - I just did not get it properly 🤦
I went back to my mail exchange with Laurent & it turns out he already answered my question - I just misunderstood him 😅 But know I think I've got it!
My misunderstanding was that the cluster.adj argument was only relevant when clustered errors are computed. But this not the case! In fact, cluster.adj multiplies even heteroskedastic errors with a factor of G / (G-1) if it is TRUE
. Because all clusters are singletons, we have that G / (G-1) = N / (N-1).
Therefore, we have the follwing ssc correction:
adj = TRUE & cluster.adj = TRUE: multiply with (N-1) / (N-k) x N / (N-1) = N / (N-k)
adj = TRUE & cluster.adj = FALSE: (N-1) / (N-k)
adj = FALSE & cluster.adj = FALSE: 1
adj = FALSE & cluster.adj = TRUE: N / (N-1)
And this is also what fixest
implements =) Sorry!
Hi Laurent (@lrberge),
On my quest for identical standard errrors between
fixest
andpyfixest
, I have consistently had difficulties to match vcov matrices for heteroskedastic errors. The main difference seems to arise because of different small sample corrections.Contrary to what is described in the vignette on standard errors, the heteroskedastic vcov seems to get a small sample correction of N / N-1 if
adj = TRUE
.Is this documented anywhere? If not, where could I best add documentation via a PR?
Note that in the
fixest:::vcov_hetero_internal
function, the logic seems to be based oncluster.adj
flag, which I assume is internally always set toFALSE
for heteroskedastic errors as the N / (N-1) correction does not seem to be applied when settingcluster.adj = FALSE
via thefeols()
API, while it has an impact when adjustssc=ssc(adj = FALSE)
.Example
fixest:::vcov_hetero_internal
}