jepusto / clubSandwich

Cluster-robust (sandwich) variance estimators with small-sample corrections
http://jepusto.github.io/clubSandwich/
47 stars 8 forks source link

Possibly incorrect finite sample adjustment for CR1S w/ multivariate model #32

Closed lukesonnet closed 6 years ago

lukesonnet commented 6 years ago

Hi!

Was benchmarking an estimator I'm working on and noticed that I believe your finite sample adjustment of the "stata" flavor may be wrong when working with multiple outcomes. Here's a MWE:

lmo <- lm(cbind(mpg, hp) ~ wt, data = mtcars)
clubSandwich::vcovCR(lmo, cluster = mtcars$cyl, type = "CR0")

# Manually build correction
J <- length(unique(mtcars$cyl))
n <- nrow(mtcars)
r <- 2

# Not equivalent
clubSandwich::vcovCR(lmo, cluster = mtcars$cyl, type = "CR1S")
clubSandwich::vcovCR(lmo, cluster = mtcars$cyl, type = "CR0") * ((J * (n - 1)) / ((J - 1) * (n - r)))

# can see here that your N is actually n * ny, where ny is the number of models
# and r is actually r*ny, where r is the rank of X
debugonce(clubSandwich:::CR1S)
clubSandwich::vcovCR(lmo, cluster = mtcars$cyl, type = "CR1S")

Now, I'm not actually sure that the correction you are using is wrong, but it does mean that the block diagonals of your vcov matrix are not the same as you would get from running the regression separately for each outcome, which I believe it should be.

jepusto commented 6 years ago

I agree that the results are not consistent across the marginal and multivariate models. But this is because the Stata-type correction is an ad hoc small sample adjustment. Because there's no particular principle behind the development of the correction, there's no clearly right way to extend it beyond linear models with univariate outcomes.

On the other hand, the CR1, CR2, and CR3 corrections do actually behave consistently across both types of model specifications:

type <- "CR2"
V_full <- clubSandwich::vcovCR(lm(cbind(mpg, hp) ~ wt, data = mtcars), cluster = mtcars$cyl, type = type)
V_mpg <- clubSandwich::vcovCR(lm(mpg ~ wt, data = mtcars), cluster = mtcars$cyl, type = type)
V_hp <- clubSandwich::vcovCR(lm(hp ~ wt, data = mtcars), cluster = mtcars$cyl, type = type)

all.equal(V_full[1:2, 1:2], V_mpg[1:2,1:2], check.attributes = FALSE)
all.equal(V_full[3:4, 3:4], V_hp[1:2,1:2], check.attributes = FALSE)

I think that provides a good reason for using CR2 over CR1S. Plus we've got simulation evidence that CR1, CR1S, and CR3 don't produce CIs with the correct coverage levels.

lukesonnet commented 6 years ago

Thanks for the quick response. I guess I just wanted to make sure this was a conscious decision on your part! Especially after I saw that the finite sample corrections applied by sandwich in say, HC1 estimation, were the same across univariate and multivariate models.

Thanks again!

lukesonnet commented 6 years ago

Oh and don't worry, we default to CR2 in our estimators :)

jepusto commented 6 years ago

Yes that's a good point about the approach in the sandwich package. That's actually a pretty recent change, and it made sense in the context of HC robust estimators---that is, since the package was focused on HC-corrections, there was a more-or-less clear right answer to how to extend to the multivariate case. Now that sandwich does cluster-robust stuff too, well....

lukesonnet commented 6 years ago

Oh wow, I didn't know sandwich::vcovCL existed. Do they implement your solution to the problem outlined in your paper w/ Tipton re: clusters and FEs intersecting?

jepusto commented 6 years ago

No, not as far as I know.

Sent from my iPhone

On Mar 6, 2018, at 4:38 PM, Luke Sonnet notifications@github.com wrote:

Oh wow, I didn't know sandwich::vcovCL existed. Do they implement your solution to the problem outlined in your paper w/ Tipton re: clusters and FEs intersecting?

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub, or mute the thread.