Open simonschoe opened 1 year ago
@simonschoe, pretty easy to modify the code from https://github.com/cran/car/blob/d67a2f5ed4013c8766e8d3e827dcaf6aaaf7f0fa/R/vif.R
library(fixest)
library(car)
#> Loading required package: carData
vif_feols <- function(mod) {
v <- vcov(mod)
# Ad-hoc way of getting "assign" attr from feols
vars = gsub("::(.*)$", "", colnames(model.matrix(mod)))
assign <- as.numeric(factor(vars, levels = unique(vars))) - 1
if (names(coefficients(mod)[1]) == "(Intercept)") {
v <- v[-1, -1]
assign <- assign[-1]
} else {
warning("No intercept: vifs may not be sensible.")
}
terms <- labels(terms(mod))
n.terms <- length(terms)
if (n.terms < 2) stop("model contains fewer than 2 terms")
R <- cov2cor(v)
detR <- det(R)
result <- matrix(0, n.terms, 3)
rownames(result) <- terms
colnames(result) <- c("GVIF", "Df", "GVIF^(1/(2*Df))")
for (term in 1:n.terms) {
subs <- which(assign == term)
result[term, 1] <- det(as.matrix(R[subs, subs])) *
det(as.matrix(R[-subs, -subs])) / detR
result[term, 2] <- length(subs)
}
if (all(result[, 2] == 1)) {
result <- result[, 1]
} else {
result[, 3] <- result[, 1]^(1/(2 * result[, 2]))
}
result
}
model <- lm(mpg ~ disp + hp + wt + factor(cyl), data = mtcars)
model_feols <- feols(mpg ~ disp + hp + wt + i(cyl), data = mtcars)
car::vif(model)
#> GVIF Df GVIF^(1/(2*Df))
#> disp 12.900887 1 3.591780
#> hp 3.531254 1 1.879163
#> wt 5.368271 1 2.316953
#> factor(cyl) 8.993015 2 1.731715
vif_feols(model_feols)
#> GVIF Df GVIF^(1/(2*Df))
#> disp 12.900887 1 3.591780
#> hp 3.531254 1 1.879163
#> wt 5.368271 1 2.316953
#> i(cyl) 8.993015 2 1.731715
Created on 2023-05-05 with reprex v2.0.2
@kylebutts indeed thanks for the hint. However, as soon as I include fixed effects (which absorb the intercept), car::vif()
returns the following warning and prevents a meaningful interpretation of the VIFs: Warning: No intercept: vifs may not be sensible.
This can be reproduced via:
feols(mpg ~ disp + hp + wt + i(cyl) | carb, data = mtcars) |>
vif_feols()
As a workaround, I've implemented a custom solution that first demeans all variables and computes VIFs from the demeaned data like so (where mod
is a feols
object):
get_max_vif <- \(mod) {
# fit OLS on demeaned data
ols <- lm(y ~ ., data = bind_cols(mod$X_demeaned, y = mod$y_demeaned))
# calculate VIF scores
car::vif(ols, type = "terms") |> max()
}
Is this a sensible approach?
I'm not sure! I haven't worked much with VIF, so I don't know how to think about it with demeaned data. Sorry I can't be of more help in that regard
Is there a solution to compute variance inflation factors (VIFs) from
fixest
objects? To the best of my knowledge,car::vif()
only accepts "baseR
" model objects, such aslm
orglm
.