Open mark-andrews opened 3 years ago
This one sounds fun :)
What would you prefer: 1. when std_coef is TRUE only the std coefficient is added to the results. 2. when std_coef is TRUE, we remove the unstandardised coef and its CI and replace it with the sdt coef and its CI.
How about this:
summary_lm <- function(model, ci = 0.95, vif = TRUE, std_coef = TRUE){
summary <- summary(model)
coefs <- summary$coef %>% as.data.frame() %>%
rownames_to_column("Predictors")
cis <- confint(model, level = ci) %>% as.data.frame() %>%
rownames_to_column("Predictors") %>%
rename(Lower = `2.5 %`,
Upper = `97.5 %`)
results <- left_join(coefs, cis, by = "Predictors") %>%
select(Predictors, Estimate, Lower, Upper, everything()) %>%
rename(`p value` = `Pr(>|t|)`)
if(std_coef){
results <- coef(lm.beta::lm.beta(model)) %>% as.data.frame() %>%
rownames_to_column("Predictors") %>%
rename(`Std. Estimate` = ".") %>%
left_join(results, by = "Predictors")
}
if(vif){
vifs <- c(NA, car::vif(model)) %>% as_tibble() %>%
rename(vif = value)
results <- bind_cols(results, vifs)
}
fstat <- summary$fstatistic %>% as_tibble() %>% pull(value)
p_value <- pf(fstat[1], fstat[2], fstat[3], lower.tail = FALSE)
r2 <- summary$r.squared
r2adj <- summary$adj.r.squared
loglik <- logLik(model)
aic <- AIC(model)
dev <- deviance(model)
final_summary <- list(
`ANOVA` = paste0("F(", fstat[2], ", ", fstat[3], ") = ",
round(fstat[1],2),", p = ", signif(p_value,3)),
`R-squared` = r2,
`Adjusted R-squared` = r2adj,
`Log-Likelihood` = loglik,
`AIC` = aic,
`Deviance` = dev,
`Coefficients table` = results %>%
mutate_if(is.numeric, signif, 2),
`Confidence interval` = paste0(ci*100, "%")
)
return(final_summary)
}
model <- lm(faithful ~ trustworthy + attractive + sex_dimorph, data = faithfulfaces)
summary_lm(model, ci = 0.95, vif = TRUE, std_coef = TRUE)
which gives you this
$ANOVA
[1] "F(3, 166) = 78.74, p = 9.95e-32"
$`R-squared`
[1] 0.5873017
$`Adjusted R-squared`
[1] 0.5798433
$`Log-Likelihood`
'log Lik.' -158.0997 (df=5)
$AIC
[1] 326.1994
$Deviance
[1] 63.93827
$`Coefficients table`
Predictors Std. Estimate Estimate Lower Upper Std. Error t value p value vif
1 (Intercept) 0.00 4.80 4.00 5.50 0.360 13.0 1.1e-27 NA
2 trustworthy 0.55 0.67 0.53 0.80 0.070 9.5 2.9e-17 1.4
3 attractive -0.25 -0.28 -0.44 -0.12 0.080 -3.5 6.0e-04 2.1
4 sex_dimorph -0.41 -0.40 -0.53 -0.27 0.066 -6.0 1.2e-08 1.9
$`Confidence interval`
[1] "95%"
The
summary
forlm
models is good, but a few optional extras would be nice. For example, for the coefficients table, we might likeFor the model summary, we might like
It could like this, e.g.