mark-andrews / psyntur

Tools to help teach data analysis using R to NTU Psychology students
Other
5 stars 2 forks source link

Add custom `lm` summary function #5

Open mark-andrews opened 3 years ago

mark-andrews commented 3 years ago

The summary for lm models is good, but a few optional extras would be nice. For example, for the coefficients table, we might like

For the model summary, we might like

It could like this, e.g.

summary_lm(model, ci = 0.95, vif = TRUE, std_coef = TRUE)
jensroes commented 3 years ago

This one sounds fun :)

jensroes commented 3 years ago

What would you prefer: 1. when std_coef is TRUE only the std coefficient is added to the results. 2. when std_coef is TRUE, we remove the unstandardised coef and its CI and replace it with the sdt coef and its CI.

jensroes commented 3 years ago

How about this:

summary_lm <- function(model, ci = 0.95, vif = TRUE, std_coef = TRUE){

  summary <- summary(model)

  coefs <- summary$coef %>% as.data.frame() %>%
    rownames_to_column("Predictors")
  cis <- confint(model, level = ci) %>% as.data.frame() %>%
    rownames_to_column("Predictors") %>%
    rename(Lower = `2.5 %`,
           Upper = `97.5 %`)

  results <- left_join(coefs, cis, by = "Predictors") %>%
    select(Predictors, Estimate, Lower, Upper, everything()) %>%
    rename(`p value` = `Pr(>|t|)`)

  if(std_coef){
    results <- coef(lm.beta::lm.beta(model)) %>% as.data.frame() %>%
      rownames_to_column("Predictors") %>%
      rename(`Std. Estimate` = ".") %>%
      left_join(results, by = "Predictors") 
  }
  if(vif){
    vifs <- c(NA, car::vif(model)) %>% as_tibble() %>%
      rename(vif = value)
    results <- bind_cols(results, vifs)
  }  

  fstat <- summary$fstatistic %>% as_tibble() %>% pull(value)
  p_value <- pf(fstat[1], fstat[2], fstat[3], lower.tail = FALSE)
  r2 <- summary$r.squared
  r2adj <- summary$adj.r.squared

  loglik <- logLik(model)
  aic <- AIC(model)
  dev <- deviance(model)

  final_summary <- list(
    `ANOVA` = paste0("F(", fstat[2], ", ", fstat[3], ") = ", 
                          round(fstat[1],2),", p = ", signif(p_value,3)),
    `R-squared` = r2,
    `Adjusted R-squared` = r2adj,
    `Log-Likelihood` = loglik,
    `AIC` = aic,
    `Deviance` = dev,
    `Coefficients table` = results %>% 
      mutate_if(is.numeric, signif, 2),
    `Confidence interval` = paste0(ci*100, "%")
  )

  return(final_summary)

}

model <- lm(faithful ~ trustworthy + attractive + sex_dimorph, data = faithfulfaces)
summary_lm(model, ci = 0.95, vif = TRUE, std_coef = TRUE)

which gives you this

$ANOVA
[1] "F(3, 166) = 78.74, p = 9.95e-32"

$`R-squared`
[1] 0.5873017

$`Adjusted R-squared`
[1] 0.5798433

$`Log-Likelihood`
'log Lik.' -158.0997 (df=5)

$AIC
[1] 326.1994

$Deviance
[1] 63.93827

$`Coefficients table`
   Predictors Std. Estimate Estimate Lower Upper Std. Error t value p value vif
1 (Intercept)          0.00     4.80  4.00  5.50      0.360    13.0 1.1e-27  NA
2 trustworthy          0.55     0.67  0.53  0.80      0.070     9.5 2.9e-17 1.4
3  attractive         -0.25    -0.28 -0.44 -0.12      0.080    -3.5 6.0e-04 2.1
4 sex_dimorph         -0.41    -0.40 -0.53 -0.27      0.066    -6.0 1.2e-08 1.9

$`Confidence interval`
[1] "95%"