Open benwhalley opened 6 years ago
I would like this too, though I would settle for a "coefficients" object like summary.glm()
produces, e.g., summary(mediate(...))$coefficients
Hi! I agree that this would be useful. For the purposes of a recent analysis, I extracted the values and put them into a dataframe and then produced a LaTeX table using the gt
package. I'm sharing the code here in case it is useful for someone with a similar request.
Note that the function to create the dataframe is based on the print.summary.mediate
function that is part of the package (see https://github.com/kosukeimai/mediation/blob/master/R/mediate.R). Depending on the type of model you are looking for, it might have to be adapted. The packages mediation
, tidyverse
, and gt
need to be loaded for the code below.
# Function: create_mediation_df
# Turns the mediation output into a dataframe
# Based on the mediation package's print.summary.mediate function
# In case several summaries are to be combined, optionally takes a variable name
# and model name as input to distinguish these when rendering a table
# Parameters: model_object: mediation output object
# variable_name: optional parameter if the outcome variable name should be added to the df
# model_name: optional parameter if the model name be added to the df
# Returns: dataframe with mediation summary information, optionally with variable and model names
create_mediation_df <- function(model_object, variable_name = NULL, model_name = NULL) {
x <- summary(model_object)
clp <- 100 * x$conf.level
smat <- c(x$d1, x$d1.ci, x$d1.p)
smat <- rbind(smat, c(x$z0, x$z0.ci, x$z0.p))
smat <- rbind(smat, c(x$tau.coef, x$tau.ci, x$tau.p))
smat <- rbind(smat, c(x$n0, x$n0.ci, x$n0.p))
rownames(smat) <- c("ACME", "ADE", "Total Effect", "Proportion Mediated")
colnames(smat) <- c("Estimate", paste(clp, "% CI Lower", sep=""),
paste(clp, "% CI Upper", sep = ""), "p-value")
data <- as.data.frame(printCoefmat(smat, tst.ind = NULL)) %>%
rownames_to_column(var = "Measure") %>%
add_row(Measure = "N Observations", Estimate = x$nobs) %>%
add_row(Measure = "N Simulations", Estimate = x$sims) %>%
# add variable and model names if provided
mutate(Variable = variable_name,
Model = model_name)
return(data)
}
Then, to print this as a table (e.g. in LaTeX format), I used the gt
package:
# create dataframes with mediation summary information
df.med_1 <- create_mediation_df(med.out_1_baseline, "Outcome 1", "Baseline Model")
df.med_2 <- create_mediation_df(med.out_1_controlled, "Outcome 1", "Model With Controls")
df.med_3 <- create_mediation_df(med.out_2_baseline, "Outcome 2", "Baseline Model")
df.med_4 <- create_mediation_df(med.out_2_controlled, "Outcome 2", "Model With Controls")
# bind dataframes together
data <- rbind(df.med_1, df.med_2, df.med_3, df.med_4)
# create output table (in LaTeX format in this example)
table.mediation <- data %>%
# add stars for p values
mutate(" " = dplyr::case_when(`p-value` < 0.001 ~ "***",
`p-value` < 0.01 ~ "**",
`p-value` < 0.05 ~ "*",
TRUE ~ "")) %>%
# format p values
mutate(`p-value` = format.pval(`p-value`, eps = 0.001, digits = 3),
`p-value` = if_else(`p-value` == "NA", NA_character_, `p-value`)) %>%
# group data to automatically and create subheaders
group_by(Variable, Model) %>%
# turn into gt object
gt(caption = "My Caption") %>%
# format NAs that were introduced by coercion to dataframe as blanks
fmt_missing(columns = everything(), missing_text = "") %>%
# format numeric columns to feature 3 decimal places
fmt_number(columns = c("Estimate", "95% CI Lower", "95% CI Upper"),
rows = Measure %in% c("ACME", "ADE", "Total Effect", "Proportion Mediated"),
decimals = 3) %>%
# add border above number of observations
tab_style(
style = cell_borders(side = c("top"), color = "black", weight = px(1)),
locations = cells_body(columns = everything(),
rows = Measure == "N Observations")) %>%
# change formatting of group header text
tab_style(style = list(cell_text(style = "oblique", align = "center")),
locations = cells_row_groups()) %>%
# change formatting of labels
tab_style(style = list(cell_text(weight = "bold", align = "center")),
locations = cells_column_labels()) %>%
# right algin p-value
tab_style(style = list(cell_text(align = "right")),
locations = cells_body(columns = `p-value`)) %>%
# left align stars
tab_style(style = list(cell_text(align = "left")),
locations = cells_body(columns = ` `)) %>%
# add p value note at the bottom
tab_source_note(source_note = md("_*** p < 0.001, ** p < 0.01, * p < 0.05_")) %>%
# adapt font and border formatting
tab_options(
row_group.border.top.color = "black",
row_group.border.bottom.color = "black",
column_labels.border.top.color = "black",
column_labels.border.bottom.color = "black",
table_body.border.top.width = 1,
table_body.border.bottom.width = 1,
table_body.hlines.color = "white",
table.font.size = 12) %>%
opt_table_font(font = "times") %>%
# optional: turn into LaTeX code
as_latex() %>%
as.character() %>%
cat()
At present, summary(mediation(...)) produces output like this:
It would be nice if this could be supplemented by a tidy data frame, such that this would work:
To produce a data frame with these columns:
(these chosen to match those typically produced by broom::tidy)