statdivlab / corncob

Count Regression for Correlated Observations with the Beta-binomial
103 stars 22 forks source link

plot options- dotplot #147

Closed AnnaAntonatouPap closed 1 year ago

AnnaAntonatouPap commented 1 year ago

Hello,

I would like to ask maybe a rather easy question. How I could extract the appropriate info to create a dotplot rather the default plot of the package

Thanks in advance, Anna

aemann01 commented 1 year ago

I wanted to do the same thing and ended up modifying the code directly from plot_differentialTest.R --> see below

ps.dat should be changed to your phyloseq object, "V8" to your taxonomic level of interest, the df object after the loop is a dataframe with all the info needed to create your own plot.

library(ggplot2)
library(phyloseq)
library(corncob)
library(magrittr)

# glom to species level, only doing incisors/premolars for testing purposes
ps.dat.sp <- ps.dat %>%
                phyloseq::subset_samples(tooth_type == "incisor" | tooth_type == "premolar") %>%
                tax_glom("V8")
ps.dat.sp

da_analysis <- differentialTest(formula = ~ tooth_type,
                               phi.formula = ~ 1,
                               formula_null = ~ 1,
                               phi.formula_null = ~ 1,
                               test = "Wald",
                               boot = FALSE,
                               data = ps.dat.sp,
                               fdr_cutoff = 0.01)

save.image("test.R")

level <- c("V8")
x <- da_analysis

signif_taxa <- otu_to_taxonomy(x$significant_taxa, x$data, level = level)
signif_taxa <- paste0(signif_taxa, " (", x$significant_taxa, ")") # NOTE: ASV label no longer informative after glom

var_per_mod <- length(x$restrictions_DA) + length(x$restrictions_DV)
total_var_count <- length(signif_taxa) * var_per_mod
# initialize empty dataframe 
df <- as.data.frame(matrix(NA, nrow = total_var_count, ncol = 5))
colnames(df) <- c("x", "xmin", "xmax", "taxa", "variable")

qval <- stats::qnorm(.975)
restricts_mu <- attr(x$restrictions_DA, "index")
restricts_phi <- attr(x$restrictions_DV, "index")

# loop through models to populate dataframe
count <- 1
for (i in 1:length(x$significant_models)) {
  # Below from print_summary_bbdml, just to get coefficient names
  tmp <- x$significant_models[[i]]
  coefs.mu <- tmp$coefficients[1:tmp$np.mu,, drop = FALSE]
  rownames(coefs.mu) <- paste0(substring(rownames(coefs.mu), 4), " Differential Abundance")
  coefs.mu <- coefs.mu[restricts_mu,, drop = FALSE]

  coefs.phi <- tmp$coefficients[(tmp$np.mu + 1):nrow(tmp$coefficients),, drop = FALSE]
  rownames(coefs.phi) <- paste0(substring(rownames(coefs.phi), 5), " Differential Variability")
  coefs.phi <- coefs.phi[restricts_phi - tmp$np.mu,, drop = FALSE]

  coefs <- rbind(coefs.mu, coefs.phi)
  for (j in 1:var_per_mod) {
    df[count, 1:3] <- c(coefs[j, 1], coefs[j, 1] - qval * coefs[j, 2],
                      coefs[j, 1] + qval * coefs[j, 2])
    df[count, 4:5] <- c(signif_taxa[i], rownames(coefs)[j])
    count <- count + 1
    }
}

head(df)

pdf("test.pdf")
ggplot(df, aes(x = x, y = taxa)) +
  geom_vline(xintercept = 0, color = "gray50", lty = "dashed",
                          alpha = 0.75, lwd = 1) +
  geom_point() +
  geom_errorbarh(aes(xmin = xmin, xmax = xmax, colour = xmin<=0 & xmax <= 0| xmin>=0 & xmax >= 0), height = .3) +
  theme_bw() +
  facet_wrap(~variable, scales = "free_x", nrow = 1) +
  labs(title = "", x = "", y = "Taxa") +
  scale_y_discrete(limits = rev(df$taxa)) +
  scale_x_continuous(breaks = scales::pretty_breaks(n = 5)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1), legend.position="none") # turned off legend
dev.off()
AnnaAntonatouPap commented 1 year ago

Wow many many thanks for that

adw96 commented 1 year ago

Many thanks @aemann01 for your fantastic example. Closing this issue given that example code has been provided 🙏