david-barnett / microViz

R package for microbiome data visualization and statistics. Uses phyloseq, vegan and the tidyverse. Docker image available.
https://david-barnett.github.io/microViz/
GNU General Public License v3.0

easy way to generate fixed palette for taxa #16

Closed by david-barnett 1 year ago

david-barnett commented 3 years ago

closed in favour of project note

ChrisTrivedi commented 2 years ago

Hi @david-barnett,

First off, I love the microViz package so thanks a ton for sharing with the community!

I'm posting on this closed issue because I think it might be exactly the problem I am having. Please let me know if you'd prefer this to be its own issue and I'd be happy to create one.

I have two datasets that I'm processing the same way but in different R projects. The end goal is to compare the bar plots between the two datasets. The issue is that taxonomy is assigned separately for each dataset, and the most abundant taxa differ slightly between them, so when I use comp_barplot with the same n_taxa value I get different colors for the same taxa across the datasets.


The last thing I want to do is modify the colors manually in Inkscape, so I've tried several workarounds, such as creating a custom palette that I can use with both datasets. The results are ugly, to say the least, because the palette has to be large enough to cover all the ASVs! So far I've done this simply by passing my custom palette to the palette argument in place of distinct_palette(n=15, "brewerPlus").


If there is an easier way to do this while keeping the aesthetic of your brewerPlus palette, that would be ideal. Any help would be greatly appreciated. Thanks in advance!

david-barnett commented 2 years ago

Hi Chris, glad you are finding microViz useful! 😃

For any straightforward solution I can think of, you'll first need to put all your samples into one phyloseq object (if you haven't already).
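A minimal sketch of that first step, assuming the two datasets are phyloseq objects whose taxa names are comparable. Here dietswap is split in two purely to stand in for two separately processed datasets; ps_a, ps_b and combined are hypothetical names. With ASV-level data from separate pipelines, the taxa may need aggregating to a shared rank before merging.

```r
library(phyloseq)
library(microViz)
data(dietswap, package = "microbiome")

# Two stand-in datasets (hypothetical: in practice these come from separate projects)
ps_a <- ps_filter(dietswap, nationality == "AFR")
ps_b <- ps_filter(dietswap, nationality != "AFR")

# merge_phyloseq() combines the OTU tables, sample data and taxonomy tables
combined <- merge_phyloseq(ps_a, ps_b)

# The merged object should contain every sample from both inputs
nsamples(combined)
```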

In some cases, you could use the group_by argument, which will make a list of plots, keeping the ordering and colours the same. With this you can also still use facets to group bars within each plot (seems like your plots have facets currently).

library(microViz)
library(ggplot2)
data(dietswap, package = "microbiome")

dietswap %>%
  ps_filter(timepoint == 1, sex == "male") %>%
  comp_barplot(
    tax_level = "Family", n_taxa = 15,
    bar_outline_colour = NA, bar_width = 0.7, label = NULL,
    group_by = "nationality", facet_by = "bmi_group"
  )
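Since group_by returns a list of plots rather than a single plot, the pieces still need to be arranged. One possible way, sketched here on the assumption that the patchwork package is installed, is to pass the list to wrap_plots() and collect the shared legend. plot_list is a hypothetical name, and facet_by is dropped to keep the sketch short.

```r
library(microViz)
library(patchwork)
data(dietswap, package = "microbiome")

# group_by yields one plot per nationality, with shared taxa order and colours
plot_list <- dietswap %>%
  ps_filter(timepoint == 1, sex == "male") %>%
  comp_barplot(
    tax_level = "Family", n_taxa = 15,
    bar_outline_colour = NA, bar_width = 0.7, label = NULL,
    group_by = "nationality"
  )

# Stack the plots vertically and merge their identical legends into one
wrap_plots(plot_list, ncol = 1, guides = "collect")
```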


If you instead want to separate the ordering of the taxa from the palette assignment itself, and/or have different legend levels per community, I expected that to be a little trickier, but it turns out it's not too bad.

I've defined a couple of functions below that I'll probably add to microViz soonish.

# this makes a named palette vector from your combined dataset (considering overall abundance to assign colours)
tax_palette <- function(data, # phyloseq or ps_extra 
                        rank, # e.g. "Genus"
                        n, # n colours / taxa not including other
                        by = sum, # method for tax_sort
                        pal = "brewerPlus", # palette name from distinct_palette
                        add = c(other = "lightgrey"), # name = value pairs appended to end of output
                        ... # other args passed to tax_sort
                        ) {
  taxa <- tax_top(data = data, rank = rank, n = n, by = by, ...)
  taxColours <- distinct_palette(n = n, pal = pal, add = NA)

  names(taxColours) <- taxa
  taxColours <- c(taxColours, add)
  return(taxColours)
}

# palette viewer function (unnecessary but maybe handy)
tax_palette_plot <- function(
  named_pal_vec # named vector of colours
) {
  stopifnot(!anyNA(names(named_pal_vec))) # all colours need names
  p <- 
    named_pal_vec %>% 
    as.data.frame() %>% 
    tibble::rownames_to_column("taxon") %>% 
    dplyr::mutate(taxon = factor(taxon, levels = rev(taxon))) %>% 
    dplyr::rename("hex" = ".") %>% 
    ggplot2::ggplot(ggplot2::aes(y = taxon, fill = hex, x = "")) +
    ggplot2::geom_raster() +
    ggplot2::scale_fill_identity() + 
    ggplot2::labs(x = NULL, y = NULL) + 
    ggplot2::theme_minimal()
  return(p)
}
myPal <- tax_palette(dietswap, rank = "Genus", pal = "brewerPlus", n = 40)
myPal %>% tax_palette_plot() # just to check the palette


dietswap %>%
  ps_filter(nationality == "AFR", timepoint == 1, sex == "male") %>% # small subset of data
  comp_barplot(
    tax_level = "Genus", n_taxa = 15,
    bar_outline_colour = NA, bar_width = 0.7,
    palette = myPal, label = NULL
  ) 


dietswap %>%
  ps_filter(nationality != "AFR", timepoint == 1, sex == "male") %>% # different subset of data
  comp_barplot(
    tax_level = "Genus", n_taxa = 15,
    bar_outline_colour = NA, bar_width = 0.7,
    palette = myPal, label = NULL
  ) 
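Why a named palette keeps colours stable can be illustrated with plain ggplot2, which matches a named vector of colours to fill levels by name rather than by position. This is only a toy sketch: the data frames, hex codes and the plot_bar helper are made up for illustration, and it is an assumption that comp_barplot relies on the same name matching internally.

```r
library(ggplot2)

# One shared named palette (hypothetical taxa and colours)
pal <- c(Bacteroides = "#1f78b4", Prevotella = "#33a02c", other = "lightgrey")

# Two toy datasets whose sets of taxa differ
df_a <- data.frame(sample = "A", taxon = c("Bacteroides", "other"),
                   abundance = c(0.7, 0.3))
df_b <- data.frame(sample = "B", taxon = c("Prevotella", "Bacteroides", "other"),
                   abundance = c(0.5, 0.2, 0.3))

# Named values are matched to taxa by name, so each taxon keeps its colour
plot_bar <- function(df) {
  ggplot(df, aes(x = sample, y = abundance, fill = taxon)) +
    geom_col() +
    scale_fill_manual(values = pal)
}

p_a <- plot_bar(df_a)
p_b <- plot_bar(df_b)
```

Bacteroides is drawn in the same colour in both plots, even though Prevotella appears only in the second one.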


ChrisTrivedi commented 2 years ago

@david-barnett This is great, thanks so much for the info. My apologies for the delay in response. I'm trying your workaround now and will report back how it works. Thanks again!

david-barnett commented 1 year ago

included since 0.9.1 https://david-barnett.github.io/microViz/news/index.html#microviz-091
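With microViz 0.9.1 or later installed, the helper definitions from earlier in this thread should no longer be needed. A minimal sketch, assuming the packaged tax_palette() and tax_palette_plot() keep the same arguments as the versions posted above:

```r
library(microViz)
data(dietswap, package = "microbiome")

# Assumption: the installed version already ships tax_palette()
stopifnot(packageVersion("microViz") >= "0.9.1")

# Build the fixed palette once, from the full combined dataset
myPal <- tax_palette(
  dietswap, rank = "Genus", n = 40, pal = "brewerPlus",
  add = c(other = "lightgrey")
)
tax_palette_plot(myPal) # optional visual check of the palette

# Reuse the same palette for any subset of the data
dietswap %>%
  ps_filter(nationality == "AFR", timepoint == 1, sex == "male") %>%
  comp_barplot(
    tax_level = "Genus", n_taxa = 15,
    bar_outline_colour = NA, bar_width = 0.7,
    palette = myPal, label = NULL
  )
```

The name given in add needs to match whatever label comp_barplot uses for the lumped remainder, which may differ between microViz versions.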

ChrisTrivedi commented 1 year ago


@david-barnett my apologies for never getting back to you on how this worked for me. It was a resounding success and I want to thank you again. Also, cheers for adding it in the new update. To show my appreciation I also gave you a nod in the acknowledgments in our recent pub that used microViz - I hope that's okay with you. Thanks again!

david-barnett commented 1 year ago

thanks Chris! glad it worked for you and I appreciate the acknowledgement :)