Arcadia-Science / sourmashconsumr

Working with the outputs of sourmash in R
https://arcadia-science.github.io/sourmashconsumr/

Add a function for an alluvial plot when a time series metagenome was sequenced #37

Closed taylorreiter closed 1 year ago

taylorreiter commented 1 year ago
library(sourmashconsumr)
library(dplyr) # provides %>% and select()

# read in the sourmash taxonomy annotate results for every sample
taxonomy_annotate_df <- read_taxonomy_annotate(Sys.glob("~/github/2022-prjna853785-sourmash/outputs/sourmash_taxonomy/SRR*lineages*csv"))

# build the time data frame: one row per sample with its time point
tmp <- readr::read_csv("https://raw.githubusercontent.com/Arcadia-Science/2022-prjna853785-sourmash/main/inputs/metadata.csv") %>%
  select(query_name = run_accession, time = age_months)

plot_taxonomy_annotate_ts_alluvial(taxonomy_annotate_df, time_df = tmp, tax_glom_level = "genus")

Addresses one piece of #35

Some things that could be improved that I'll make issues for bc I don't see the point in tackling them yet:

taylorreiter commented 1 year ago

One high-level question: in the example plot there is a color for "other" genera. Was this manually defined somewhere beforehand, or does the function assign genera that fall below some % abundance to "other"? A related suggestion is to show only the top X genera/species etc. chosen by the user, as the ampvis2 package does with tax_show (https://kasperskytte.github.io/ampvis2/articles/ampvis2.html), so that plots of complex communities don't become a mess.

Oooooh I had never seen ampvis2, I'll be using that as inspiration!

So how it works right now is it uses a fraction_threshold (by default, 0.01, or 1%) -- if a microbe is present in any of the time series at 1% or greater, it gets an alluvial ribbon in the plot. The user can change the fraction_threshold to anything they want it to be. Anything that does not get an alluvial ribbon gets automatically clumped into "other" via a process implemented in the function.
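To make the thresholding concrete, here is a rough base R sketch of the idea, not the code that's actually in the function. The toy data frame, the column names genus and fraction, and the helper lump_below_threshold() are all made up for illustration; only fraction_threshold and the "other" label come from the description above.

```r
# Hypothetical toy data: one row per (sample, genus) with its fraction of the sample.
abund <- data.frame(
  query_name = c("s1", "s1", "s1", "s2", "s2", "s2"),
  genus      = c("Bacteroides", "Blautia", "Rare",
                 "Bacteroides", "Blautia", "Rare"),
  fraction   = c(0.600, 0.395, 0.005, 0.700, 0.008, 0.002),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of sourmashconsumr): a genus keeps its own
# label if it reaches fraction_threshold in ANY sample; everything else is
# relabeled "other" and summed within each sample.
lump_below_threshold <- function(df, fraction_threshold = 0.01) {
  max_fraction <- tapply(df$fraction, df$genus, max)
  keep <- names(max_fraction)[max_fraction >= fraction_threshold]
  df$genus <- ifelse(df$genus %in% keep, df$genus, "other")
  aggregate(fraction ~ query_name + genus, data = df, FUN = sum)
}

lumped <- lump_below_threshold(abund)
print(lumped)
```

In this toy example Blautia keeps its ribbon in both samples even though it is below 1% in s2, because it is above 1% in s1, while Rare never reaches 1% and ends up in "other".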

I like the idea of tax_show. I'll make an issue and add this as an enhancement -- that way, users can either provide a list of taxa to tax_show or use fraction_threshold.
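A rough sketch of how the two options might sit side by side, assuming tax_show would be a character vector of taxa names. Nothing here exists in sourmashconsumr yet: choose_taxa_to_show(), the toy data, and the column names are hypothetical, and the real enhancement may end up looking different.

```r
# Hypothetical toy data: one row per (sample, genus) with its fraction of the sample.
toy_abund <- data.frame(
  query_name = c("s1", "s1", "s1", "s2", "s2", "s2"),
  genus      = c("Bacteroides", "Blautia", "Rare",
                 "Bacteroides", "Blautia", "Rare"),
  fraction   = c(0.600, 0.395, 0.005, 0.700, 0.008, 0.002),
  stringsAsFactors = FALSE
)

# Hypothetical helper (an assumption, not an existing function): decide which
# taxa get their own ribbon. A user-supplied tax_show wins if given;
# otherwise fall back to fraction_threshold.
choose_taxa_to_show <- function(df, tax_show = NULL, fraction_threshold = 0.01) {
  if (!is.null(tax_show)) {
    # keep only the requested taxa that are actually present in the data
    return(intersect(tax_show, unique(df$genus)))
  }
  max_fraction <- tapply(df$fraction, df$genus, max)
  names(max_fraction)[max_fraction >= fraction_threshold]
}

by_list      <- choose_taxa_to_show(toy_abund, tax_show = c("Blautia", "NotPresent"))
by_threshold <- choose_taxa_to_show(toy_abund)
```

Either way, anything not in the returned vector would be lumped into "other" as it is now.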