taylorreiter closed 1 year ago
One high level question: In the example plot given there is a color for "other" genera - was this manually defined somewhere before, or does the function assign genera that are below some % abundance as just "other"? A related suggestion is to show only the top X genera/species etc. provided by the user, as the ampvis2 package does with `tax_show` (https://kasperskytte.github.io/ampvis2/articles/ampvis2.html), so that for complex communities this doesn't become a mess.
Oooooh I had never seen ampvis2, I'll be using that as inspiration!
So how it works right now is it uses a `fraction_threshold` (by default, 0.01, or 1%) -- if a microbe is present in any of the time series at 1% or greater, it gets an alluvial ribbon in the plot. The user can change the `fraction_threshold` to anything they want it to be. Anything that does not get an alluvial ribbon gets automatically clumped into "other" via a process implemented in the function.
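To make the thresholding rule concrete, here is a minimal sketch in Python. It is not the package's code: the function name `lump_low_abundance`, the nested-dict input, and the example taxa and fractions are all made up for illustration. Only the rule itself (a taxon keeps its own ribbon if it reaches `fraction_threshold` in any sample, everything else is summed into "other") comes from the description above.

```python
def lump_low_abundance(samples, fraction_threshold=0.01):
    """Keep taxa that reach the threshold in any sample; sum the rest into 'other'.

    `samples` maps a sample name to a {taxon: fraction} dict (hypothetical layout).
    """
    # A taxon earns its own ribbon if it reaches the threshold in at least one sample.
    kept = {
        taxon
        for fractions in samples.values()
        for taxon, fraction in fractions.items()
        if fraction >= fraction_threshold
    }
    lumped = {}
    for name, fractions in samples.items():
        # Kept taxa are carried over as-is, even in samples where they fall below the threshold.
        row = {taxon: f for taxon, f in fractions.items() if taxon in kept}
        # Everything else is pooled into a single "other" category.
        other = sum(f for taxon, f in fractions.items() if taxon not in kept)
        if other > 0:
            row["other"] = other
        lumped[name] = row
    return lumped


# Made-up example data: Rothia reaches 2% on day_2, Dialister never reaches 1%.
samples = {
    "day_1": {"Bacteroides": 0.60, "Prevotella": 0.395, "Rothia": 0.005},
    "day_2": {"Bacteroides": 0.50, "Prevotella": 0.48, "Rothia": 0.02},
    "day_3": {"Bacteroides": 0.70, "Prevotella": 0.292, "Dialister": 0.008},
}
result = lump_low_abundance(samples)
```

In this toy data, Rothia keeps its own ribbon in every sample because it crosses 1% on day_2, while Dialister only ever shows up inside "other".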
I like the idea of `tax_show`. I'll make an issue and add this as an enhancement -- that way, users can either provide a list of taxa to `tax_show` or use `fraction_threshold`.
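One way the either/or behavior could look, sketched in Python under assumptions: the helper name `choose_taxa`, the rule that an explicit `tax_show` list takes precedence over `fraction_threshold`, and the example data are all hypothetical, since the enhancement has not been designed yet.

```python
def choose_taxa(samples, tax_show=None, fraction_threshold=0.01):
    """Return the set of taxa that would get their own ribbon.

    Assumption: an explicit `tax_show` list overrides `fraction_threshold`.
    """
    if tax_show is not None:
        # Only keep requested taxa that actually occur in the data.
        present = {taxon for fractions in samples.values() for taxon in fractions}
        return {taxon for taxon in tax_show if taxon in present}
    # Fall back to the threshold rule: at or above the threshold in any sample.
    return {
        taxon
        for fractions in samples.values()
        for taxon, fraction in fractions.items()
        if fraction >= fraction_threshold
    }


# Made-up example data for illustration only.
example = {
    "day_1": {"Bacteroides": 0.60, "Prevotella": 0.395, "Rothia": 0.005},
    "day_2": {"Bacteroides": 0.55, "Prevotella": 0.445, "Rothia": 0.005},
}
by_threshold = choose_taxa(example)
by_list = choose_taxa(example, tax_show=["Rothia", "Dialister"])
```

With the default threshold, Rothia would be lumped into "other"; passing it in `tax_show` keeps it visible, and the requested Dialister is silently skipped because it is absent from the data. Whether a missing taxon should instead raise a warning is an open design choice.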
Addresses one piece of #35
Some things that could be improved that I'll make issues for bc I don't see the point in tackling them yet:
- `n_unique_kmers`, `orf_unique_to_query`. I just made an `if` statement to control the function and what's returned. If I add another var later, I'll make this more sophisticated so the code chunks aren't copy and pasted, but I think it's good enough for now, and is simpler to read right now.
- `time_df` needs to have hard coded column names that are `query_name` and `time`. I could make this more flexible, but I documented the behavior and provided hints at runtime for the user, so I think that's good enough for now as this strategy dramatically simplifies the code.
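For illustration, a runtime hint for the hard coded `time_df` column names could look roughly like the Python sketch below. The helper name `check_time_df_columns` and the exact wording of the message are assumptions, not the package's actual check; only the required column names `query_name` and `time` come from the description above.

```python
REQUIRED_COLUMNS = ("query_name", "time")


def check_time_df_columns(columns):
    """Raise a descriptive error if any required column name is missing."""
    missing = [name for name in REQUIRED_COLUMNS if name not in columns]
    if missing:
        # Tell the user exactly what is expected and what was found.
        raise ValueError(
            "time_df must have columns named 'query_name' and 'time'; "
            f"missing: {', '.join(missing)}. Found: {', '.join(columns)}."
        )


# Correctly named columns pass silently.
check_time_df_columns(["query_name", "time"])

# A differently named column triggers the hint.
try:
    check_time_df_columns(["sample", "time"])
except ValueError as err:
    message = str(err)
```

Failing early with a message that names the expected columns keeps the plotting code simple while still pointing the user at the fix.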