fbreitwieser / pavian

🌈 Interactive analysis of metagenomics data
https://doi.org/10.1093/bioinformatics/btz715

Customisable summary table #93

Open kdm9 opened 2 years ago

kdm9 commented 2 years ago

Hello,

Thanks for Pavian, it's very useful. But, for us plant microbiologists, it could be better still!

In the summary table, the Microbial column is a catch-all: it holds every classified read that is not assigned to vertebrates, artificial sequences, etc. For shotgun plant metagenomes, that unfortunately includes all plant reads, so >95% of classified reads are reported as "Microbial", and a variable share of those are plant reads rather than actual microbes. I'd like to be able to add a column to the left of "Microbial" for p_Streptophyta (= plants), and have these reads subtracted from the Microbial column.

I've hacked this together for my install by changing the data.table creation code for the summary table, and I would be happy to submit my change as a PR. However, others will likely want to mark their own key eukaryotic organism(s) as non-microbial. A better approach might therefore be to ask the user for taxa (as strings like p_Streptophyta) for the "left-hand" columns, and then keep the bacterial/protist/viral/misc microbial columns as they are now, but subtract all "left-hand" columns from Microbial, and perhaps rename it "other taxa".
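
To make the proposed arithmetic concrete, here is a minimal sketch in Python. It is only an illustration, not Pavian's actual R/data.table code: the function name summary_row, its parameters, and the example read counts are all hypothetical. It assumes that a clade-level read count is available for each user-supplied taxon, and that none of the supplied taxa is nested inside another (otherwise reads would be subtracted twice).

```python
from typing import Dict, List


def summary_row(
    classified_reads: int,
    clade_reads: Dict[str, int],
    left_hand_taxa: List[str],
) -> Dict[str, int]:
    """Build one summary-table row (hypothetical helper, not Pavian's API).

    classified_reads: total number of classified reads in the sample.
    clade_reads:      clade-level read counts keyed by taxon string,
                      e.g. "p_Streptophyta".
    left_hand_taxa:   user-chosen taxa shown as their own columns and
                      removed from the catch-all column.
    """
    row = {"classified": classified_reads}
    subtracted = 0
    for taxon in left_hand_taxa:
        # A taxon missing from the report contributes zero reads.
        reads = clade_reads.get(taxon, 0)
        row[taxon] = reads
        subtracted += reads
    # The catch-all column is whatever remains after the left-hand columns.
    row["other taxa"] = classified_reads - subtracted
    return row


# Made-up counts for a plant shotgun metagenome.
example = summary_row(
    classified_reads=1000,
    clade_reads={"p_Chordata": 20, "p_Streptophyta": 930, "d_Bacteria": 40},
    left_hand_taxa=["p_Chordata", "p_Streptophyta"],
)
print(example)
```

With these made-up counts, the catch-all column drops from 980 reads (everything except p_Chordata) to 50 once p_Streptophyta gets its own column. The bacterial/protist/viral breakdown is left out of the sketch because it would stay unchanged.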

Best, Kevin