borenstein-lab / burrito

A visualization tool for exploratory data analysis of metagenomic data
https://elbo-spice.gs.washington.edu/shiny/burrito/
GNU General Public License v3.0

Columns missing from PICRUSt output for BURRITO's functional input #13

Closed by alisongomeiz 3 years ago

alisongomeiz commented 3 years ago

Hi there,

I am trying to import PICRUSt output to use as the "Taxonomy-function linking method: Function attribution table" in the BURRITO web server. The "Table of functional attributions for each taxon" section of your website says that output from PICRUSt's metagenome_pipeline.py script can be used, but the example file has more columns than my PICRUSt output contains. I am using the "legacy" formatted contribution (stratified) table as my input functional table file, which is what the PICRUSt documentation recommends for BURRITO. These are the columns in the output from running the convert_table.py script:

Gene Sample OTU OTUAbundanceInSample GeneCountPerGenome CountContributedByOTU

This is the input that BURRITO requires:

Gene Sample OTU OTUAbundanceInSample GeneCountPerGenome CountContributedByOTU ContributionPercentOfSample ContributionPercentOfAllSamples

Thus, I am missing the "ContributionPercentOfSample" and "ContributionPercentOfAllSamples" columns. Is there a way to add these values manually, or otherwise generate them, so that the file fits the input requirements?

engal commented 3 years ago

Hi,

This shouldn't be an issue; the file you have should work fine.

For a bit more detail, the example file in the BURRITO documentation is based on a more detailed version of the stratified output from the original PICRUSt. However, BURRITO only actually uses four of the provided columns, specifically the 'Gene', 'Sample', 'OTU', and 'CountContributedByOTU' columns from that table. This means that the smaller 6-column output you generated should contain the necessary information to run BURRITO.
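To illustrate the point above, here is a minimal Python sketch for checking a table locally before uploading it. It assumes the file is tab-delimited; the function names `missing_required_columns` and `keep_required_columns` are hypothetical helpers written for this example and are not part of BURRITO or PICRUSt. Only the four column names come from this thread.

```python
import csv
import io

# The four columns that, per the explanation above, BURRITO actually reads.
REQUIRED_COLUMNS = ["Gene", "Sample", "OTU", "CountContributedByOTU"]


def missing_required_columns(header):
    """Return the required columns absent from a header row, in order."""
    present = set(header)
    return [name for name in REQUIRED_COLUMNS if name not in present]


def keep_required_columns(table_text, delimiter="\t"):
    """Return the table text reduced to the four required columns.

    Hypothetical helper: the tab delimiter is an assumption about the
    file format, not something confirmed in this thread.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter=delimiter)
    missing = missing_required_columns(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=REQUIRED_COLUMNS,
        delimiter=delimiter,
        extrasaction="ignore",  # silently drop any column not listed above
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

Trimming the table is optional, since the 6-column file should already be accepted as-is; the check is mainly useful for confirming that none of the four required columns is missing or misspelled.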

Hope that helps!

alisongomeiz commented 3 years ago

Great, thank you for the clarification! It was processing my data just fine with the six columns, but I wasn't completely sure if the output would be correct since my input format didn't match BURRITO's documentation. Thanks again!