transcript / samsa2

SAMSA pipeline, version 2.0. An open-source metatranscriptomics pipeline for analyzing microbiome data, built around DIAMOND and customizable reference databases.
GNU General Public License v3.0

Estimate microbial diversity with rRNA #42

Closed Shicheng-Guo closed 4 years ago

Shicheng-Guo commented 4 years ago

Dear Sam,

As I recall, in metagenome research we can use 16S rRNA to estimate microbial diversity. I am wondering why this is not offered in the samsa2 pipeline, since in its default mode samsa2 removes all rRNAs.

Thanks.

Shicheng

transcript commented 4 years ago

Hi Shicheng,

One issue with including 16S rRNA is that it provides only limited taxonomic resolution for identifying organisms. With this pipeline, the goal is to identify organisms down to the genus and/or species level, which is where 16S rRNA struggles.

Another challenge is that anyone doing metatranscriptome research is likely to run an rRNA depletion step in their wet-lab protocol. rRNA depletion doesn't target all 16S sequences equally, so the 16S results will be skewed by that step.
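
As a toy illustration of that skew, here is a minimal Python sketch. The taxon names, read counts, and depletion efficiencies are made-up assumptions chosen for the arithmetic, not measured values and not part of samsa2.

```python
def relative_abundance(counts):
    """Convert raw counts per taxon into fractions of the total."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def deplete(counts, efficiency):
    """Remove a taxon-specific fraction of 16S rRNA reads.

    efficiency[taxon] is the (hypothetical) fraction of that taxon's
    rRNA removed by the wet-lab depletion step.
    """
    return {taxon: n * (1.0 - efficiency[taxon]) for taxon, n in counts.items()}


# Hypothetical community: two taxa with equal 16S rRNA before depletion.
true_counts = {"taxon_A": 500, "taxon_B": 500}

# Hypothetical efficiencies: the depletion step removes taxon_A's rRNA
# more effectively than taxon_B's.
efficiency = {"taxon_A": 0.99, "taxon_B": 0.90}

before = relative_abundance(true_counts)
after = relative_abundance(deplete(true_counts, efficiency))

print("before depletion:", before)
print("after depletion: ", after)
```

Under these assumed numbers, two taxa that start at an even 50/50 split end up at roughly 9% and 91% of the surviving 16S reads, so a diversity estimate built on the leftover rRNA would reflect the depletion step more than the community itself.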

For these reasons, the SortMeRNA step removes rRNA sequences by default. You are always free to customize the pipeline to skip that step and include rRNA results if you prefer, but doing so is unlikely to add information to the metatranscriptome activity profiles.

Best, Sam