Combining the concatenated (unassembled) forward and reverse reads with the assembled reads

transcript / samsa2

SAMSA pipeline, version 2.0. An open-source metatranscriptomics pipeline for analyzing microbiome data, built around DIAMOND and customizable reference databases.

GNU General Public License v3.0


Hi Mona,

In the standard workflow setup, only the merged (assembled) reads move forward past Step 2 in master_script.sh. Any forward or reverse reads that PEAR does not merge into a single read are discarded and aren't used at any later point.

If you're concerned about not having enough merged reads from PEAR's output, you can append the unmerged forward reads to the merged reads file (Unix cat is an easy way to do so), and use this combined merged + unmerged-forward set in Step 3 and onward.
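As a rough sketch of the cat approach: the helper name combine_reads and the file names in the usage line are hypothetical. PEAR typically writes files ending in .assembled.fastq and .unassembled.forward.fastq, but check what your own run of master_script.sh actually produces. The sketch assumes uncompressed FASTQ with four lines per read.

```shell
#!/bin/sh
# Hypothetical helper (not part of SAMSA2): append PEAR's unmerged forward
# reads to the merged reads and report how many reads the result contains.
# Arguments: merged FASTQ, unmerged forward FASTQ, output FASTQ.
combine_reads() {
    merged="$1"
    unmerged_fwd="$2"
    combined="$3"

    # FASTQ is plain text with one record after another, so simple
    # concatenation gives a valid FASTQ file.
    cat "$merged" "$unmerged_fwd" > "$combined"

    # Assumes 4 lines per read (no wrapped sequence lines).
    lines=$(wc -l < "$combined")
    echo "$((lines / 4))"
}
```

For example, `combine_reads sample.assembled.fastq sample.unassembled.forward.fastq sample.combined.fastq` would print the read count of the combined file, which you would then feed into Step 3 in place of the merged-only file.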

In general, I see PEAR merge 45-65% of the initial (unmerged) reads that go into the pipeline. If this lowers your read count too much, it's worth considering adding the unmerged forward reads, although keep in mind that this may reduce the accuracy of annotations.
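To see where your own data falls relative to that range, one option is to compare read counts before and after PEAR. This is only a sketch: merge_rate is a hypothetical helper, the inputs are assumed to be uncompressed FASTQ files with four lines per read, and the raw forward file is used as the count of read pairs going in.

```shell
#!/bin/sh
# Hypothetical helper (not part of SAMSA2): print the percentage of input
# read pairs that PEAR merged, rounded down to a whole number.
# Arguments: raw forward FASTQ (one read per pair), PEAR merged FASTQ.
merge_rate() {
    raw_forward="$1"
    assembled="$2"

    # Assumes 4 lines per read in both files.
    total=$(( $(wc -l < "$raw_forward") / 4 ))
    merged=$(( $(wc -l < "$assembled") / 4 ))

    if [ "$total" -eq 0 ]; then
        echo "no reads found in $raw_forward" >&2
        return 1
    fi

    # Integer arithmetic only, so the result is truncated.
    echo "$(( 100 * merged / total ))"
}
```

A call such as `merge_rate sample_R1.fastq sample.assembled.fastq` (file names are placeholders) prints a single integer percentage.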

-Sam

Combining the concatenated (unassembled) forward and reverse reads with the assembled reads #51