ababaian / serratus

Ultra-deep search for novel viruses
http://serratus.io
GNU General Public License v3.0

Read QC before assembly #102

Closed rcedgar closed 4 years ago

rcedgar commented 4 years ago

From Graham Ruby, author of the PRICE assembler:

"[I highly recommend using] 'PriceSeqFilter': PRICE Documentation: Independent Quality Filter. Truth be told, many of the citations I get for PRICE are actually just for the use of this tool: even when the assembler isn't needed, filtering the reads like this is still useful for many genomics projects. I'll recommend settings that Joe's lab has used after I left: '-rnf 90' and '-rqf 85 0.98' [ref. Illuminating uveitis: metagenomic deep sequencing identifies common and rare pathogens]."

Links:

http://derisilab.ucsf.edu/software/price/

http://derisilab.ucsf.edu/software/price/PriceDocumentation140408/userManual.html
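For anyone unsure what the quoted settings mean in practice, here is a rough Python sketch of the filtering logic as I read the PRICE documentation: "-rnf 90" keeps a read only if at least 90% of its bases are called (not N), and "-rqf 85 0.98" keeps it only if at least 85% of its bases have a probability of being correct of at least 0.98; a pair is dropped if either mate fails. This is an illustration of the thresholds, not PriceSeqFilter's actual code, and the function names are hypothetical.

```python
def phred_to_p_correct(q):
    """Convert a Phred quality score to the probability the base call is correct."""
    return 1.0 - 10 ** (-q / 10.0)


def read_passes(seq, quals, min_called_pct=90.0, min_hq_pct=85.0, min_p_correct=0.98):
    """Return True if a single read passes both filters.

    seq   -- nucleotide string
    quals -- list of integer Phred scores, one per base
    Defaults mirror '-rnf 90' and '-rqf 85 0.98' (my interpretation of the flags).
    """
    n = len(seq)
    if n == 0 or n != len(quals):
        return False
    # Ambiguity filter: percentage of bases that are not N.
    called = sum(1 for base in seq if base.upper() != "N")
    if 100.0 * called / n < min_called_pct:
        return False
    # Quality filter: percentage of bases whose P(correct) meets the threshold.
    high_quality = sum(1 for q in quals if phred_to_p_correct(q) >= min_p_correct)
    return 100.0 * high_quality / n >= min_hq_pct


def pair_passes(read1, read2):
    """A pair is kept only if both mates pass; each read is a (seq, quals) tuple."""
    return read_passes(*read1) and read_passes(*read2)
```

With these thresholds a base needs roughly Q17 or better to count as high quality, since 1 - 10^(-1.7) is just above 0.98.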

I suggest that assembly protocol developers try this and post feedback, results, or other QC ideas under this issue.

asl commented 4 years ago

A few notes:

bbduk (from the bbmap package, https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/bbduk-guide/) is quite nice: it has all the necessary options and knobs, is multi-threaded, and is used by JGI. It could be used for masking / removing host bits as well.

Something like:

bbduk.sh trimpolya=15 qtrim=rl trimq=10 in=<left> in2=<right> out=<out-left> out2=<out-right> threads=<N>

would certainly work.
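If this ends up being called from a pipeline script, a small wrapper along these lines might be convenient. It only assembles the same options as the command above; the function name and the file names in the usage lines are placeholders I made up, and it assumes bbduk.sh is on the PATH when the command is actually run.

```python
import shlex


def bbduk_command(left, right, out_left, out_right, threads, trimpolya=15, trimq=10):
    """Build the bbduk.sh argument list for paired-end trimming.

    Uses only the options from the command above: poly-A trimming,
    quality trimming on both ends (qtrim=rl), and a thread count.
    """
    return [
        "bbduk.sh",
        f"trimpolya={trimpolya}",
        "qtrim=rl",
        f"trimq={trimq}",
        f"in={left}",
        f"in2={right}",
        f"out={out_left}",
        f"out2={out_right}",
        f"threads={threads}",
    ]


# Hypothetical file names, for illustration only.
cmd = bbduk_command("sample_1.fastq.gz", "sample_2.fastq.gz",
                    "trimmed_1.fastq.gz", "trimmed_2.fastq.gz", threads=8)
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) to execute it; keeping it as a list avoids shell quoting problems with unusual file names.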

asl commented 4 years ago

Here is the set of JGI protocols: https://www.protocols.io/view/illumina-fastq-filtering-gydbxs6

ababaian commented 4 years ago

Closed by Assembly Pipeline