broadinstitute / viral-ngs

Viral genomics analysis pipelines

kneaddata for quality control? #620

Open biocyberman opened 7 years ago

biocyberman commented 7 years ago

I've used kneaddata for some time. It is quite robust and feature-rich, and I find it more flexible than viral-ngs's current handling of quality control. Maybe viral-ngs could borrow ideas from kneaddata or use it directly: https://bitbucket.org/biobakery/kneaddata/wiki/Home

yesimon commented 7 years ago

For the most part it looks similar to our workflow.

For viral assembly, bmtagger is used for depleting samples of human reads.

For metagenomics, the full kraken database includes the full human genome for classification. For bwa alignment, the database similarly includes SILVA rRNAs and the human genome. Reads matching these can then be excluded afterwards.
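To illustrate the "exclude afterwards" step, here is a minimal Python sketch, not viral-ngs code. It assumes plain 4-line FASTQ records and that the IDs of the reads to drop (for example, those classified as human or rRNA) have already been collected into a set. The names `parse_fastq` and `exclude_reads` are hypothetical, chosen only for this example.

```python
from typing import Iterable, Iterator, Set, Tuple

# (read_id, sequence, quality) for one FASTQ record
FastqRecord = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from 4-line FASTQ text (assumes no wrapped sequences)."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        # The read ID is the first whitespace-delimited token after '@'
        read_id = header[1:].split()[0]
        yield read_id, seq, qual


def exclude_reads(
    records: Iterable[FastqRecord], unwanted_ids: Set[str]
) -> Iterator[FastqRecord]:
    """Keep only records whose ID is not in unwanted_ids."""
    for record in records:
        if record[0] not in unwanted_ids:
            yield record


if __name__ == "__main__":
    fastq = [
        "@r1 sample\n", "ACGT\n", "+\n", "IIII\n",
        "@r2\n", "GGCC\n", "+\n", "IIII\n",
    ]
    # Hypothetical: r1 was classified as human and should be removed
    for read_id, seq, qual in exclude_reads(parse_fastq(fastq), {"r1"}):
        print(f"@{read_id}\n{seq}\n+\n{qual}")
```

For paired-end data, the same ID set would need to be applied to both mate files so that the pairs stay in sync; that is left out of this sketch.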

biocyberman commented 7 years ago

Notable features to me:

  1. It can download its supporting databases by itself.
  2. It allows choosing between bowtie and bmtagger for filtration.
  3. It has a separate trimming (trimmomatic) step, with an option to include or omit trimming during read filtration.

IMHO, in terms of implementation, they are one step ahead of viral-ngs on these features. That is why I suggested some adaptation, especially since you also see that the workflows are similar.