SegataLab / viromeqc

ViromeQC is a computational tool to benchmark and quantify non-viral contamination in VLP-enriched viromes. ViromeQC provides an enrichment score for each virome. The score is calculated with respect to the expected prokaryotic marker abundances in reference metagenomes.
MIT License

Can I turn off length and quality filtering? #2

Closed ilnamkang closed 3 years ago

ilnamkang commented 3 years ago

Hi,

Can I turn off length and quality filtering?

I'd like to use viromeqc on data that has already been trimmed by other tools such as trimmomatic. In this case, I think filtering by viromeqc may not be necessary.

Further, it seems that the filtering step takes a long time compared to the bowtie2 and diamond steps, because filtering uses only one thread.

Thanks.

Ilnam

azufre451 commented 3 years ago

Hi Ilnam,

The filtering step also counts the number of reads in your input, and at the moment it cannot be skipped. As a workaround, you can set --minlen and --minqual to 0, which should retain all the reads you provide as input.
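As a rough illustration of that workaround, the sketch below builds a ViromeQC command line with both thresholds set to 0. Only `--minlen` and `--minqual` come from this thread; the script name `viromeQC.py`, the `-i`/`-o` flags, the file names, and the helper `build_viromeqc_command` are assumptions made for the example, so check them against the ViromeQC help output before use.

```python
from pathlib import Path


def build_viromeqc_command(fastq, report, minlen=0, minqual=0):
    """Return the argument list for one ViromeQC run.

    With minlen=0 and minqual=0 the filtering step should keep every read,
    as suggested in the reply above.
    """
    return [
        "viromeQC.py",            # assumed script name
        "-i", str(fastq),         # assumed input flag
        "-o", str(report),        # assumed output flag
        "--minlen", str(minlen),  # flag named in this thread
        "--minqual", str(minqual),  # flag named in this thread
    ]


# Hypothetical file names, used only to show the resulting command.
cmd = build_viromeqc_command(Path("sample1.trimmed.fastq"), Path("sample1.report.txt"))
print(" ".join(cmd))
```

The list can be passed directly to `subprocess.run(cmd)` once the flags have been confirmed for the installed version.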

You are right, that step of ViromeQC only uses one thread, but if you have more than one sample you can parallelize the execution with [GNU Parallel](https://www.gnu.org/software/parallel/).
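For anyone who would rather stay in Python than install GNU Parallel, the same idea (one single-threaded ViromeQC process per sample, several samples at once) can be sketched with the standard library. This is a swapped-in alternative, not what was used in this thread; the helper names `run_one` and `run_all` and the `jobs` parameter are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_one(cmd, runner=subprocess.run):
    """Run a single command and return (command, exit code)."""
    result = runner(cmd)
    return cmd, result.returncode


def run_all(commands, jobs=4, runner=subprocess.run):
    """Run up to `jobs` commands concurrently, one external process each.

    Threads are enough here because each thread only waits on its
    external process; results come back in the same order as `commands`.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(lambda c: run_one(c, runner), commands))
```

Each entry of `commands` would be one per-sample argument list, for example a ViromeQC invocation for each FASTQ file; a non-zero exit code in the returned pairs marks a sample that needs a second look.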

In the next release we will try to provide an option to skip the fastq_len_filter step (but this would still require counting the reads in the input file, so it would probably not improve ViromeQC's performance that much).

Best, Moreno

ilnamkang commented 3 years ago

Parallel worked very well.

Thanks for a quick reply and useful suggestion.