NorwegianVeterinaryInstitute / Talos

A shotgun metagenomic analysis pipeline using nextflow
BSD 3-Clause "New" or "Revised" License

nonpareil takes too long when using RAW data #38

Closed Thomieh73 closed 4 years ago

Thomieh73 commented 4 years ago

nonpareil takes far too long when run on raw data while subsampling 1/10th of a normal-sized FASTQ dataset of, say, 20 to 30 million reads. Would it be enough to set the subsampling to 1/100th of that dataset and the repeated sampling to 1/1000th of the data?

Or should I just remove this step from the first script, since it is better to calculate it from the cleaned data?
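
For scale, here is a rough sketch of the read counts those fractions imply. It is plain Python and not part of the pipeline; the helper name `reads_after_subsampling` is hypothetical, and it assumes each fraction applies directly to the total read count. It says nothing about how nonpareil's runtime scales with those counts.

```python
def reads_after_subsampling(total_reads: int, denominator: int) -> int:
    """Number of reads kept when taking 1/denominator of the dataset.

    Integer division is used so the result is an exact read count.
    """
    return total_reads // denominator


if __name__ == "__main__":
    # Dataset sizes mentioned in the issue: 20 and 30 million reads.
    for total in (20_000_000, 30_000_000):
        # Fractions discussed: 1/10 (current), 1/100 and 1/1000 (proposed).
        for denom in (10, 100, 1000):
            kept = reads_after_subsampling(total, denom)
            print(f"{total:>11,} reads at 1/{denom:<4} -> {kept:>9,} reads")
```

Going from 1/10th to 1/100th reduces the number of reads handed to the tool by a factor of ten, and 1/1000th by a factor of one hundred.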

Thomieh73 commented 4 years ago

I removed the processes run_coverage and plot_coverage from the script 01_run_quality_check.nf, since these processes take too long on the raw data. It is better to just run them on the cleaned data.

This can be closed.