--maxNumReads set to more than 50000

leakrema commented 7 months ago

Hi, I set the parameter --maxNumReads to 200,000, but I noticed that the process has become quite slow. When I checked the log file, I observed that the pipeline is progressing, but I encountered errors at the assignTaxonomy step.

I'm wondering if it's advisable to set this parameter to more than 50,000 reads. Should I simply wait longer for processing when dealing with a larger number of reads, or is there a recommended maximum value for --maxNumReads that I should adhere to?

Thank you in advance :)

The log file (.nextflow.log) is attached.

MaestSi commented 7 months ago

Hi, the maximum number of reads per sample you should analyse depends strongly on your computational infrastructure (amount of RAM available, number of CPUs, ...), the number of samples in the study, the size of the database, and the maximum running time you are willing to accept for the analysis. Personally, I think that in the large majority of cases a few tens of thousands of reads per sample are enough to capture most of the information, unless you are looking for very rare taxa. SM
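To make the "very rare taxa" caveat concrete, here is a rough back-of-the-envelope sketch. It is not part of the pipeline. It assumes that reads are sampled independently and that a taxon with relative abundance p is detected as soon as at least one read comes from it, so the detection chance at depth n is 1 - (1 - p)^n. The function name `detection_probability` and the example abundances are illustrative assumptions, and classification errors and database gaps are ignored.

```python
def detection_probability(rel_abundance: float, num_reads: int) -> float:
    """Chance that at least one of num_reads reads comes from a taxon
    with the given relative abundance (as a fraction of all reads).

    Simplifying assumption: every read is an independent draw.
    """
    if not 0.0 <= rel_abundance <= 1.0:
        raise ValueError("rel_abundance must be between 0 and 1")
    if num_reads < 0:
        raise ValueError("num_reads must be non-negative")
    # Probability of missing the taxon on every read, then the complement.
    return 1.0 - (1.0 - rel_abundance) ** num_reads


if __name__ == "__main__":
    # Hypothetical abundances: 0.1%, 0.01% and 0.001% of the community.
    for abundance in (1e-3, 1e-4, 1e-5):
        for depth in (50_000, 200_000):
            prob = detection_probability(abundance, depth)
            print(f"abundance={abundance:.0e} reads={depth:>7} "
                  f"P(detected)={prob:.3f}")
```

Under these assumptions, a taxon at 0.01% abundance is detected with more than 99% probability at 50,000 reads already, while one at 0.001% goes from roughly 39% at 50,000 reads to roughly 86% at 200,000. So the extra depth mainly matters for the rarest taxa.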

leakrema commented 7 months ago

Okay, thanks a lot!

MaestSi / MetONTIIME

--maxNumReads set to more than 50000 #97