geraldinepascal / FROGS

FROGS is a Galaxy/CLI workflow designed to produce an OTU count matrix from high-depth amplicon sequencing data.
GNU General Public License v3.0

"Normalisation by random resampling" running time #67

Closed: Joebio closed this issue 1 year ago

Joebio commented 1 year ago

Hi team!

I'm normalising a BIOM file by random resampling (after the clustering step) using normalisation.py from version 3.2.3, with 4 threads and 8 GB of RAM. It has been running for almost 48 hours, and I am wondering whether there is any way to reduce the running time.

The command I am using is: normalisation.py -n 25659 -i clustering.biom -f clustering.fasta -b normalisation_abundance.biom -o normalisation.fasta -s normalisation.html -l normalisation.log

The BIOM file contains 33 samples, and the largest sample has around 182,736 reads of V3-V4 16S.

TIA. Joe

mariabernard commented 1 year ago

Hi Joe,

Indeed, this tool can take a long time depending on the number of reads and the number of OTUs. Unfortunately, you cannot do anything to speed it up. For information, this tool is not parallelised, so using 4 threads is useless (I guess that is something to work on).

Regards

Maria
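
For readers unfamiliar with the step discussed above: normalisation by random resampling generally means drawing the same number of reads from every sample, without replacement, so that all samples end up at the same depth (the value passed as -n in the command above appears to be that target depth). The following is only a minimal sketch of the general idea in Python and is not how normalisation.py is implemented. The function name subsample_counts and the dict-based input are hypothetical, chosen for illustration, and the code assumes Python 3.9 or later for the counts= argument of random.sample.

```python
import random
from collections import Counter


def subsample_counts(counts, depth, seed=None):
    """Draw `depth` reads without replacement from one sample.

    counts: dict mapping an OTU name to its read count in a single sample
            (hypothetical input format, not the BIOM structure FROGS uses).
    Returns a dict with the same keys and the subsampled counts.
    """
    total = sum(counts.values())
    if depth > total:
        raise ValueError(f"cannot draw {depth} reads from a sample of {total}")
    names = list(counts)
    if depth == 0:
        return {name: 0 for name in names}
    rng = random.Random(seed)
    # With counts=, sample() treats each OTU as repeated `count` times and
    # draws without replacement, without building one list entry per read.
    drawn = rng.sample(names, k=depth, counts=[counts[n] for n in names])
    tally = Counter(drawn)
    return {name: tally.get(name, 0) for name in names}


if __name__ == "__main__":
    # Toy sample with 8 reads spread over three OTUs, reduced to 4 reads.
    sample = {"OTU_1": 5, "OTU_2": 3, "OTU_3": 0}
    print(subsample_counts(sample, depth=4, seed=1))
```

In this sketch each sample is drawn independently of the others, so the cost grows with the number of samples and the target depth. Whether the same holds for normalisation.py depends on its actual implementation, which is not shown in this thread.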

Joebio commented 1 year ago

Okay, I see. Thank you so much.

Best regards, Joe