jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0
Question on diamond, block and ram usage #814

Closed eperezv closed 2 months ago

eperezv commented 3 months ago

Hi, I'm running SqueezeMeta on a big dataset and I wonder whether DIAMOND (04.rundiamond) is using resources and running as expected. My workstation has 128 GB of RAM and I left the block size as automatic (it was 14.6). I checked resource usage and at some point all the RAM was used and it started to swap heavily (sometimes 80 GB or more). I have reduced the block size to 8 in parameters.pl, but it still uses all the RAM and swaps (up to 30 GB).

Is this expected? With the block size set to 8 it runs, but judging by the blocks in the log, it could take more than a week to finish. Could I make it run faster by reducing the number of threads and increasing the block size? Thank you.

fpusan commented 3 months ago

How many ORFs are in your dataset? This is not the first time I've seen this, and it's hard to fix from our side. The default block size should work on a workstation with 128 GB of RAM (I work on one too), but we've had issues in the past in which the same version of DIAMOND running on the same dataset would consume vastly different amounts of RAM on different computers (maybe because the pre-compiled binary interacts oddly with different versions of the system libraries?). This got better with newer versions of DIAMOND, and I hadn't come across this issue in a while. I don't think lowering the number of threads will help here. You could try replacing the DIAMOND binary we ship with SqueezeMeta with the latest version and see if that fixes it. Otherwise you may need to wait.
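As a rough check on why the default should fit: the DIAMOND manual describes the block size as billions of sequence letters processed at a time, and says the program can be expected to use roughly six times that number in GB of memory. Below is a minimal sketch of that arithmetic. The function names and the `headroom_gb` parameter are illustrative, not part of DIAMOND or SqueezeMeta, and the swapping reported in this thread suggests real usage can exceed this estimate on some machines.

```python
# Rule of thumb from the DIAMOND manual: memory (GB) ~ 6 x block size.
# This is only an approximation; the thread above shows real usage can differ.
GB_PER_BLOCK_UNIT = 6.0


def estimated_ram_gb(block_size: float) -> float:
    """Rough memory estimate in GB for a given DIAMOND block size."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return GB_PER_BLOCK_UNIT * block_size


def max_block_size(ram_gb: float, headroom_gb: float = 0.0) -> float:
    """Largest block size the rule of thumb allows, leaving headroom_gb free.

    headroom_gb is a hypothetical safety margin for the OS and other jobs.
    """
    usable = ram_gb - headroom_gb
    if usable <= 0:
        raise ValueError("headroom_gb must be smaller than ram_gb")
    return usable / GB_PER_BLOCK_UNIT


if __name__ == "__main__":
    # Block sizes mentioned in this thread.
    for block in (14.6, 8, 7):
        print(f"block size {block}: ~{estimated_ram_gb(block):.1f} GB expected")
```

By this estimate a block size of 14.6 would need about 87.6 GB, which is under 128 GB, so heavy swapping at that setting points to the machine-specific behaviour described above and not to the block size alone.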

eperezv commented 3 months ago

Thanks for your quick reply. I have ca. 18M ORFs. I will first try replacing the DIAMOND binary with the version from their GitHub and see whether it helps.

fpusan commented 3 months ago

That's a lot. A week seems like a bit too much, but it's still on the realistic side of things. Let me know how it goes!

eperezv commented 3 months ago

I updated the DIAMOND binary, but the result is the same. With the block size reduced to 7, it uses almost no swap, so I will let it run like this.
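For anyone wanting to check swap use the same way while trying different block sizes, here is a minimal sketch that reads `/proc/meminfo`. It assumes a Linux workstation (the thread does not state the operating system), and the helper names are illustrative. Values in `/proc/meminfo` are reported in kB, which the kernel counts in units of 1024 bytes, so the result is in GiB.

```python
import os
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    info: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Field: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def swap_used_gib(info: Dict[str, int]) -> float:
    """Swap currently in use, in GiB, from SwapTotal and SwapFree (kB)."""
    return (info["SwapTotal"] - info["SwapFree"]) / 1024**2


if __name__ == "__main__":
    path = "/proc/meminfo"  # Linux only; assumed here
    if os.path.exists(path):
        with open(path) as fh:
            info = parse_meminfo(fh.read())
        print(f"swap in use: {swap_used_gib(info):.1f} GiB")
        print(f"RAM available: {info['MemAvailable'] / 1024**2:.1f} GiB")
```

Running it periodically during 04.rundiamond gives a quick view of whether a given block size keeps the job out of swap.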

fpusan commented 2 months ago

I see no new activity, so I'll close this. I hope it managed to run with block size 7.