bbuchfink / diamond

Accelerated BLAST compatible local sequence aligner.
GNU General Public License v3.0

--cluster-steps parameter set as 'sensitive' in clustering #770

Open AMbioinformatics opened 6 months ago

AMbioinformatics commented 6 months ago

I am trying to run the following command: diamond cluster -d INPUT_FILE -o OUTPUT_FILE --cluster-steps sensitive --member-cover 80 -e 1e-05, but after a while a memory-related error appears, even though the memory is set to 300 GB.

When I run the following command instead: diamond cluster -d INPUT_FILE -o OUTPUT_FILE --member-cover 80 -e 1e-05, everything works fine.

I would like to set the --cluster-steps parameter to 'sensitive'. Am I doing this the correct way? What could be the cause of the error?

bbuchfink commented 5 months ago

How big is your input file? Directly going to --sensitive is very expensive for larger files.

AMbioinformatics commented 5 months ago

@bbuchfink 12 GB

bbuchfink commented 5 months ago

I'm not sure why this would run out of memory. In any case, a sensitive all-vs-all comparison of a file that size will be expensive. Normally you would use cascaded clustering, e.g. --cluster-steps faster_lin fast default sensitive. Or do you specifically intend not to use cascaded clustering?
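
For reference, a possible cascaded invocation that combines the steps suggested above with the flags from the original command might look like this (INPUT_FILE and OUTPUT_FILE are placeholders, as in the original report):

    # Cascaded clustering: run the cheap steps first so the sensitive step
    # only sees the already-reduced set of representatives.
    diamond cluster -d INPUT_FILE -o OUTPUT_FILE \
        --cluster-steps faster_lin fast default sensitive \
        --member-cover 80 -e 1e-05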