jtlovell / GENESPACE

Add an option to increase number of threads per `diamond` run #167

Open taprs opened 1 month ago

taprs commented 1 month ago

I noticed that on my HPC the diamond step of orthofinder takes a very long time for the octoploid species, whose ~100k gene models are BLASTed against themselves.

My workaround was (1) manually running the exact diamond command in the orthofinder WorkingDirectory providing more --threads to it and then (2) killing the original diamond processes. Then, everything gets through very quickly and orthofinder does not notice the interception :)
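The thread-count change in step (1) can be sketched as a small helper that rewrites a diamond command string. This is only an illustration: the function name `set_diamond_threads` and the example command are hypothetical, and the exact command orthofinder generates may differ. It relies only on diamond accepting `--threads` (short form `-p`).

```python
import shlex


def set_diamond_threads(command: str, threads: int) -> str:
    """Return `command` with diamond's thread option set to `threads`.

    Handles both `-p N` and `--threads N`; if neither is present,
    `--threads N` is appended. Hypothetical helper, not part of
    GENESPACE or orthofinder.
    """
    tokens = shlex.split(command)
    out = []
    replaced = False
    i = 0
    while i < len(tokens):
        if tokens[i] in ("-p", "--threads") and i + 1 < len(tokens):
            # Drop the old flag and its value, substitute the new count.
            out += ["--threads", str(threads)]
            replaced = True
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    if not replaced:
        out += ["--threads", str(threads)]
    return shlex.join(out)


# Example command shape is an assumption, not copied from orthofinder output.
example = "diamond blastp -d db -q Species0.fa -o Blast0_0.txt -p 1"
new_command = set_diamond_threads(example, 16)
```

The rewritten command would then be run by hand in the orthofinder WorkingDirectory, as described above.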

But I think there might be a nicer way to provide more threads per diamond run. I haven't found one in either GENESPACE or orthofinder. Should I take this to the orthofinder issues instead?

LovellHAGSC commented 2 weeks ago

Thanks for this suggestion. This feature was part of an earlier GENESPACE release (it allowed parallelization within blast runs, instead of only among them). I removed it at v1 to reduce the chance of compute-architecture conflicts. You can definitely run orthofinder separately: set -op, then alter the number of threads in the resulting blast commands and run them yourself. This would save substantial time when running a couple of very large genomes, but will yield little savings once the number of blast jobs significantly exceeds the number of cores.
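A back-of-the-envelope model may help show why the savings shrink as the number of jobs grows. The sketch below is a rough estimate under loudly stated assumptions, not a benchmark: it assumes one blast job per ordered genome pair (n² jobs, including self-vs-self), that every job takes the same time, and that a job speeds up perfectly linearly with its thread count. The function name `relative_time` is hypothetical.

```python
import math


def relative_time(n_genomes: int, cores: int, threads_per_job: int) -> float:
    """Estimated wall time, in units of one single-threaded job.

    Assumptions (illustrative only): n_genomes**2 equal-length jobs,
    ideal linear scaling with threads, no I/O or memory limits.
    """
    n_jobs = n_genomes ** 2
    # How many jobs fit on the machine at once with this thread count.
    concurrent = max(1, cores // threads_per_job)
    rounds = math.ceil(n_jobs / concurrent)
    # Each round takes 1/threads_per_job of a single-threaded job.
    return rounds / threads_per_job


# Two large genomes on 32 cores: 4 jobs leave most cores idle at 1 thread each.
few_1 = relative_time(2, 32, 1)
few_8 = relative_time(2, 32, 8)

# Twenty genomes on 32 cores: 400 jobs already keep every core busy.
many_1 = relative_time(20, 32, 1)
many_8 = relative_time(20, 32, 8)
```

Under these assumptions the two-genome case drops from 1.0 to 0.125 time units with 8 threads per job, while the twenty-genome case only goes from 13.0 to 12.5, because the single-threaded jobs were already saturating the cores.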