jasonsahl / LS-BSR

Large scale Blast Score Ratio (BSR) analysis
GNU General Public License v3.0

Inconsistent interpretation of "processors" argument #10

Closed ar0ch closed 8 years ago

ar0ch commented 8 years ago

LS-BSR interprets the processors argument in two different ways: as the number of threads an external program should use, and as the number of parallel processes to launch.

processors used as the number of threads a program should use, e.g. for blast searches:

"-num_threads", str(processors),

And as the number of parallel processes to run:

num_workers=processors))

This means that if we want to cluster using 12 threads in vsearch, we also wind up spawning 12 blast processes, each with -num_threads set to 12 (a nominal 144 cores required). On a system with 12 threads available, the current behavior (12 concurrent blastn processes x 12 threads each) is significantly slower than running 12 blastn jobs one after another with 12 threads each, because of contention.
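To make the arithmetic concrete, here is a minimal sketch of how the requested thread count grows when one value is reused for both settings. The function names are hypothetical and are not part of LS-BSR; the sketch only multiplies the numbers described above.

```python
def nominal_threads(num_workers, threads_per_worker):
    """Total threads requested when every worker starts its own threaded job."""
    return num_workers * threads_per_worker


def oversubscription(num_workers, threads_per_worker, available_threads):
    """Ratio of threads requested to threads the machine actually has."""
    return nominal_threads(num_workers, threads_per_worker) / available_threads


# One value reused for both the worker count and -num_threads.
processors = 12
requested = nominal_threads(processors, processors)
ratio = oversubscription(processors, processors, available_threads=12)

print(f"threads requested: {requested}")
print(f"oversubscription factor: {ratio}")
```

With processors set to 12 on a 12-thread machine, the request is 144 threads, twelve times what the hardware can run at once.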

Ideally this flag would be split into -t (threads) and -p (processors), with -t setting the per-process thread count and -p setting the number of parallel processes to run.
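A rough sketch of what the split could look like with argparse. This is an assumption about one possible layout, not LS-BSR's actual option parsing: the long option names, the defaults, the helper blastn_command, and the file names are all hypothetical. Only -num_threads (already used by LS-BSR) and the standard blastn options -query, -db and -out are real.

```python
import argparse


def build_parser():
    """Hypothetical parser that separates thread count from process count."""
    parser = argparse.ArgumentParser(description="sketch of split -t / -p options")
    parser.add_argument("-t", "--threads", type=int, default=1,
                        help="threads each external program may use")
    parser.add_argument("-p", "--processors", type=int, default=1,
                        help="number of parallel processes to run")
    return parser


def blastn_command(query, db, out, threads):
    """Build a blastn argument list that uses the per-process thread count."""
    return ["blastn", "-query", query, "-db", db, "-out", out,
            "-num_threads", str(threads)]


# Example: 3 parallel workers, each blastn limited to 4 threads (12 in total).
args = build_parser().parse_args(["-t", "4", "-p", "3"])
cmd = blastn_command("genes.fasta", "genome_db", "hits.out", args.threads)
print(cmd)
print("workers:", args.processors)
```

Here args.processors would be passed as the worker count and args.threads as -num_threads, so the total load is the product of the two values and can be kept at or below the number of threads on the machine.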

jasonsahl commented 8 years ago

Good point. I had blast only using a single processor, but changed the behavior when I switched over to blast+. I will change back to only a single processor in the next push. Thanks for noticing this.