zhanxw / rvtests

Rare variant test software for next generation sequencing data

strange behaviour of --numThreads #57

Closed by dkainer 5 years ago

dkainer commented 6 years ago

I am running rvtest on a single node of a cluster. It works, but for some reason it runs much slower with --numThreads 16 than with --numThreads 1.

With --numThreads 1 it completed 95,000 regions for one phenotype in about 140 mins.

With --numThreads 16 it completed 95,000 regions for one phenotype in about 570 mins.
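To put the two timings side by side, here is a small sketch that converts them into throughput and a slowdown factor. It uses only the figures reported in this comment; the variable names are illustrative.

```python
# Figures taken from the report above: same 95,000 regions, one phenotype.
regions = 95_000
minutes_1_thread = 140
minutes_16_threads = 570

# Throughput in regions per minute for each run.
throughput_1 = regions / minutes_1_thread
throughput_16 = regions / minutes_16_threads

# How many times longer the 16-thread run took.
slowdown = minutes_16_threads / minutes_1_thread

print(f"1 thread:   {throughput_1:.0f} regions/min")
print(f"16 threads: {throughput_16:.0f} regions/min")
print(f"slowdown:   {slowdown:.1f}x")
```

In other words, the 16-thread run took roughly four times as long as the single-thread run for the same work.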

My command to run rvtest is prefaced with the 'aprun' MPI command, so perhaps I need to compile rvtest with some special flags to use multiple threads properly?

Best, David

zx8754 commented 5 years ago

@dkainer --numThreads is not in the manuals. Is it implemented?

zhanxw commented 5 years ago

@dkainer numThread helps only in certain analysis scenarios. It helps most when there are large matrix computations, so 16 threads can be slower than 1 thread in some cases. What is your use case?
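One way to picture why more threads can be slower is a toy cost model in which the work is split evenly across threads but every extra thread adds a fixed coordination cost. This is only an illustrative sketch: the function `modeled_time`, the overhead value, and the work sizes are hypothetical and are not measurements of rvtests or of OpenMP.

```python
def modeled_time(work: float, threads: int, overhead_per_thread: float) -> float:
    """Toy model: perfectly divisible work plus a fixed cost per extra thread.

    All quantities are in arbitrary time units and are hypothetical.
    """
    return work / threads + overhead_per_thread * (threads - 1)


# Hypothetical coordination cost per additional thread.
OVERHEAD = 0.5

# A small computation (e.g. a tiny matrix) versus a large one.
for label, work in [("small", 1.0), ("large", 1000.0)]:
    t1 = modeled_time(work, 1, OVERHEAD)
    t16 = modeled_time(work, 16, OVERHEAD)
    verdict = "slower" if t16 > t1 else "faster"
    print(f"{label} work: 1 thread = {t1:.2f}, 16 threads = {t16:.2f} ({verdict})")
```

Under these assumed numbers the small job gets slower with 16 threads because the fixed overhead outweighs the tiny amount of divided work, while the large job gets faster. Real behaviour depends on the analysis being run and on the machine.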

zhanxw commented 5 years ago

@zx8754 numThreads uses OpenMP. @dkainer it does not use MPI.

zx8754 commented 5 years ago

@zhanxw Thank you. I just wanted to point out that I couldn't find that option (--numThreads) in the manuals. There is a --thread N option, though. Is it the same?

zhanxw commented 5 years ago

The option --thread N is for vcf2kinship, but --numThread N is for rvtest. Sorry for the inconsistency.

zx8754 commented 5 years ago

--thread N is documented in the manuals (http://zhanxw.github.io/rvtests/) as below:

When dealing with large input files, it is often preferred to use multiple CPU to speed up calculation using the option --thread N in which N is the number of CPU.

But --numThread N is not in the manuals. Could it be added?

zhanxw commented 5 years ago

Yes, I just did. Thanks.