fmurgia closed this issue 3 years ago
Also, notice all the time spent as "kernel" (red bars)! Why does the OS need all these CPU cycles?
Hi @fmurgia,
I think the multithreading is related to how R is installed on your cluster. When R is installed with OpenBLAS (https://cran.r-project.org/doc/manuals/r-devel/R-admin.html#OpenBLAS), it automatically uses multiple threads, even for simple linear algebra operations. I found this post: https://stackoverflow.com/questions/45794290/in-r-how-to-control-multi-threading-in-blas-parallel-matrix-product You may try following it to control the number of threads in R.
Thanks, Wei
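The thread-control idea in the reply above can be sketched as follows. This is a minimal sketch, assuming R is linked against OpenBLAS, so that the `OPENBLAS_NUM_THREADS` environment variable applies; whether that is true of a given cluster installation is an assumption, which is why the sketch also prints the BLAS/LAPACK libraries R reports when `Rscript` is available.

```shell
# Cap the number of threads OpenBLAS may use in any R process
# started from this shell (assumes R is linked against OpenBLAS).
export OPENBLAS_NUM_THREADS=1

# Also cap OpenMP, in case the BLAS library was built with OpenMP threading.
export OMP_NUM_THREADS=1

echo "OPENBLAS_NUM_THREADS=$OPENBLAS_NUM_THREADS OMP_NUM_THREADS=$OMP_NUM_THREADS"

# Optional check: show which BLAS/LAPACK libraries R is actually using.
# Skipped silently when Rscript is not on the PATH.
if command -v Rscript >/dev/null 2>&1; then
  Rscript -e 'sessionInfo()' | grep -i -E 'blas|lapack' || true
fi
```

The variables must be exported before R starts, because the BLAS library typically reads them when it is loaded.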
Hi Wei, thanks for your help. It makes sense.
Best,
Federico
Hi Federico,
Thanks for the question! Here is a command that forces the Step 2 job to use only one CPU:
export MKL_NUM_THREADS=1; export MKL_DYNAMIC=false; export OMP_NUM_THREADS=1; export OMP_DYNAMIC=false;
Actually, using multiple threads slows down Step 2. Adding this command solves the problem.
Best, Wei
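As an alternative to exporting the variables for the whole session, the same settings can be scoped to a single command. The sketch below uses the four variables from the reply above; the helper name `run_single_threaded` is hypothetical and not part of SAIGE.

```shell
# run_single_threaded: hypothetical helper (not part of SAIGE) that runs
# the given command with the thread-limiting variables set only for that
# command, leaving the rest of the shell session unchanged.
run_single_threaded() {
  env MKL_NUM_THREADS=1 MKL_DYNAMIC=false \
      OMP_NUM_THREADS=1 OMP_DYNAMIC=false \
      "$@"
}

# Confirm that the variables reach the child process.
run_single_threaded sh -c 'echo "OMP=$OMP_NUM_THREADS MKL=$MKL_NUM_THREADS"'
```

To use it with Step 2, prefix the usual invocation with the helper, for example `run_single_threaded step2_SPAtests.R` followed by the same options as before.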
I'm running Step 2 using this command:
step2_SPAtests.R \
    --bgenFile=chr6_v3.bgen \
    --bgenFileIndex=chr6_v3.bge.bgi \
    --minInfo=0.3 \
    --minMAC=5 \
    --sampleFile=chr6.sample \
    --GMMATmodelFile=step1_Pheno.rda \
    --varianceRatioFile=step1_Pheno.varianceRatio.txt \
    --SAIGEOutputFile=Pheno.SAIGE.chr6.txt \
    --numLinesOutput=20000 \
    --IsOutputAFinCaseCtrl=TRUE \
    --LOCO=FALSE \
    --IsDropMissingDosages=TRUE
It seems that this step uses all the CPUs available on my server (64). Is there a way to set the number of CPUs used in Step 2? I'm using SAIGE v0.43.3, installed with conda.