Teichlab / GPfates


Parallel #4

Closed. rcannood closed this issue 7 years ago.

rcannood commented 7 years ago

Hello Teichlab!

I'm trying to run GPfates on a single thread, but the dimensionality reduction and pseudotime inference always seem to use up all of my CPU cores. How can I convince GPfates/GPy to use only one core?

Kind regards, Robrecht

vals commented 7 years ago

Hi,

I think this depends on your NumPy / SciPy installation. Depending on how the underlying linear algebra libraries (BLAS, MKL, etc.) are compiled, they sometimes automatically use all your cores.
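If you want to check, NumPy itself can tell you which BLAS/LAPACK it was built against; something like this (plain NumPy, nothing GPfates-specific) prints the build info:

```python
# Print the BLAS/LAPACK build information for this NumPy installation.
# If MKL or OpenBLAS shows up here, its internal threading is the most
# likely reason all cores get used.
import numpy as np

np.show_config()
```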

GPy, which GPfates is built on, can also enable OpenMP, but that is a custom setting you won't have turned on unless you specifically looked up how to do so: https://github.com/SheffieldML/GPy/blob/devel/GPy/defaults.cfg

rcannood commented 7 years ago

Dear Valentine,

Thanks for the quick response. Are you suggesting I create such a file in my home directory? Would this solve my problem?

Kind regards, Robrecht

vals commented 7 years ago

Hi Robrecht,

No, creating such a file is what you would need to do to activate OpenMP. If you don't have such a file, GPy will not be parallelised.

So more likely your BLAS or LAPACK (the linear algebra libraries NumPy uses internally) was compiled with MKL support. According to this Stack Overflow question there are environment variables you can set to stop NumPy from multithreading: https://stackoverflow.com/questions/17053671/python-how-do-you-stop-numpy-from-multithreading
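Something like this at the very top of your script should work (untested sketch; the variable names come from that answer, and they have to be set before NumPy is first imported):

```python
# Sketch based on the linked Stack Overflow answer: limit the threading
# libraries to one thread. These variables are only read when the libraries
# are loaded, so set them before the first `import numpy`.
import os

os.environ["MKL_NUM_THREADS"] = "1"
os.environ["NUMEXPR_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np  # imported only after the variables are set
```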

/Valentine

rcannood commented 7 years ago

Thanks!

Strangely enough, the code appears to run quite a lot faster on a single thread.

On 8 threads:

```
> python3 script.py
Dimensionality reduction--------------------------------------
Running L-BFGS-B (Scipy implementation) Code:
  runtime   i      f              |g|
    00s09  0002   1.086336e+04   6.178489e+06
    00s13  0003   7.892411e+03   1.000589e+08
    01s19  0022  -3.099881e+02   9.746219e+04
    02s21  0046  -5.581056e+02   1.729447e+03
    06s32  0134  -6.825396e+02   5.172912e+02
    14s49  0306  -7.092430e+02   5.785915e+01
    43s32  0900  -7.173233e+02   1.399536e+00
    50s57  1041  -7.174891e+02   3.399435e-01
    55s78  1148  -7.174984e+02   1.081934e-02
 01m01s92  1286  -7.175001e+02   1.541642e-03
 01m02s32  1293  -7.175001e+02   5.559389e-04
Runtime:  01m02s32
Optimization status: Converged
```

On 1 thread:

```
> export MKL_NUM_THREADS=1
> export NUMEXPR_NUM_THREADS=1
> export OMP_NUM_THREADS=1
> python3 script.py
Dimensionality reduction--------------------------------------
Running L-BFGS-B (Scipy implementation) Code:
  runtime   i      f              |g|
    00s02  0008   3.724819e+03   5.036302e+06
    00s04  0013   5.032138e+02   3.607786e+04
    00s09  0028  -3.555146e+02   1.262350e+04
    00s14  0045  -5.245055e+02   1.271710e+03
    00s19  0061  -6.178002e+02   2.134814e+03
    01s19  0390  -7.258968e+02   3.617390e+00
    03s20  1048  -7.600852e+02   1.984754e-01
    03s74  1225  -7.600966e+02   1.835631e-03
Runtime:     03s74
Optimization status: Converged
```

Can you confirm this behaviour?

vals commented 7 years ago

Wow, that's really weird!

Can you find out exactly which of the multithreading options is making it slower: MKL, NUMEXPR, or OMP? There's probably an issue with one of those in your installation.
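As a rough sketch (assuming your script is still called script.py, as in your logs), you could time a run with each variable set on its own:

```python
# Rough sketch: run script.py once per thread-limiting variable, with only
# that variable set to 1, and time each run. The variable names are the ones
# from the Stack Overflow answer above; "script.py" is your own script.
import os
import subprocess
import time

for var in ["MKL_NUM_THREADS", "NUMEXPR_NUM_THREADS", "OMP_NUM_THREADS"]:
    env = os.environ.copy()
    env[var] = "1"
    start = time.time()
    subprocess.run(["python3", "script.py"], env=env, check=True)
    print(f"{var}=1 -> {time.time() - start:.1f}s")
```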

I've noticed that everything SciPy-related is very slow on our compute cluster compared to my PC; maybe this is why.

/Valentine