Momi and multiprocessing

noscode commented 2 years ago

Hi,

I am running some experiments with momi and noticed strange behavior. I intended to run several processes that evaluate the log-likelihood from one Python script using multiprocessing. I found that the time of a single log-likelihood evaluation grows as the number of processes in the multiprocessing pool increases.

For example, if I evaluate the log-likelihood without multiprocessing, or with a pool of one process, the mean time is 0.5 sec. With a pool of two processes, the mean time of one log-likelihood evaluation becomes 1.2 sec, and with four processes it is 2 sec, so each evaluation slows down. I tested two clusters; the numbers above are from a cluster with 32 cores. On a faster cluster with 96 cores the slowdown is even stronger: 0.2 sec for 1 process, 0.6 sec for 2 processes, and 1.8 sec for 4 processes.
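
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a timing harness. It is not the original script: `timed_eval` and `mean_eval_time` are hypothetical names, and the pure-Python loop is only a stand-in for a momi log-likelihood evaluation.

```python
import time
from multiprocessing import Pool


def timed_eval(n):
    """Hypothetical stand-in workload; returns wall-clock seconds for one call."""
    start = time.perf_counter()
    # Placeholder computation: replace with one log-likelihood evaluation.
    sum(i * i for i in range(n))
    return time.perf_counter() - start


def mean_eval_time(n_procs, n_tasks=8, work=50_000):
    """Run n_tasks evaluations in a pool of n_procs workers and
    return the mean per-evaluation time measured inside the workers."""
    with Pool(processes=n_procs) as pool:
        times = pool.map(timed_eval, [work] * n_tasks)
    return sum(times) / len(times)


if __name__ == "__main__":
    for n_procs in (1, 2, 4):
        print(n_procs, mean_eval_time(n_procs))
```

Timing inside each worker, rather than around the whole `pool.map` call, measures the cost of one evaluation, which is the quantity reported above.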

My guess is that the cause is the autograd package (I profiled the code to see which parts are slow and ended up suspecting autograd). I suspect that autograd tries to use all available cores for gradient evaluation and does not take other processes into account. Then, when I run several processes, each of them uses all cores and they compete with one another. Am I correct? I searched for this and found that it is a known issue for torch.autograd, which can be solved by setting set_num_threads to the desired value. But I have not found such a setting for the autograd used in momi. Do you know whether there is such an option?

Additionally, I was interested in the evaluation times of two models: 1) with model parameters and 2) without parameters (with fixed values). I did not notice a large difference between those times, and the multiprocessing issue affects both of them. That seems strange to me, because I have no idea what autograd would be doing for the model without parameters, as there is no gradient to evaluate.

Any help would be appreciated. I can send you the script that I used.

Best regards, Ekaterina

jackkamm commented 2 years ago

Try setting OMP_NUM_THREADS=1 as described here:

https://momi2.readthedocs.io/en/latest/parallel.html

The multicore usage is not autograd-specific; it also comes from numpy and from some custom OpenMP code within momi.
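
A minimal sketch of applying this from inside a Python script rather than the shell. The `evaluate` function is a hypothetical stand-in for a log-likelihood call, and the two extra variables (`MKL_NUM_THREADS`, `OPENBLAS_NUM_THREADS`) are an assumption that only matters if numpy is linked against MKL or OpenBLAS; the linked docs mention only `OMP_NUM_THREADS`.

```python
import os

# Set thread limits BEFORE importing numpy, autograd, or momi:
# OpenMP/BLAS runtimes typically read these variables once, at load time.
os.environ["OMP_NUM_THREADS"] = "1"
# Assumption: only relevant for numpy builds that use MKL or OpenBLAS.
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import multiprocessing as mp  # numpy / momi imports would also go here


def evaluate(x):
    """Hypothetical stand-in for one log-likelihood evaluation.

    Returns the thread limit seen by the worker along with a dummy result,
    to show that worker processes inherit the environment.
    """
    return os.environ.get("OMP_NUM_THREADS"), x * x


if __name__ == "__main__":
    with mp.Pool(processes=2) as pool:
        results = pool.map(evaluate, range(4))
    print(results)
```

Equivalently, the variable can be set on the command line, e.g. `OMP_NUM_THREADS=1 python script.py`, where `script.py` is a placeholder name. With one thread per worker, the pool size rather than the thread libraries controls how many cores are used.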

noscode commented 2 years ago

Thank you very much, I will try!

popgenmethods / momi2

Momi and multiprocessing #56