Open thesamovar opened 9 years ago
How do you want to use multiple cores exactly?
On Monday 27 April 2015, Dan Goodman notifications@github.com wrote:
In the old KlustaKwik there was some, but not a huge, benefit to using multiple cores because the problem was memory-bandwidth limited. However, in KK2 the memory usage is reduced by orders of magnitude (especially for larger problems), so we might well see much better speed improvements from multiple processors.
There is a technical issue. As far as I know, Numba does not support multiple processors except through the vectorize decorator (and then only in the 'pro' version), which is not something we can use in KK2. I don't see any way around this. This might mean we have to stick to Cython.
@rossant https://github.com/rossant any thoughts?
The main one is in the E-step. We have a key loop which, for each cluster, involves iterating over all spikes. I use an OpenMP parallel for over this inner loop over spikes in the C++ version. I'd like to do the equivalent in the Python version.
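As a rough illustration of the loop structure described above (not the actual KK2 code), here is a minimal Python sketch: the outer loop over clusters stays serial and the inner loop over spikes is split into contiguous chunks, one per worker. The names `score_chunk` and `e_step_scores` and the squared-distance score are hypothetical placeholders. Note that in CPython, threads will not speed up pure-Python arithmetic because of the global interpreter lock, so this only shows the shape of the parallel for; the real speedup comes from OpenMP in compiled code.

```python
from concurrent.futures import ThreadPoolExecutor


def score_chunk(spikes, cluster_mean, start, stop):
    # Hypothetical per-spike score: squared distance to the cluster mean.
    return [
        sum((x - m) ** 2 for x, m in zip(spikes[i], cluster_mean))
        for i in range(start, stop)
    ]


def e_step_scores(spikes, cluster_means, num_workers=4):
    n = len(spikes)
    # Contiguous, non-overlapping chunks of spike indices, one per worker.
    bounds = [
        (w * n // num_workers, (w + 1) * n // num_workers)
        for w in range(num_workers)
    ]
    scores = []
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for mean in cluster_means:  # outer loop over clusters stays serial
            futures = [
                pool.submit(score_chunk, spikes, mean, a, b) for a, b in bounds
            ]
            row = []
            for f in futures:  # collect chunks in spike order
                row.extend(f.result())
            scores.append(row)
    return scores
```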
Maybe we can use this feature to implement a parallel for loop with Numba?
Note to myself: to do this in Cython using OpenMP, we don't have access to the keyword that makes a copy of the variable for each thread, but we can allocate them in a list/array of variables and then access them using the thread index.
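The workaround in that note can be sketched in plain Python, assuming a simple summation as the per-thread work: instead of a per-thread private variable, allocate one slot per worker up front and have each worker touch only its own slot. Here `worker_id` is just the chunk index, standing in for the OpenMP thread index, and `accumulate_chunk` and `parallel_sum` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def accumulate_chunk(values, start, stop, scratch, worker_id):
    # Each worker writes only to scratch[worker_id], so no locking is needed.
    for i in range(start, stop):
        scratch[worker_id] += values[i]


def parallel_sum(values, num_workers=4):
    n = len(values)
    # One accumulator per worker, replacing a per-thread private variable.
    scratch = [0.0] * num_workers
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [
            pool.submit(
                accumulate_chunk,
                values,
                w * n // num_workers,
                (w + 1) * n // num_workers,
                scratch,
                w,
            )
            for w in range(num_workers)
        ]
        for f in futures:
            f.result()  # wait for completion and surface any exception
    # Combine the per-worker partial results after the parallel section.
    return sum(scratch)
```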
Do you think Numba will let us use multiple CPUs here?
I think it can be done but might be simpler using Cython. Am happy to switch to Numba but since everything is in Cython at the moment I'll stick with that for now. The big advantage of Numba to me would be that I wouldn't have to type all the variables explicitly, and we could mix and match arrays with different dtypes (e.g. float32, float64, int16, int32, int64). This is possible in Cython but gets complicated when you have multiple arrays each of which could have different dtypes.
OK this is done for the E-step now and it works pretty well. I'll leave it open in case we want to do the M-step too, but the E-step is most of the work.
Is it possible to set the number of threads that klustakwik will use? Right now it's using all of my physical and virtual CPUs, and I'd like to be able to specify how many if possible. I'm using it through phy and have OMP_NUM_THREADS=1 set. Thanks!
I'll look into this, I created a new issue #67 that you can follow if you want.
OK I fixed this. It was indeed ignoring OMP_NUM_THREADS but it was by design (long story). I've added a new parameter num_cpus which you can set to the number of CPUs you want to use. This is now in the current git master branch.
Great. Just to make sure I understand: to use this, I add "num_cpus=12" to the klustakwik2 dictionary of my prm?
Yes, if you have the latest version of KK2.
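For reference, a minimal sketch of what that could look like, assuming the prm file is a Python file containing a `klustakwik2` dictionary as described above. Any other entries in your own file stay as they are; only `num_cpus` comes from this thread.

```python
# Hypothetical excerpt from a .prm file; other parameters omitted.
klustakwik2 = dict(
    num_cpus=12,  # number of CPUs KK2 should use
)
```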
On 15/07/2015 21:15, Chris Wilson wrote:
Great. Just to make sure I understand: I can now add "num_cpus=12" as a kk parameter to my prm file?
Note that others have reported a bug in phy where KK2 params were not properly taken into account; this should be fixed this week.