DirkRemmers opened this issue 2 years ago
Hi @DirkRemmers,
interesting! I don't have much experience with `multiprocessing.Pool`, but I would be happy to dig into this a bit, as it could be relevant for my image analysis work as well. For that, it would be good to have a working, runnable example. Your code is almost there, but unfortunately I can't "just" run it. Example data could be something like `np.random.random((1024, 1024, 100))`. Feel free to upload a Jupyter notebook somewhere on GitHub that I can clone, modify, and (hopefully) send back a PR with useful feedback.
Best, Robert
Hi Robert,
Thanks for the quick response! I made this repository on github for you to clone. Hopefully this is complete and clear enough to use, if not please let me know what I can change to improve it.
I went ahead and tried another method for parallel processing called `mpi4py`. It creates its 'Pool workers' (called ranks in `mpi4py`) in a way that makes them behave differently from `multiprocessing` workers. Due to this difference, each rank can access the GPU, allowing me to have the ideal workflow mentioned in the first post of this issue.
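A minimal sketch of that structure, assuming `mpi4py` is installed and the script is launched via `mpirun` (this is not Dirk's actual code; `process_chunk` is a hypothetical placeholder, with the pyclesperanto calls replaced by a NumPy stand-in):

```python
# Hypothetical mpi4py sketch: every rank is a separate process with its own
# interpreter, so each one could open its own GPU context.
import os
import numpy as np

def process_chunk(chunk):
    # placeholder for the per-rank workflow; in the real pipeline each rank
    # could call cle.select_device(...) and run pyclesperanto kernels here
    return chunk - chunk.mean()

def main():
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    data = None
    if rank == 0:
        # rank 0 loads the data and splits it into one chunk per rank
        stack = np.random.random((size, 64, 64))
        data = [stack[i] for i in range(size)]

    chunk = comm.scatter(data, root=0)      # every rank receives one chunk
    result = process_chunk(chunk)           # per-rank processing
    results = comm.gather(result, root=0)   # rank 0 collects the results
    if rank == 0:
        print(f"collected {len(results)} results")

# run with e.g.: mpirun -n 4 python script.py
# (the environment check keeps the script inert outside an MPI launch)
if __name__ == "__main__" and ("OMPI_COMM_WORLD_SIZE" in os.environ
                               or "PMI_SIZE" in os.environ):
    main()
```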
Although I now have a solution for my specific problem, this does not solve the core issue with `multiprocessing`, so it would still be good if you are willing to look into it. It may turn out to be a `multiprocessing` issue, but even then it would be good to document this for future users who have the idea of using `multiprocessing` together with `pyclesperanto`.
Cheers, Dirk
@DirkRemmers
I have tried similar parallel processing with the `multiprocessing` pool and the `pathos.pool` module.
With `multiprocessing` it complains about an incompatibility with `pickle`, which drove me to a `pathos` parallel run.
Even when you can start a multithreaded run with `pathos`, CLE exhibits severe blocking and does not speed up the analysis at all.
From your experience, is `mpi4py` the only solution that can fix this problem? The attached link is broken, so I cannot refer to your previous work. Would you mind uploading the code again?
@jackyko1991
Thanks for sharing your experience as well. I have not tried out `pathos` myself, but it is good to know it does not improve the situation.
Of all the things I tried, only `mpi4py` has been successful. I'm not sure what happened to the link, but I'll try to find some time to write up another example for you. You'll see it appear here when it's done!
Cheers, Dirk
@jackyko1991
I had some time to get a quick example up and running. You can find the new repo here. This time I'll try not to break it 😄.
The code I put in there is a minimal use case showing how to use `mpi4py`. Its methods go far beyond this, and the code could most likely be optimized to minimize memory usage. If you find this method useful, please make sure to check out the `mpi4py` documentation for more advanced use cases.
Cheers, Dirk
@DirkRemmers Great job! I have tested both Open MPI and MPICH in an Ubuntu environment. They both worked nicely.
Should we somehow integrate the MPI workflow into pyclesperanto as a batch processing example?
Hi all,
First of all, I want to say what a super nice package this is; thanks for the hard work! Second, I am not sure whether this is an issue on the pyclesperanto side or in the `multiprocessing` package, but I thought it would be good to report it here in any case.
At the moment I'm trying to set up a high-throughput image analysis pipeline. Some of the functionality needs CPU processes, while for other steps I want to use the GPU. To speed up the CPU processes, I would like to use `multiprocessing` to make full use of my workstation. However, I may have found an issue while trying to design a script that uses both at the same time.
When I have the GPU processing outside of the `multiprocessing` functionality, everything works nicely. A simplified example:
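The example code block did not survive here, so the following is a hypothetical sketch of the described structure: CPU-bound steps run in a `multiprocessing.Pool`, and the GPU step runs afterwards in the main process. `cpu_step`, `gpu_step`, and `run_pipeline` are made-up names, and the GPU step is stood in by a NumPy call where the real pipeline would call pyclesperanto:

```python
# Hypothetical sketch (not the original script): CPU-bound work runs in a
# multiprocessing Pool, GPU work runs afterwards in the main process only.
import numpy as np
from multiprocessing import Pool

def cpu_step(image):
    # placeholder for the CPU-bound part of the pipeline
    return image - image.mean()

def gpu_step(image):
    # placeholder for the GPU part; in the real pipeline this would be a
    # pyclesperanto call such as cle.pull(cle.gaussian_blur(cle.push(image)))
    return np.clip(image, 0.0, None)

def run_pipeline(images):
    # CPU-bound steps in parallel
    with Pool(processes=4) as pool:
        cpu_results = pool.map(cpu_step, images)
    # GPU steps run sequentially, outside the Pool -- this setup works fine
    return [gpu_step(r) for r in cpu_results]

if __name__ == "__main__":
    stack = [np.random.random((64, 64)) for _ in range(8)]
    results = run_pipeline(stack)
    print(f"processed {len(results)} images")
```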
However, given the file sizes I'm going to process, I'm afraid this solution is not scalable, since it would require storing a lot of data before processing it on the GPU. The data required for the GPU process is relatively large but can be discarded once the GPU process is finished (I'm only interested in its result). After the GPU process has been performed I am able to save my data efficiently, but saving the GPU input is not efficient and would require some inelegant data handling afterwards. Ideally, I would like to streamline my script so that I can use the GPU functionality from within the process of a Pool worker. This way, I can avoid saving the large intermediate data required for the GPU processing. I included a lock to prevent multiple workers from trying to access the GPU at the same time. A simplified example:
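Again the original code block is missing, so this is a hypothetical sketch of the structure described: the GPU step now runs inside the Pool workers, serialized by a `Lock`. The function names are made up, and the pyclesperanto calls are replaced by a NumPy stand-in so the sketch runs anywhere; in the reported setup, real `cle` calls at the marked spot are what freeze:

```python
# Hypothetical sketch (not the original script): GPU work moved inside the
# Pool workers, with a Lock so only one worker touches the GPU at a time.
import numpy as np
from multiprocessing import Pool, Lock

gpu_lock = None

def init_worker(lock):
    # hand every worker a reference to the shared lock
    global gpu_lock
    gpu_lock = lock

def process_image(image):
    cpu_result = image - image.mean()  # CPU-bound part, runs in parallel
    with gpu_lock:
        # only one worker at a time reaches this section; in the real
        # pipeline the pyclesperanto calls would go here, e.g.:
        #   gpu_in = cle.push(cpu_result)
        #   result = cle.pull(cle.gaussian_blur(gpu_in))
        result = np.clip(cpu_result, 0.0, None)  # NumPy stand-in
    return result

def run_pipeline(images):
    lock = Lock()
    with Pool(processes=4, initializer=init_worker, initargs=(lock,)) as pool:
        return pool.map(process_image, images)

if __name__ == "__main__":
    stack = [np.random.random((64, 64)) for _ in range(8)]
    results = run_pipeline(stack)
    print(f"processed {len(results)} images")
```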
The issue with this approach is that any pyclesperanto call causes the Pool worker to freeze; it cannot get past any of the `cle` functions. Even selecting a GPU via the
`cle.select_device('RTX')`
line does not work. There are no error messages for me to include in this issue; the script just seems to get stuck while trying to perform a pyclesperanto-related function. Obviously the example code is drastically simplified, otherwise I could do the entire thing on the GPU. If a full script is required, please let me know and I can send all the necessary example files.
Thanks in advance, I'm looking forward to your response.