Open rhaegar325 opened 1 month ago
Hi @rhaegar325. This is fairly unusual; I've never seen a performance regression of this size with these environments. There is additional overhead when launching the initial Python process, because a container is launched as well, but once that process is running, all subsequent processes launched by multiprocessing are already inside the container, so they shouldn't take any longer to launch than they would from any other distribution. Could you please upload your script somewhere so I can take a look?
Hi @dsroberts, thanks for your reply. It would be great if you have time to look into my code. Here is the link to my script: https://github.com/ACCESS-NRI/MED-utils/blob/main/access_med_utils/CMORise.py. More specifically, the function run in each subprocess is here (https://github.com/ACCESS-NRI/MED-utils/blob/6f0693fd453f1177ffc3483398e5521fa5fd353a/access_med_utils/CMORise.py#L202), and the multiprocessing part is here (https://github.com/ACCESS-NRI/MED-utils/blob/6f0693fd453f1177ffc3483398e5521fa5fd353a/access_med_utils/CMORise.py#L365). Hope that helps you jump to the point quicker.
After several days of testing, I found that even though `multiprocessing.Pool` creates multiple processes, the total CPU time never exceeds the wall time. So I suspect something is blocking those processes; I'm not sure whether it's I/O or something else.
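That CPU-versus-wall-time observation can be checked with a small self-contained probe. Nothing below is from CMORise.py; the worker is a CPU-bound stand-in, and the idea is just to compare the children's accumulated CPU time against elapsed wall time:

```python
import os
import time
from multiprocessing import Pool

def worker(n):
    # CPU-bound stand-in (not from CMORise.py): if the pool really runs
    # in parallel, the summed child CPU time will exceed the wall time.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    start = time.perf_counter()
    pool = Pool(processes=4)
    pool.map(worker, [2_000_000] * 8)
    pool.close()
    pool.join()
    wall = time.perf_counter() - start
    t = os.times()
    child_cpu = t.children_user + t.children_system
    # child_cpu well above wall  => genuinely parallel;
    # child_cpu at or below wall => workers are mostly blocked.
    print(f"wall={wall:.2f}s child_cpu={child_cpu:.2f}s")
```

If `child_cpu` stays at or below `wall` even for a purely CPU-bound worker like this one, the problem is in process startup or scheduling rather than in the script's own I/O.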
Hi @rhaegar325. It's hard to tell without actually running it myself, but my initial suspicion is that this line (https://github.com/ACCESS-NRI/MED-utils/blob/main/access_med_utils/CMORise.py#L418) is involved. You're running `pool_process` in a loop, which creates and destroys a multiprocessing pool for every path in `s_dic.keys()`. That is a very expensive operation. You're far better off flattening the `s_dic.keys()` loop into the `file_set` list. Something like:

```python
file_set = [j for sub in [glob.glob(non_cmip_path + path) for path in s_dic.keys()] for j in sub]
```

Then do

```python
result = pool_process(mp_newdataset, file_set)
```

on the larger `file_set`. This has the advantage of creating the multiprocessing `Pool` only once. Once that's done, I'd be interested to see the difference between the two environments.
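The pattern being suggested can be sketched end to end as follows. This is a minimal illustration, not the real script: `mp_newdataset` is replaced by a trivial worker, and the glob step is faked with an in-memory dict so the example runs without any files:

```python
from multiprocessing import Pool

def mp_newdataset(path):
    # Trivial stand-in for the real worker in CMORise.py.
    return path.upper()

def pool_process(func, items, nproc=4):
    # The Pool is created exactly once for the whole flattened list,
    # instead of once per s_dic key as in the original loop.
    with Pool(processes=nproc) as pool:
        return pool.map(func, items)

if __name__ == "__main__":
    # Stand-in for the glob-flattening step (fake matches, no filesystem).
    matches = {"a*": ["a1.nc", "a2.nc"], "b*": ["b1.nc"]}
    file_set = [f for key in matches for f in matches[key]]
    print(pool_process(mp_newdataset, file_set))
```

With this shape, the fixed cost of spawning worker processes (which is amplified inside a container) is paid once per run rather than once per key.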
Really appreciate your suggestion @dsroberts, I will give it a try first.
Hi @dsroberts, thanks for your advice on `multiprocessing.Pool`. I have updated that part; the new version is on the branch update CMORise.py and is waiting to be merged.
However, the issue is still there. I tried a couple of approaches; the script does spawn multiple processes, but they seem to be blocked somewhere and I can't tell exactly where.
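One way to find out where a stuck worker is sitting is the standard-library `faulthandler` module. This is a hypothetical diagnostic, not part of CMORise.py: each worker registers a signal handler, and sending `SIGUSR1` to a stuck worker from another shell prints that worker's Python stack to stderr:

```python
import faulthandler
import signal
from multiprocessing import Pool

def init_worker():
    # Hypothetical diagnostic (not from CMORise.py): after this,
    # `kill -USR1 <worker pid>` makes the worker print the Python stack
    # of every thread to stderr, showing exactly where it is blocked.
    faulthandler.register(signal.SIGUSR1, all_threads=True)

def make_pool(nproc=4):
    # Install the handler in every worker via the Pool initializer.
    return Pool(processes=nproc, initializer=init_worker)
```

An external sampler such as `py-spy dump --pid <pid>` can give the same information without modifying the script, if it is available on the system.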
Hi @rbeucher, @dsroberts, @truth-quark:
I have some problems using `multiprocessing` with the modules in `hh5` and `xp65`. My code runs fine in my local mamba env in `kj13`, but when I use the modules in `hh5` and `xp65` it runs really slowly. From my tests it does run in parallel, but each process is much slower than in my local env. Below is some output from my test to make it easier to understand. This is the time cost when it runs in my local mamba env:
I divided the process into two parts: the first part loads data from the `.nc` file, and the second part converts the data format. The output shows the max, min, and average time cost of the first part, the second part, and the whole process. The following is the same code run on `hh5/public/modules/conda_concept/analysis3-24.01` and `xp65/access-med-0.8`. It's far slower than it should be. I also tested running sequentially on the `hh5` module: that works normally, so I think there might be some problem with running `multiprocessing`. Do you have any ideas about this? I'd really appreciate it if you could have a look.
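The two-part timing described above can be instrumented roughly like this. Everything here is a stand-in: `load_nc` and `convert` are placeholders for the real steps in CMORise.py, and only the timing scaffolding is the point:

```python
import time

def load_nc(path):
    # Stand-in for part 1 (reading the .nc file) in the real script.
    return path

def convert(data):
    # Stand-in for part 2 (converting the data format).
    return data

def timed_worker(path):
    # Time each part separately so slowdowns can be localised.
    t0 = time.perf_counter()
    data = load_nc(path)
    t1 = time.perf_counter()
    out = convert(data)
    t2 = time.perf_counter()
    return out, t1 - t0, t2 - t1   # result plus per-part durations

def summarise(durations):
    # max, min and average, as in the reported output.
    return max(durations), min(durations), sum(durations) / len(durations)
```

Collecting the per-part durations returned by each worker and summarising them on the parent side makes it easy to compare the same breakdown across the local env and the `hh5`/`xp65` modules.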