Any update? I really need help here. Do you have an easy way to prove that the particles are being split as they are supposed to be (k-means clustering per processor)? Thank you very much in advance.
Do you have `mpi4py` installed? Parcels won't ensure this is the case, even if you're running under MPI.
You could print `pset.size` and check that it sums to the total number of particles you expect to be spawned, and that the particles are distributed across the processors.
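For instance, a minimal sketch (not from the thread) of how one might report the split, assuming `pset` is an existing `parcels.ParticleSet` and that `pset.size` is the per-rank particle count mentioned above:

```python
# Sketch only: report how many particles each MPI rank owns and the total.
from mpi4py import MPI

def report_particle_split(pset):
    """Print this rank's particle count and the sum over all ranks.

    `pset` is assumed to be a parcels.ParticleSet; `pset.size` is taken to be
    the number of particles held by the calling rank.
    """
    comm = MPI.COMM_WORLD
    local = pset.size
    total = comm.allreduce(local)  # default reduction op is SUM
    print(f"rank {comm.Get_rank()}/{comm.Get_size()}: {local} particles "
          f"(total across ranks: {total})")
```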
Hi @olmozavala, sorry for the delay in responding. Another way to check whether you are running Parcels in parallel is to check the file sizes of the output files in the temporary directory. Each core of `mpirun` writes into its own sub-directory of `out-*/`, so if you have 2 cores you will see the subdirectories `out-*/0/` and `out-*/1/`.
Since each core should write only half the particles, the files in these directories should be only half the size of what they are in a single-processor run. If they're the same size as in the single-processor run, that's a clear sign that MPI is not working.
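If it helps, here is a rough, hypothetical helper (not part of Parcels) for comparing the sizes of those per-rank subdirectories:

```python
# Hypothetical helper: print the total size of each per-rank output
# subdirectory (out-*/0/, out-*/1/, ...) so different runs can be compared.
import glob
import os

def report_output_dir_sizes(pattern="out-*"):
    for rank_dir in sorted(glob.glob(os.path.join(pattern, "*"))):
        if not os.path.isdir(rank_dir):
            continue
        nbytes = sum(
            os.path.getsize(os.path.join(root, name))
            for root, _, files in os.walk(rank_dir)
            for name in files
        )
        print(f"{rank_dir}: {nbytes / 1e6:.1f} MB")

if __name__ == "__main__":
    report_output_dir_sizes()
```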
Note that MPI support in Parcels is still somewhat 'under development', so it might not work as smoothly as we hope. But please let us know how you're getting on; we'll try to be more responsive!
Thank you very much @angus-g and @erikvansebille. `mpi4py` is installed on the HPC but somehow was not installed on my local machine. I tested again from scratch on my local PC and on the HPC, and I do see the splitting of particles happening on my machine, but not on the HPC. I even added an import of `mpi4py` to be sure that it is installed.
Here is an output from the HPC (I get the same warnings on my local machine, but on the HPC each process uses the total number of particles, 32300, and crashes when trying to save the output).
I made a simpler test case that you can find here: https://github.com/olmozavala/ocean_litter/blob/master/SimplestTest.py
The result on my personal computer is as expected: particles are grouped by proc.
On the HPC I have tried different `mpirun` options, but I am not able to make it work. Do you have any experience with submitting jobs to a cluster?
I wonder if maybe `mpi4py` hasn't been built correctly against your HPC's MPI implementation? I'm not sure how you installed it, but there are instructions in their documentation that show how to make sure you're specifying the right MPI compilers, etc.
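One quick way to check (just a sketch) is to ask `mpi4py` which MPI library it was built against and compare that with the MPI module you load on the HPC:

```python
# Print which MPI library mpi4py is actually linked against.
from mpi4py import MPI

print(MPI.Get_library_version())  # full version string of the linked MPI library
print(MPI.get_vendor())           # e.g. ('Open MPI', (3, 1, 4))
print(MPI.Get_version())          # MPI standard version, e.g. (3, 1)
```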
Hmm, very strange. @angus-g's suggestion of checking the `mpi4py` installation is a good one. Another option could be that `kmeans` is not installed on the HPC? Parcels needs kmeans for the partitioning, see https://github.com/OceanParcels/parcels/blob/master/parcels/particleset.py#L175
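A quick, hedged sanity check (the linked code suggests the partitioning relies on scikit-learn's `KMeans`; adjust the import if your Parcels version uses a different package):

```python
# Check that the k-means dependency used for MPI partitioning is importable.
try:
    from sklearn.cluster import KMeans  # assumed dependency, per the linked code
    print("KMeans import OK:", KMeans)
except ImportError as err:
    print("KMeans not available:", err)
```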
> Another option could be that kmeans is not installed on the HPC
That should raise an `EnvironmentError` though, right? https://github.com/OceanParcels/parcels/blob/master/parcels/particleset.py#L27-L32
> That should raise an `EnvironmentError` though, right?
True, but it's still worth checking as a last resort?
There is another point: scheduling systems. On some systems (e.g. Cartesius), MPI is fully controlled by the cluster-wide scheduling system, such as SLURM or SGE. If the environment is set up to only allow multi-processing within the scheduler, running via `mpiexec` would not get you anywhere.
I will create a very small, very quick `mpi4py` test script that just prints out how many processors are used. If that script works, then we need to investigate further. If it fails, then it's a problem with starting an MPI-based script on your cluster.
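Something along these lines (a sketch of the idea; the actual script may differ). If the cluster is only spawning independent copies of the program, every process will report a world size of 1:

```python
# mpi_hello.py -- minimal mpi4py test: each rank reports itself, and rank 0
# confirms how many processes are really communicating.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

print(f"Hello from rank {rank} of {size}")

all_ranks = comm.gather(rank, root=0)  # collect every rank id on rank 0
if rank == 0:
    print(f"{size} processes are communicating; ranks seen: {sorted(all_ranks)}")
```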
For more information on the subject, see this guide: https://slurm.schedmd.com/mpi_guide.html
It discusses the different MPI-SLURM setup modes. If the cluster is set up in mode 1 (which is common), then `mpirun` or `mpiexec` won't do much: they would just spawn N separate jobs, with no communication and no resource sharing.
In other words: don't use `mpiexec`. Create a shell script for `sbatch` and run the Python program via `srun`. N, the number of nodes, is then part of the shell script or of the `sbatch` start command, not of the `mpiexec`/`srun` invocation.
This file now contains a Python test script for MPI that just accumulates the PE ids in an MPI manner. In addition, there are two shell scripts: one that invokes the script in an `mpiexec` environment, and one that does it via SLURM. Thus: locally the `mpiexec` shell script should succeed, whereas on your HPC cluster I assume it will not output the expected accumulation. On the HPC cluster, on the other hand, the SLURM shell script should give the expected output. Looking forward to your feedback.
This is awesome, thank you very much for the help.
You found the error: I was running my script with a shell script for `sbatch`, but I was still using `mpirun` inside of it rather than `srun`. This has solved the problem.
Thank you very much for all your help.
Hello, I'm running a model for ocean litter and, because it takes days to finish, I am trying to use MPI. After some tests, I realized that the time it takes to finish doesn't improve when running it in parallel.
I see that several processes are launched, but I'm timing each run and the performance doesn't improve (sometimes it even gets worse). Here are some times that I get by running a test example with multiple processors or with just 1:
------- With diffusion ---------

Run for 1 day, 10 procs (times per chunksize setting):

| Auto | 1024 | 2048 |
|------|------|------|
| 330  | 445  | 346  |
| 325  | 495  | 346  |
| 325  | 480  | 347  |
| 326  | 500  | 354  |
| ...  | ...  | ...  |

Run for 1 day, 1 proc:

| Auto | 1024 | 2048 |
|------|------|------|
| 240  | 268  | 275  |

------- Without diffusion ---------

Run for 1 day, 3 procs:

| Auto | 1024 |
|------|------|
| 288  | 289  |
| 289  | 287  |
| 299  | 284  |

Run for 1 day, 1 proc:

| Auto | 1024 | 2048 |
|------|------|------|
| 250  | 281  | 263  |
Do I have to do something besides running it with `mpirun -np # MYFILE.py`? I have also been playing with the chunksize, because I was also getting out-of-memory errors on the HPC (which is mentioned in another issue). On the HPC I have Parcels 2.1.5 and at home I have 2.1.4.
A second problem is that saving the trajectories to a file with `out_parc_file.export()` (which works perfectly with a single proc) fails when running under MPI because of 'permission denied' (multiple processes are trying to save the same file at the same time). Is this the right way to save the results when running in parallel? My code is at https://github.com/olmozavala/ocean_litter and the test file is `T_WorldLItter.py`. If you are interested in running it, I can make an effort to build a self-contained test. I'll really appreciate your help. Thanks
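For the 'permission denied' problem, one possible workaround (a sketch only, not necessarily the officially recommended Parcels approach) is to let a single rank perform the export once all ranks have finished writing:

```python
# Workaround sketch: export from one rank only to avoid concurrent writes.
from mpi4py import MPI

def export_on_rank0(particle_file):
    """Export trajectories from MPI rank 0 only.

    `particle_file` stands in for the ParticleFile called `out_parc_file`
    in the text above.
    """
    comm = MPI.COMM_WORLD
    comm.Barrier()              # make sure every rank finished writing
    if comm.Get_rank() == 0:    # only one process touches the final file
        particle_file.export()
    comm.Barrier()              # keep ranks in sync afterwards
```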