husseinaluie / FlowSieve

FlowSieve coarse-graining code base
https://flowsieve.readthedocs.io/en/latest/

About Parallelization #39

Closed Yizhaofeng closed 4 months ago

Yizhaofeng commented 4 months ago

Hello, I am here with another question! For example, suppose I have Ntime = 30 (daily resolution) and Ndepth = 1. My Slurm submit script is as follows:

#!/bin/bash
#SBATCH --output=sim-%j.out
#SBATCH --error=sim-%j.err
#SBATCH --ntasks=5
#SBATCH --cpus-per-task=64

mpirun -n ${SLURM_NTASKS} ./coarse_grain.x \
    --Nprocs_in_time "5" \

Does this mean that I am using 5 processes with 64 cores each, where each process handles one day of computation at a time? That is, the 5 processes work on 5 days simultaneously, so 30/5 = 6 rounds are needed to complete the run, and each day's computation uses 64 cores?

bastorer commented 4 months ago

Sorry for the slow response!

Yep, your interpretation is correct! In general, the parallelization is more efficient with threading (cpus-per-task) than with MPI (ntasks), so setting cpus-per-task equal to the total number of cores on the physical compute node is usually the most efficient configuration.
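
For anyone checking a similar setup, the arithmetic from the question can be sketched in a few lines of bash. This is only an illustration using the numbers from this thread. The variable names are made up for the example and are not FlowSieve options. Rounding up for an uneven split is also an assumption, not confirmed FlowSieve behaviour.

```shell
#!/bin/bash
# Values taken from the question in this thread; the variable names are
# hypothetical and are not FlowSieve or Slurm settings.
ntime=30            # number of daily time steps (Ntime)
ntasks=5            # MPI processes (--ntasks, also passed as --Nprocs_in_time)
cpus_per_task=64    # threads per MPI process (--cpus-per-task)

# Days handled by each MPI process. Rounding up when ntime is not a
# multiple of ntasks is an assumption made for this sketch.
days_per_task=$(( (ntime + ntasks - 1) / ntasks ))

# Total number of cores requested from Slurm.
total_cores=$(( ntasks * cpus_per_task ))

echo "days per MPI process: ${days_per_task}"
echo "total cores requested: ${total_cores}"
```

With the values above, each MPI process is responsible for 6 days and the job requests 320 cores in total.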