ec-jrc / lisflood-calibration

Lisflood OS (Calibration tool)
https://ec-jrc.github.io/lisflood-calibration/

Clarification on Core Consumption for CAL_7A and CAL_7B Calibration on HPC #12

Open Nooshdokht-Bayatafshary opened 3 months ago

Nooshdokht-Bayatafshary commented 3 months ago

Dear Developers,

I am attempting to perform model calibration (CAL_7A and CAL_7B) on an HPC system. According to the system's guidelines, the number of requested cores (ppn) must be greater than or equal to the consumed cores. However, I encountered an issue: when I requested 6 cores and set the input cores for CAL_7A execution to 6 (using python -u CAL_7A_CALIBRATION.py SETTINGS STATION_ID 6), the HPC admin informed me that the requested cores were fewer than the consumed cores.

I am seeking your assistance in estimating or calculating the core consumption of the model during the calibration run for both CAL_7A and CAL_7B. Specifically, I am unsure how the parameters numCPUs_parallelKinematicWave, numCPUs_parallelNumba, and _NCPUS, or any other parameters in CAL_7A, relate to total core usage.

Your guidance on this matter would be greatly appreciated. Thank you for your support.

Best regards, Nooshdokht

doc78 commented 2 months ago

Dear @Nooshdokht-Bayatafshary

CAL7A executes the DEAP algorithm to find the best Lisflood parameters for a specific catchment. To do so, it runs N Lisflood processes in parallel, and each of these processes should run on just one core. For that, you should set numCPUs_parallelKinematicWave to 1 and numCPUs_parallelNumba to 1 (starting from Lisflood version 4.3, only the numCPUs_parallelNumba parameter is required, since the numCPUs_parallelKinematicWave parameter is no longer used). N should be equal to the number of cores, so that you end up filling all the cores.
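As a rough way to reason about the numbers above, here is a minimal sketch of the arithmetic. It assumes that each Lisflood process can keep busy as many cores as its thread setting allows, so the product is an upper-bound estimate and not a measurement. The function name `estimate_cores` is hypothetical and not part of the calibration tool.

```python
def estimate_cores(n_processes: int, threads_per_process: int = 1) -> int:
    """Upper-bound estimate of busy cores (hypothetical helper).

    Assumption: every parallel Lisflood process may use up to
    `threads_per_process` cores at the same time.
    """
    if n_processes < 1 or threads_per_process < 1:
        raise ValueError("both arguments must be >= 1")
    return n_processes * threads_per_process


requested = 6

# 6 parallel processes, each limited to a single thread
single_threaded = estimate_cores(6, threads_per_process=1)

# 6 parallel processes, each allowed 4 threads (illustrative value)
multi_threaded = estimate_cores(6, threads_per_process=4)

print(single_threaded, single_threaded <= requested)
print(multi_threaded, multi_threaded <= requested)
```

With one thread per process the estimate matches the 6 requested cores, while a thread setting above 1 pushes the estimate past the request. The parent Python process that coordinates the runs may add some load on top of this.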

CAL7B is the long-term run and runs only one instance of Lisflood, so you should set numCPUs to the number of available cores, or to "0" to automatically use all the available cores (see also https://github.com/ec-jrc/lisflood-code/blob/fe94c33b6bd9fe09fa7be43a6dd4fdc8a11f744c/src/lisfloodSettings_reference.xml#L166 )
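On a shared HPC node, "all the available cores" may be more than the cores your job requested, so it can help to check what your job actually sees before relying on "0". The following is only an illustrative sketch of the "0 means all cores" convention. It is not how Lisflood resolves the setting internally, and `resolve_num_cpus` is a hypothetical helper.

```python
import os


def resolve_num_cpus(setting: int) -> int:
    """Turn a core-count setting into a concrete number (hypothetical helper).

    A positive value is used as given; 0 means "all cores visible to
    this process".
    """
    if setting < 0:
        raise ValueError("setting must be >= 0")
    if setting > 0:
        return setting
    try:
        # Linux: the cores this process is allowed to run on
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity: fall back to the machine total
        return os.cpu_count() or 1


print(resolve_num_cpus(4))
print(resolve_num_cpus(0))
```

Whether the scheduler restricts the visible cores to your request depends on how the cluster is configured. If the second number printed inside a job is larger than the cores you requested, setting the value explicitly is the safer choice.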

Please note that this is not a general rule: for some small catchments, using a large number of cores (e.g. 128 or more) can actually reduce performance, i.e. the system ends up spending more resources on splitting the jobs than on running the parallelized job itself. So the best performance of a Lisflood simulation run depends heavily on your hardware and your catchment size.
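To illustrate why more cores can be slower, here is a toy cost model. It is purely an assumption for illustration (perfectly divisible work plus a splitting overhead that grows linearly with the core count), with made-up constants. It does not describe Lisflood's real scaling, which you should measure on your own hardware and catchment.

```python
def toy_runtime(n_cores: int, work: float = 100.0,
                overhead_per_core: float = 0.05) -> float:
    """Toy model: divisible work plus a per-core splitting overhead.

    Both constants are illustrative assumptions, not measured values.
    """
    return work / n_cores + overhead_per_core * n_cores


candidates = [1, 2, 4, 8, 16, 32, 64, 128, 256]

# Pick the core count with the smallest modelled runtime
best = min(candidates, key=toy_runtime)

for n in candidates:
    print(n, round(toy_runtime(n), 3))
print("best core count in this toy model:", best)
```

In this toy setting the modelled runtime stops improving at a moderate core count and gets worse beyond it, which is the effect described above.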

Furthermore, since the DEAP algorithm needs to wait for all the population runs of a generation to finish, you can optimize the calibration by using a population size that is equal to or a multiple of the number of cores. E.g. for a 32-core node you can use pop = 64 in the settings file.
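The effect of the population size can be sketched as follows, assuming one run per core and runs of roughly equal duration (real runs will vary). The helper `generation_waves` is hypothetical and only counts how many rounds of runs a generation needs and how many core slots sit idle in the last round.

```python
import math


def generation_waves(pop: int, cores: int) -> tuple[int, int]:
    """Return (waves, idle_slots) for one generation (hypothetical helper).

    Assumption: one run per core, all runs take about the same time.
    """
    if pop < 1 or cores < 1:
        raise ValueError("pop and cores must be >= 1")
    waves = math.ceil(pop / cores)
    idle_slots = waves * cores - pop
    return waves, idle_slots


print(generation_waves(64, 32))  # population is a multiple of the cores
print(generation_waves(70, 32))  # population is not a multiple
```

With pop = 64 on 32 cores the generation takes two full rounds and no slot is idle. With pop = 70 it takes a third round in which most cores wait for the few remaining runs.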

Best Regards Carlo