esm-tools / esm_tools

Simple Infrastructure for Earth System Simulations
https://esm-tools.github.io/
GNU General Public License v2.0

Use mpirun on glogin/blogin (SLURM) #1208

Open joakimkjellsson opened 3 months ago

joakimkjellsson commented 3 months ago

Good afternoon all

glogin (GWDG Emmy) has undergone some hardware and software upgrades recently. Since the upgrade, I find jobs launched with srun are considerably slower than jobs launched with mpirun. The support team recommends mpirun. So I'd like to use mpirun.

But I can't work out if ESM-Tools can do it. There is an mpirun.py file with a function to write a hostfile for mpirun, but as far as I can see this function is never used. If we use SLURM, then it seems that ESM-Tools will always build a hostfile_srun and then launch with srun.

My idea would be to have something like this in slurm.py. Line 65 is currently:

write_one_hostfile(current_hostfile, config)

but it should be

if launcher == 'srun':
    write_one_hostfile_srun(current_hostfile, config)
elif launcher == 'mpirun':
    write_one_hostfile_mpirun(current_hostfile, config)
else:
    print(' ESM-Tools does not recognise the launcher ', launcher)
    print(' The launchers supported are srun and mpirun')

and then the two functions would be slightly different. One benefit with mpirun would be that heterogeneous parallelisation becomes very easy since we can do:

mpirun OMP_NUM_THREADS=4 -np 288 ./oifs -e ECE3 : -np 432 ./oceanx : -np 20 ./xios.x 

although I'm not sure and would have to double-check exactly how it should be done on glogin.

Before I venture down this path though, I just want to check: Is it already possible to use mpirun but I'm just too dense to figure out how? If not, is someone else already working on a similar solution?

Cheers Joakim

pgierz commented 3 months ago

Hi @joakimkjellsson,

did you try setting computer.launcher to mpirun? You can do that in your runscript. That will swap out your srun <OPTIONS> to use mpirun instead.
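
In the runscript that would look something like this (just a sketch, the exact nesting may differ depending on how your runscript is organised):

computer:
    launcher: mpirun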

I'd need to look more deeply into how to set the actual options. That would need a code change.

joakimkjellsson commented 3 months ago

Hi @pgierz, sorry, I forgot to mention this. So if I do that (launcher: mpirun and launcher_flags: "") my launch command becomes:

time mpirun  $(cat hostfile_srun) 2>&1 &

so it would use mpirun but pass the executables in the format expected by srun. At the moment, hostfile_srun is:

0-287  ./oifs -e ECE3
288-719  ./oceanx
720-739  ./xios.x
740-740  ./rnfma

but I would need it to be

-np 288 ./oifs -e ECE3 : -np 432 ./oceanx : -np 20 ./xios.x : -np 1 ./rnfma

The function write_one_hostfile in mpirun.py seems to do that, but it never gets called. Almost as if someone started working on this but never finished ;-) I would like to have two functions, write_one_hostfile_srun and write_one_hostfile_mpirun, and have some kind of if statement in slurm.py to choose which one to use.
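
Concretely, building on the if statement from my first post, something like this (just a sketch; I'm assuming the launcher setting is reachable as config["computer"]["launcher"] at that point in slurm.py, which would need checking):

# sketch: pick the hostfile writer based on the configured launcher
# (assumes the setting ends up under config["computer"]["launcher"])
launcher = config["computer"].get("launcher", "srun")
if launcher == "srun":
    write_one_hostfile_srun(current_hostfile, config)
elif launcher == "mpirun":
    write_one_hostfile_mpirun(current_hostfile, config)
else:
    print(" ESM-Tools does not recognise the launcher ", launcher)
    print(" The launchers supported are srun and mpirun")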

/J

pgierz commented 3 months ago

@joakimkjellsson What branch are you on? I'll start from that one, should be quick enough to program.

joakimkjellsson commented 3 months ago

@pgierz no worries. I've already coded it in. My main question was whether someone had already done it or was planning to do it, in which case I would not do it :-)

I renamed the old write_one_hostfile to write_one_hostfile_srun and made a new write_one_hostfile:

def write_one_hostfile(self, hostfile, config):
    """
    Gathers previously prepared requirements
    (batch_system.calculate_requirements) and writes them to ``hostfile``
    in a format suitable for the mpirun launcher.
    """

    # make an empty string which we will append the per-model options to
    mpirun_options = ""

    for model in config["general"]["valid_model_names"]:
        end_proc = config[model].get("end_proc", None)
        start_proc = config[model].get("start_proc", None)
        print(" model ", model)
        print(" start_proc ", start_proc)
        print(" end_proc ", end_proc)

        # a model component like oasis3mct does not need cores
        # since it is technically a library,
        # so start_proc and end_proc will be None. Skip it.
        if start_proc is None or end_proc is None:
            continue

        # number of cores needed by this model
        no_cpus = end_proc - start_proc + 1
        print(" no_cpus ", no_cpus)

        if "execution_command" in config[model]:
            command = "./" + config[model]["execution_command"]
        elif "executable" in config[model]:
            command = "./" + config[model]["executable"]
        else:
            continue

        # append this model to the mpirun command in MPMD syntax
        mpirun_options += " -np %d %s :" % (no_cpus, command)

    mpirun_options = mpirun_options[:-1]  # remove trailing ":"

    with open(hostfile, "w") as hostfile_handle:
        hostfile_handle.write(mpirun_options)
Already made a few test runs and it seems to work. I'll do some more tests. Then it will end up in the feature/blogin-rockylinux9 branch, where I'm trying to get FOCI-OpenIFS running on glogin.

/J

mandresm commented 3 months ago

Perfect, thanks for figuring that out. Let us know when you are ready to merge, and we can see if the write_one_hostfile function can be generalized further.

joakimkjellsson commented 3 months ago

I made the change to slurm.py: https://github.com/esm-tools/esm_tools/commit/058fcf9892048e6833efc66a92aa9bc8d74d1f70#diff-0c204676837e94ca027f7a61a71d27914ea3a6b8071d5d3dc4c7791dfa5eb15b

When Sebastian is back we might do some cleaning etc. and then merge this fix branch into geomar_dev. That can then be merged into release.

Cheers! /J