easybuilders / easybuild-easyconfigs

A collection of easyconfig files that describe which software to build using which build options with EasyBuild.
https://easybuild.io
GNU General Public License v2.0

HPL building, issues between foss/2021a and gomkl/2021a #13426

Open jhein32 opened 3 years ago

jhein32 commented 3 years ago

Hi,

I have installed foss/2021a and intel/2021a, each with the matching HPL (which I use for basic testing). I would now like to move to gomkl/2021a. When trying to build gomkl using the --robot option on the HPL-2.3-gomkl-2021a.eb easyconfig, I get the following for a dry run:

...
 * [x] /sw/easybuild/software/EasyBuild/4.4.1/easybuild/easyconfigs/g/gompi/gompi-2021a.eb (module: Core | gompi/2021a)
 * [ ] /sw/easybuild/software/EasyBuild/4.4.1/easybuild/easyconfigs/i/imkl/imkl-2021.2.0-gompi-2021a.eb (module: MPI/GCC/10.3.0/OpenMPI/4.1.1 | imkl/2021.2.0)
 * [ ] /sw/easybuild/software/EasyBuild/4.4.1/easybuild/easyconfigs/g/gomkl/gomkl-2021a.eb (module: Core | gomkl/2021a)
 * [x] /sw/easybuild/software/EasyBuild/4.4.1/easybuild/easyconfigs/h/HPL/HPL-2.3-gomkl-2021a.eb (module: MPI/GCC/10.3.0/OpenMPI/4.1.1 | HPL/2.3)
== Temporary log file(s) /tmp/eb-DzXosw/easybuild-n4qoAu.log* have been removed.
== Temporary directory /tmp/eb-DzXosw has been removed.

I have used the command: eb HPL-2.3-gomkl-2021a.eb --robot --use-existing-modules --dry-run

EasyBuild seems to confuse the existing HPL for foss/2021a with the HPL for gomkl/2021a and claims it is already there. We are using hierarchical modules (compiler, MPI), and both foss and gomkl use the same versions of GCC and OpenMPI. So they sit in the same spot in the module tree, and they cannot be told apart by module name. The only difference is internal: whether ScaLAPACK or MKL is called for the BLAS needs.
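To illustrate the collision described above, here is a minimal sketch of how a compiler/MPI hierarchy could derive a module location. This is not EasyBuild's actual HierarchicalMNS code: the `Toolchain` type and the `hierarchical_module` function are hypothetical names invented for this example, and the `math` labels are only illustrative. The GCC, OpenMPI, and imkl versions are the ones shown in the dry-run output.

```python
from typing import NamedTuple, Tuple


class Toolchain(NamedTuple):
    """Hypothetical, simplified description of a toolchain."""
    name: str
    compiler: str  # e.g. "GCC/10.3.0"
    mpi: str       # e.g. "OpenMPI/4.1.1"
    math: str      # math library label; NOT used for the module location


def hierarchical_module(tc: Toolchain, software: str, version: str) -> Tuple[str, str]:
    """Return (subdirectory, module name) for MPI-level software.

    Only the compiler and MPI parts of the toolchain enter the path,
    which mirrors the "MPI/GCC/10.3.0/OpenMPI/4.1.1 | HPL/2.3" lines
    in the dry-run output.
    """
    subdir = f"MPI/{tc.compiler}/{tc.mpi}"
    return subdir, f"{software}/{version}"


foss = Toolchain("foss/2021a", "GCC/10.3.0", "OpenMPI/4.1.1", "ScaLAPACK")
gomkl = Toolchain("gomkl/2021a", "GCC/10.3.0", "OpenMPI/4.1.1", "imkl/2021.2.0")

foss_hpl = hierarchical_module(foss, "HPL", "2.3")
gomkl_hpl = hierarchical_module(gomkl, "HPL", "2.3")

print(foss_hpl)
print(gomkl_hpl)
print("collision:", foss_hpl == gomkl_hpl)
```

Because the math library never enters the path, both builds map to the same subdirectory and module name, so an existing foss build makes the gomkl build look installed.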

While for HPL this is not a real issue, it will be one if "real" packages are offered in both foss and gomkl.

I am using EB 4.4.1, but built HPL for foss/2021a and intel/2021a with EB 4.4.0.

Micket commented 3 years ago

This seems unfixable, an inherent flaw in HMNS. The only conceivable fix would be to alter HMNS extensively, probably requiring a complete rebuild of modules.

Edit: There is nothing we can do in the easyconfigs about this, it's really a framework issue.

jhein32 commented 3 years ago

> This seems unfixable, an inherent flaw in HMNS. The only conceivable fix would be to alter HMNS extensively, probably requiring a complete rebuild of modules.
>
> Edit: There is nothing we can do in the easyconfigs about this, it's really a framework issue.

Are you suggesting we raise a ticket in framework? I would suggest we wait to hear what management (e.g. @boegel) thinks and take it up after the summer. I doubt anything other than HPL is affected right now. Having a ticket here makes sure the issue is captured.

Micket commented 3 years ago

We only have HPL for gomkl currently. Anyone hoping to add more stuff under gomkl and foss is going to have the same issue for all software. It's yet another bug with HMNS, and a pretty bad one at that. (Edit: but yes, it should rather go into framework, since it needs to be fixed in HMNS.)

jhein32 commented 3 years ago

I agree it is a pretty bad bug, but it will only become an issue if we offer the same piece of software (incl. version) in both foss/2021a and gomkl/2021a. If the software is offered only in foss or only in gomkl, it will work. I think the chances of such an overlap are slim right now.

We sometimes have the same software (incl. version) in foss and intel, but that will work as well.

So it needs resolving, but it is not a burning issue in my view.

Micket commented 3 years ago

If you actually want to use both toolchains, the likelihood of a collision is pretty much 100%. You can't have a single dependency that collides, so nothing that indirectly depends on numpy, R, or anything else under foss. So it effectively means you have to pick a side if using HMNS: only foss, or only gomkl (or any other BLAS variant, though those toolchains are even less popular than gomkl).

> We sometimes have the same software (incl. version) in foss and intel, but that will work as well.

I don't understand what you mean here. Yes, this bug in HMNS only occurs when only the math libraries differ. intel + foss works because they differ in compiler and MPI.
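As a rough sketch of the point above, the check below compares toolchains only on the parts that a compiler/MPI hierarchy uses for the module path. The `TOOLCHAINS` table, `hierarchy_key`, and `colliding_pairs` are hypothetical names for this example, not part of EasyBuild. The GCC, OpenMPI, and imkl versions come from the dry-run output earlier in the thread; the intel/2021a compiler and MPI entries and the foss math label are assumptions added only for illustration.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Simplified toolchain compositions (intel entries are assumed, see lead-in).
TOOLCHAINS: Dict[str, Dict[str, str]] = {
    "foss/2021a": {
        "compiler": "GCC/10.3.0",
        "mpi": "OpenMPI/4.1.1",
        "math": "open-source BLAS/ScaLAPACK",
    },
    "gomkl/2021a": {
        "compiler": "GCC/10.3.0",
        "mpi": "OpenMPI/4.1.1",
        "math": "imkl/2021.2.0",
    },
    "intel/2021a": {
        "compiler": "intel-compilers/2021.2.0",
        "mpi": "impi/2021.2.0",
        "math": "imkl/2021.2.0",
    },
}


def hierarchy_key(tc: Dict[str, str]) -> Tuple[str, str]:
    """Only compiler and MPI determine the spot in the module tree."""
    return tc["compiler"], tc["mpi"]


def colliding_pairs(toolchains: Dict[str, Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return toolchain pairs that would share the same module subtree."""
    return [
        (a, b)
        for a, b in combinations(sorted(toolchains), 2)
        if hierarchy_key(toolchains[a]) == hierarchy_key(toolchains[b])
    ]


pairs = colliding_pairs(TOOLCHAINS)
print(pairs)
```

Under these assumptions only foss and gomkl end up in the same subtree; any pair that differs in compiler or MPI is kept apart by the hierarchy itself.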

jhein32 commented 3 years ago

> If you actually want to use both toolchains, the likelihood of a collision is pretty much 100%. You can't have a single dependency that collides, so nothing that indirectly depends on numpy, R, or anything else under foss. So it effectively means you have to pick a side if using HMNS: only foss, or only gomkl (or any other BLAS variant, though those toolchains are even less popular than gomkl).

I think I am starting to see what you mean. Even when the final software is only to be rolled out in, e.g., gomkl, it might depend on a BLAS-dependent library that we also need in foss for builds within foss.

> > We sometimes have the same software (incl. version) in foss and intel, but that will work as well.
>
> I don't understand what you mean here. Yes, this bug in HMNS only occurs when only the math libraries differ. intel + foss works because they differ in compiler and MPI.

I just meant that the issues we are discussing for foss vs gomkl do not exist for foss vs intel. So you understood it perfectly well.