Open bedroge opened 7 months ago
libfabric is essentially irrelevant there (at runtime).
[...] libfabric (since it seems to be a bug there?)
[...] build phase, so bot/build.sh [...]
[...] SitePackage.lua is put in place via bot/build.sh, then changes to it should only get picked up by the PR, and should be isolated to that PR?
Same approach could be used for other problems that are triggered via libfabric, see https://github.com/easybuilders/easybuild-easyconfigs/issues/20233
@TopRichard also found an issue with our CUDA hook when trying to use it on NESSI: it currently forbids loading dependency modules that have GPU support, even for building purposes. Disabling that hook as part of the bot-specific SitePackage.lua seems like a good idea.
With help from @casparvl, I've added the following to /project/def-users/bot/shared/host-injections/2023.06/.lmod/SitePackage.lua on our AWS build cluster, which will be picked up by the bot for builds relying on libfabric:

This solves the Haswell OpenMPI issues that we observed in several PRs. I was going to make a PR for it, but I have some doubts on how this should be done:
[...] libfabric?
[...] SitePackage.lua is picked up / copied to the right location? bot/build.sh, EESSI-install-software.sh, eessi_container.sh, ...?
[...] SitePackage.lua, should it already pick up the new version? If so, we should probably prevent it from being copied to the shared directory before it's merged, otherwise other builds will pick it up as well.
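
One way to keep a modified SitePackage.lua isolated to the PR that changes it might be to stage it in a job-private directory and point Lmod at that directory, rather than copying it into the shared host-injections location. The sketch below only illustrates the idea. The helper name `stage_site_package`, the assumption that SitePackage.lua sits at the root of the PR checkout, and the per-job directory layout are all hypothetical and not taken from the bot or from bot/build.sh. `LMOD_PACKAGE_PATH` is the Lmod variable that names the directory searched for SitePackage.lua; whether the EESSI setup would honour such an override for bot builds is also an assumption.

```shell
#!/usr/bin/env bash
# Sketch: use a job-private copy of SitePackage.lua instead of the shared one.
set -euo pipefail

# Hypothetical helper: copy SitePackage.lua from a PR checkout into a
# job-private directory and print the directory it was copied to.
stage_site_package() {
    local pr_checkout="$1"                       # assumed: root of the PR's working copy
    local job_dir="$2"                           # assumed: scratch dir unique to this build job
    local src="${pr_checkout}/SitePackage.lua"   # assumed location of the file in the PR
    local dest="${job_dir}/.lmod"

    [ -f "${src}" ] || { echo "no SitePackage.lua in ${pr_checkout}" >&2; return 1; }
    mkdir -p "${dest}"
    cp "${src}" "${dest}/SitePackage.lua"
    echo "${dest}"
}

# Demo with throwaway directories standing in for the PR checkout and the job dir.
demo_pr="$(mktemp -d)"
demo_job="$(mktemp -d)"
echo '-- placeholder SitePackage.lua from the PR' > "${demo_pr}/SitePackage.lua"

# Point Lmod at the job-private copy; the shared directory is never written to.
LMOD_PACKAGE_PATH="$(stage_site_package "${demo_pr}" "${demo_job}")"
export LMOD_PACKAGE_PATH
echo "Lmod would read: ${LMOD_PACKAGE_PATH}/SitePackage.lua"
```

With this layout, other builds that read the shared directory would keep seeing the merged version, and copying to the shared location could be left to a step that runs only after the PR is merged.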