Open mboisson opened 5 years ago
I think --use-existing-modules would solve this for you.
That option indeed solves the issue. What other side effects does --use-existing-modules have, though?
None, it just means if you have an existing installation higher up the hierarchy it will prefer it.
It was created for zlib, which you would normally have at GCCcore but you also might want at iccifort level since it is highly tuned there.
Does it still respect minimal toolchains within the installed modules ? Say there is a version with MPI and one without (for whatever reason).
In a hierarchy that's the same difference as GCCcore/iccifort so no, it would choose the MPI version (as long as the target software is using the MPI toolchain or higher). If there is some strange corner case where that's not appropriate, you could always explicitly indicate the required toolchain in the easyconfig.
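For illustration, pinning the toolchain of a dependency in an easyconfig could look roughly like the fragment below. It relies on the four-element dependency form (name, version, versionsuffix, toolchain). The software name `MyApp` and its version are hypothetical; the METIS and toolchain versions are the ones mentioned elsewhere in this thread.

```python
# Fragment of a hypothetical easyconfig (easyconfigs use Python syntax).
name = 'MyApp'      # hypothetical software name
version = '1.0'     # hypothetical version

# Toolchain of the software being installed (values taken from this thread).
toolchain = {'name': 'gompi', 'version': '2018.3.312'}

dependencies = [
    # (name, version, versionsuffix, toolchain): the explicit toolchain tuple
    # asks for METIS at the GCC level instead of leaving the choice
    # to the resolver.
    ('METIS', '5.1.0', '', ('GCC', '7.3.0')),
]
```

The drawback is that the choice is now hard-coded in the easyconfig, so it has to be maintained by hand whenever the toolchain changes.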
OK. We can use --use-existing-modules as a workaround, but I would argue this should be the default behavior, not an exception.
v4.0 is the time to make your case ;)
@boegel, thoughts ?
Without --use-existing-modules, the --minimal-toolchains feature is broken as soon as a recipe that has a lower toolchain appears in the easyconfig repository, even if that recipe is not installed on the current system.
Actually, that doesn't sound right. It doesn't/shouldn't fail; it will just resolve to the GCCcore recipe and then install it (if you use the robot). It's not true that it will fail once another recipe appears; in fact, that situation is needed for --use-existing-modules to do anything useful.
I see in your error that it's looking for Core/metis/5.1.0, so with the system toolchain. That seems odd. What does a dry run look like? Can that resolve?
Correction: it will fail if you don't use --robot (we never use --robot).
I guess my point is that "minimal toolchain" should not depend on existing recipes (i.e. in the repo), but rather on installed recipes if a match is found. Otherwise, if a new recipe is created at the level of a lower toolchain, existing recipes will fail until that new recipe is installed.
Is that what you want locally? Because here we definitely want it to find the minimal-toolchain recipe and build it... (using --robot of course)
The problem with setting --use-existing-modules on by default is that it makes the order of commands matter for the end result. If something depends on zlib, which has easyconfigs for GCCcore and iccifort, I will get a different result with the same command depending on whether or not the iccifort zlib is installed.
For that reason I think the default as is is correct, and that it really is an "expert mode" option.
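The order dependence can be shown with a toy model. This is a sketch of the behavior as described in this thread, not EasyBuild's actual resolution code; the `resolve` function, the two-level hierarchy, and the tie-breaking rule are all assumptions made for illustration.

```python
# Toy model of dependency resolution; this is NOT EasyBuild's actual code.
HIERARCHY = ["GCCcore", "iccifort"]  # toolchain levels, lowest first


def resolve(dep, easyconfigs, installed, use_existing_modules=False):
    """Return the toolchain level at which `dep` is resolved."""
    candidates = [tc for tc in HIERARCHY if (dep, tc) in easyconfigs]
    if not candidates:
        raise LookupError(f"no easyconfig found for {dep}")
    if use_existing_modules:
        # Prefer a level that is already installed. Taking the lowest such
        # level when several are installed is an assumption of this sketch.
        for tc in candidates:
            if (dep, tc) in installed:
                return tc
    # Otherwise fall back to the lowest level that has an easyconfig.
    return candidates[0]


easyconfigs = {("zlib", "GCCcore"), ("zlib", "iccifort")}

# Same request, different outcome depending on what was installed earlier.
print(resolve("zlib", easyconfigs, set(), use_existing_modules=True))
print(resolve("zlib", easyconfigs, {("zlib", "iccifort")}, use_existing_modules=True))
```

With nothing installed, the model picks GCCcore; once the iccifort zlib exists, the same call picks iccifort.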
@akesandgren yes. Here, we don't want to build software that is not needed. If there is already a match installed, it should get used. It should not install a new recipe that is "more minimal" just because somebody created a recipe for it.
@ocaisa, the problem with not having --use-existing-modules is that the result will vary over time, based on what new recipe is added to the git repository. Once somebody adds METIS-5.1.0-GCCcore-7.3.0.eb, there is no going back to using the (in my opinion) correct versions, METIS-5.1.0-GCC-7.3.0.eb and METIS-5.1.0-iccifort-2018.3.eb.
This is precisely what happened to me. A recipe that used to work ended up not working because it suddenly found METIS-5.1.0-GCCcore-7.3.0.eb, which did not exist in the past.
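A toy model of that failure mode, assuming resolution always picks the lowest toolchain level that has a recipe in the repository and that, without --robot, a missing module is an error. The function names and the two-level hierarchy are hypothetical; this is not EasyBuild's actual code.

```python
# Toy model; NOT EasyBuild's actual code.
HIERARCHY = ["GCCcore", "GCC"]  # toolchain levels, lowest first


def minimal_toolchain(dep, repo):
    """Lowest toolchain level for which the repository has a recipe."""
    for tc in HIERARCHY:
        if (dep, tc) in repo:
            return tc
    raise LookupError(f"no recipe found for {dep}")


def check_without_robot(dep, repo, installed):
    """Resolve `dep` and fail if the chosen recipe is not installed."""
    tc = minimal_toolchain(dep, repo)
    if (dep, tc) not in installed:
        raise RuntimeError(f"missing module for {dep} at {tc} level")
    return tc


installed = {("METIS", "GCC")}                    # what is on the system
repo_before = {("METIS", "GCC")}                  # repository, earlier
repo_after = repo_before | {("METIS", "GCCcore")} # after a new recipe lands

print(check_without_robot("METIS", repo_before, installed))  # GCC
try:
    check_without_robot("METIS", repo_after, installed)
except RuntimeError as err:
    print("fails:", err)
```

Nothing changed on the system between the two calls; only the repository contents did.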
@mboisson True. That is also true without the use of --minimal-toolchains: if METIS at GCCcore existed first and afterwards we add METIS at GCC level, then the default resolution mechanism would prefer the GCC version even if it is not installed. The only way to avoid this would be to restrict your robot-paths to the installed easyconfigs (we do this at JSC). You can still add new easyconfigs to your search-paths (introduced in #2255).
I would argue that when --robot is not used, it should by default resolve to what is installed, and not consider what is merely possible?
You could potentially do that, but you would have to assume that what lives in your eb_repo is an accurate reflection of what is installed (or indicate the "golden" repo in some other way, which we do via controlling robot-paths).
True. In our case, eb_repo is intended to represent what is installed, and that is pretty much the case (I'd say with 99% accuracy).
I'm not convinced that we should change the default for --use-existing-modules, regardless of whether --minimal-toolchains is used or not. You could argue for having it either enabled or disabled by default; it depends on your expectations.
Note that even when --use-existing-modules is enabled, the result could vary over time, since installing additional modules higher or lower in the hierarchy could affect which exact modules are used to resolve a dependency.
There simply isn't a good default for this, but we have to pick one. We chose not to enable it by default, so available easyconfig files overrule available modules, and it's easy to flip that around if you want to. I don't think we can do much better...
But don't you think that the current default is broken when you are not using --robot? My expectation is that adding a new recipe to the repository should not break a recipe that once worked.
Another of our staff ran into this issue today. An alternative solution is to say that METIS should not have been merged with GCCcore, and remove that recipe. GCCcore does not benefit from all optimizations (at least in our case; maybe that is not the case for others?), hence it produces an under-performing METIS.
When using --minimal-toolchains, EasyBuild looks for an easyconfig without validating that there is a corresponding installed module. For example, we have a recipe that depends on
EasyBuild finds that the git repository has a recipe
This recipe is however not installed on our system. We have
installed instead. This last recipe is compatible with the recipe that we are trying to install now (which uses gompi,2018.3.312 which has GCC 7.3.0 in it).
With --minimal-toolchains, it barfs on the following:
@bartoldeman