C2SM / spack-c2sm

Repository for c2sm spack config and repo files
https://c2sm.github.io/spack-c2sm/latest
MIT License

How to move on with upgrade to v0.21.1 #960

Closed jonasjucker closed 4 months ago

jonasjucker commented 4 months ago

According to @bcumming, creating a slightly different V6 software stack with stackinator is not possible, because an outdated version of stackinator was used to create V6. Also, the requirements for coming up with a new production environment for MCH are too high to meet C2SM's need to upgrade spack to v0.21.1 in a reasonable period of time.

Therefore there is no way around splitting the spack-c2sm instance into two branches in order to allow more development towards ALPS and the upgraded Euler. The following PRs are currently blocked:

The split of spack-c2sm would look as follows:

In order to still test ICON builds on Balfrin, the following commit could be added to v0.21.1: https://github.com/C2SM/spack-c2sm/pull/909/commits/1d7a27a01464cc7e27997d74fa747acd7a32854f

This commit removes the conflict in the xpmem package that was not there in spack v0.19.1. With it, ICON builds fine with V6 and v0.21.1 on Balfrin. Since xpmem is an external package, this change only affects the concretization; the underlying software remains identical.

Initially @jonasjucker proposed this as a solution that would prevent a divergence of spack-c2sm, but @dominichofer raised concerns about maintainability, since none of us knows the xpmem package.

I propose that the branch with v0.21.1 becomes the main branch, whereas v0.20.1 lives on in a separate branch. All changes relevant for production at MCH should go into both branches, whereas infrastructure changes should only go to the main branch.

It is important to come up, already now, with a timeline for bringing these two branches back together.

bcumming commented 4 months ago

To test building the MCH software stack using Spack v0.21, I would recommend using a new version of the MCH stack that has been built using Spack v0.21.

The v6 stack is built using v0.19, and a custom version of stackinator that had v0.19 support back-ported. But we could upgrade to v0.21.1 and use mainline stackinator.

All testing should be done using the uenv tools to mount the image dynamically at /user-environment, instead of using a permanent mount point like /mch-environment/v6. This allows us to iterate more quickly.

dominichofer commented 4 months ago

> By doing so Icon builds fine with V6 and v0.21.1 on Balfrin

Is it this PR? https://github.com/C2SM/spack-c2sm/pull/958

jonasjucker commented 4 months ago

@dominichofer No, it is not. In #958 you still run into the issue that xpmem is concretized with %gcc, although in V6 it is concretized with %nvhpc. And since the conflict is present in v0.21.1, there is no way the concretizer will concretize to %nvhpc.

I put https://github.com/C2SM/spack-c2sm/commit/1d7a27a01464cc7e27997d74fa747acd7a32854f into branch dev_v0.21.1 directly and subsequently ICON worked with V6 and v0.21.1.

See these tests: https://github.com/C2SM/spack-c2sm/pull/909#issuecomment-2122474372. I could simply reintroduce it and we would be fine.

Is it clearer now?

jonasjucker commented 4 months ago

Actually, https://github.com/C2SM/spack-c2sm/commit/1d7a27a01464cc7e27997d74fa747acd7a32854f simply deactivates the conflict that was not present in v0.19.x:

# All compilers except for gcc are in conflict with +kernel-module:
requires("%gcc", when="+kernel-module", msg="Linux kernel module must be compiled with gcc")

to

# All compilers except for gcc are in conflict with +kernel-module:
#requires("%gcc", when="+kernel-module", msg="Linux kernel module must be compiled with gcc")

I have to do this by copying the package into our local repos to override the defaults.
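To make the effect of that one line easier to see, here is a minimal toy model in plain Python. It does not import or call Spack; `Requirement`, `ToySpec` and `check` are hypothetical names invented for this illustration, and it assumes the external xpmem recorded in V6 has the kernel-module variant enabled. It only mimics the rule "with +kernel-module, the compiler must be gcc" and shows why the V6 spec is rejected while the rule is active and accepted once the local copy drops it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Requirement:
    """Toy stand-in for one `requires(...)` line in a package recipe."""
    compiler: str  # compiler the recipe insists on, e.g. "gcc"
    when: str      # variant that triggers the requirement
    msg: str       # message reported when the requirement is violated


@dataclass(frozen=True)
class ToySpec:
    """Toy stand-in for a spec: one compiler plus the enabled variants."""
    compiler: str
    variants: frozenset


def check(spec: ToySpec, requirements: List[Requirement]) -> Optional[str]:
    """Return the message of the first violated requirement, or None if allowed."""
    for req in requirements:
        if req.when in spec.variants and spec.compiler != req.compiler:
            return req.msg
    return None


# Assumption: xpmem in V6 was recorded with nvhpc and the kernel-module variant on.
v6_xpmem = ToySpec(compiler="nvhpc", variants=frozenset({"kernel-module"}))

# Recipe as shipped with v0.21.1: the requirement is active.
upstream = [
    Requirement("gcc", "kernel-module",
                "Linux kernel module must be compiled with gcc")
]

# Local copy of the recipe with the line commented out: no requirement left.
local_override: List[Requirement] = []

print(check(v6_xpmem, upstream))        # violated: prints the message
print(check(v6_xpmem, local_override))  # allowed: prints None
```

In this toy model the only thing that changes between the two calls is the list of requirements, which mirrors the point made above: the installed software is untouched, only what the concretizer is allowed to accept differs.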

jonasjucker commented 4 months ago

@dominichofer came up with a workaround for the concretizer in #961.

spack-c2sm main can now be updated to spack v0.21.1, since ICON can still be built with V6.

jonasjucker commented 4 months ago

Solved by #909.