Closed AlexanderRichert-NOAA closed 9 hours ago
I am wondering if it makes sense to wait until we've relaxed all versions that we don't really care about?
That's fine with me... Would it make sense to go through it in its current state and see what we actually need?
Yes, absolutely - if you don't mind, please go ahead! Thanks very much.
@AlexanderRichert-NOAA You may want to hold back until we've repaired the Ubuntu CI runner
@AlexanderRichert-NOAA ubuntu ci is good again. macos ci may be shaky now, but ignore it please
@srherbener @RatkoVasic-NOAA @mathomp4 FYI
I'm willing to try it out...if I can learn how. 😄 I'm currently at the "I know how to load and build with spack-stack" stage. I guess I need to advance to the "test a PR" stage...
@climbfuji this seems to be in pretty good shape. Ideally it would be good to run at least a concretization on each platform, just to make sure it doesn't fail and to check for duplicates.
Thanks @AlexanderRichert-NOAA. I'll create a test environment for NEPTUNE on one of the NRL machines to run some tests; I encourage others (@RatkoVasic-NOAA @srherbener @mathomp4) to do the same with their applications.
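The duplicate check mentioned above can be sketched roughly as follows. This is a simplified, hypothetical stand-in for util/show_duplicate_packages.py (not its actual implementation), assuming concretize log lines of the form "hash name@version%compiler ..." as shown later in this thread:

```python
import re
from collections import defaultdict

def find_duplicates(log_lines):
    """Return packages that appear with more than one version in a concretize log."""
    versions = defaultdict(set)
    # Spec lines look like: "z35pjfx crtm@2.4.0.1%apple-clang@15.0.0 ..."
    spec_re = re.compile(r"^\s*[a-z0-9]{7}\s+([A-Za-z0-9_-]+)@([^%\s]+)")
    for line in log_lines:
        m = spec_re.match(line)
        if m:
            versions[m.group(1)].add(m.group(2))
    # A package is a "duplicate" if it was concretized at more than one version.
    return {name: sorted(vers) for name, vers in versions.items() if len(vers) > 1}

log = [
    "z35pjfx crtm@2.4.0.1%apple-clang@15.0.0 +fix~ipo ...",
    "p563nq7 crtm@v2.4.1-jedi%apple-clang@15.0.0 +fix~ipo ...",
    "bzhk57c esmf@8.6.1%apple-clang@15.0.0 ~debug ...",
]
print(find_duplicates(log))  # crtm shows up twice with different versions
```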
FYI I'm having issues on Nautilus, which uses mkl as the blas, fftw-api, and lapack provider. Something insists on netlib-lapack and doesn't want to use mkl (that's not the case for develop, so it must be a new/different version of one of the packages).
I resolved those issues for the unified environment on Nautilus with Intel (classic). No clean solution that works for all compilers. Now running into issues with the neptune standalone environment :-(
I tried building on my Mac - I have Sonoma 14.4.1 with apple-clang%15.0.0 and I'm running into the musl build issue again. Concretization completed successfully, and I see crtm, esmf, and fms show up in the duplicates list:
MacBook-Pro-5:unified-env.mymacos steveherbener$ ../../util/show_duplicate_packages.py -d log.concretize
z35pjfx crtm@2.4.0.1%apple-clang@15.0.0 ldflags=-Wl,-ld_classic +fix~ipo build_system=cmake build_type=Release generator=make arch=darwin-sonoma-m1
p563nq7 crtm@v2.4.1-jedi%apple-clang@15.0.0 ldflags=-Wl,-ld_classic +fix~ipo build_system=cmake build_type=Release generator=make arch=darwin-sonoma-m1
frvnpsl crtm@v2.4-jedi.2%apple-clang@15.0.0 ldflags=-Wl,-ld_classic +fix~ipo build_system=cmake build_type=Release generator=make arch=darwin-sonoma-m1
bzhk57c esmf@8.6.1%apple-clang@15.0.0 ldflags=-Wl,-ld_classic ~debug~external-lapack+external-parallelio+mpi+netcdf~pnetcdf+shared~xerces build_system=makefile esmf_comm=auto esmf_os=auto esmf_pio=auto patches=f63d405 snapshot=none arch=darwin-sonoma-m1
omhbmud esmf@8.7.0b04%apple-clang@15.0.0 ldflags=-Wl,-ld_classic ~debug~external-lapack+external-parallelio+mpi+netcdf~pnetcdf+shared~xerces build_system=makefile esmf_comm=auto esmf_os=auto esmf_pio=auto patches=f63d405 snapshot=b04 arch=darwin-sonoma-m1
gvbdrwc fms@release-jcsda%apple-clang@15.0.0 ldflags=-Wl,-ld_classic +gfs_phys+internal_file_nml~ipo~large_file+openmp+quad_precision build_system=cmake build_type=Release generator=make precision=32 arch=darwin-sonoma-m1
h5vv6ws fms@2023.04%apple-clang@15.0.0 ldflags=-Wl,-ld_classic +deprecated_io+gfs_phys+internal_file_nml~ipo~large_file+openmp+pic+quad_precision~yaml build_system=cmake build_type=Release constants=GFS generator=make precision=32,64 arch=darwin-sonoma-m1
===
Duplicates found!
Are these the expected duplicates?
Also I can try testing on S4 for JCSDA. @climbfuji what is the method for doing this? Do I become the jedipara user and then build in a directory off to the side, test the installation out, then remove the build? Thanks!
The musl issue is related to your libiconv not being found/accepted as an external package and as the only provider for iconv if I recall correctly. Not related to this PR. For S4, why jedipara? You can test with your own user as well, or?
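For reference, one way to get Spack to accept the system libiconv as an external (so it is available as an iconv provider and nothing falls back to musl) would be a site packages.yaml entry along these lines. The version and prefix here are assumptions and need to be verified on the machine in question:

```yaml
packages:
  libiconv:
    buildable: false
    externals:
    - spec: libiconv@1.17   # assumed version; check with `iconv --version`
      prefix: /usr          # assumed install prefix
```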
Also, @srherbener please note that there is a follow-up PR #1159 that you should use for testing; it will eventually be merged into this PR.
@climbfuji using packages:all:prefer:['%oneapi'] took care of the oneapi issues, in that gcc-runtime and bison use %gcc but everything else uses %oneapi.
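Written out as a packages.yaml fragment, that preference would look like this (a sketch expanded from the inline setting above):

```yaml
packages:
  all:
    prefer: ['%oneapi']
```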
If the compiler you want to build with is oneapi, I assume?
Yes
Thanks. I'll add this to spack stack create when I finally get to work on the single compiler logic.
@AlexanderRichert-NOAA, @climbfuji I am building spack-stack on S4 now. I had the --source option on the install command (as written in the documentation) and rapidly used up my quota of inodes. I'm trying now without the --source option.
The concretize step worked just fine. Install is running.
jedi-bundle is currently broken so I'm not sure when I can test that and skylab. Perhaps it's best to call it good with the successful concretize, and see if the install works. I don't want to unnecessarily hold up this PR.
What do you think?
We should update the instructions to not use --source by default. I am ok with moving ahead once your install finishes.
I am testing on Discover too. concretize gets a failure:
sherbene@discover12:/discover/nobackup/sherbene/projects/spack-stack/envs/unified-dev.discover-scu16> spack concretize --force --fresh 2>&1 | tee log.concretize
==> Error: concretization failed for the following reasons:
1. cannot satisfy a requirement for package 'flex'.
==> Fetching https://mirror.spack.io/bootstrap/github-actions/v0.5/build_cache/linux-centos7-x86_64-gcc-10.2.1-clingo-bootstrap-spack-bhqgwuvef354fwuxq7heeighavunpber.spec.json
==> Fetching https://mirror.spack.io/bootstrap/github-actions/v0.5/build_cache/linux-centos7-x86_64/gcc-10.2.1/clingo-bootstrap-spack/linux-centos7-x86_64-gcc-10.2.1-clingo-bootstrap-spack-bhqgwuvef354fwuxq7heeighavunpber.spack
==> Installing "clingo-bootstrap@=spack%gcc@=10.2.1~docs+ipo+optimized+python+static_libstdcpp build_system=cmake build_type=Release generator=make patches=bebb819,ec99431 arch=linux-centos7-x86_64" from a buildcache
I think the issue is that common/packages.yaml has:
flex:
# Pin version to avoid duplicates
require: '@2.6.4'
while site/packages.yaml has:
flex:
# Must set buildable: false to avoid duplicate packages
buildable: false
externals:
- spec: flex@2.5.37+lex
prefix: /usr
For now I can make the flex version match in common/packages.yaml in my environment area. But does that mean that we need to change something in this PR?
You should submit a site config update as a follow-up PR, yes. I expect we need a few more of those.
Turns out that /usr/bin/flex on Discover is already version 2.6.4, so I updated the site/packages.yaml file (to version 2.6.4) and concretize successfully finished. The install is running now on Discover (and the one on S4 is still running).
Do we still want to handle this update in a subsequent PR, or change this one? Here's the change I made in site/packages.yaml:
flex:
# Must set buildable: false to avoid duplicate packages
buildable: false
externals:
- spec: flex@2.6.4  # updated version
prefix: /usr
@AlexanderRichert-NOAA Would you mind changing the flex version in the discover site config, as noted by @srherbener, before merging?
I noticed that discover-scu17 already has this:
flex:
# Must set buildable: false to avoid duplicate packages
buildable: false
externals:
- spec: flex@2.6.4+lex
prefix: /usr
Does the scu16 config also need the "+lex" suffix? I had started a feature branch, but I think it would be simpler to make the change in this PR and be done with it. Thanks!
I don't know how it was built on scu16.
Well with my testing we know that it works without the suffix, so we should probably leave it that way. I'll make a PR out of my feature branch and if @AlexanderRichert-NOAA decides to update this one, we can close the one I make. Thanks!
I see flex@2.6.4 in both discover configs. Is there a further change that needs to get made?
No, I merged the update from Steve in the meantime. I also updated your branch from develop and enabled automerge - just waiting for CI to run.
:cowboy_hat_face:
Summary
This PR updates common/packages.yaml to remove unused entries, unpin versions that don't need to be pinned, and apply hard requirements as much as possible.
Status:
99% of the entries now use require:.
Then if someone wants to use a different version in their own environment (or site config), they have to override it (note the ::):
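As a sketch of such an override in a site or environment packages.yaml, using a hypothetical package and version (in Spack config, a key ending in :: replaces the inherited setting instead of merging with it):

```yaml
packages:
  hdf5:
    require:: '@1.14.3'   # hypothetical version; '::' overrides the common config
```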
Testing
Make sure to check for duplicates!
Applications affected
All
Systems affected
All
Dependencies
none
Issue(s) addressed
Fixes #1115
Checklist