Closed (WenMeng-NOAA closed this issue 2 months ago)
We are limiting the number of installations/modifications for existing spack-stack releases. Also, as far as I know the UFS applications are going to skip spack-stack-1.7.0 and go straight to 1.8.0 (from 1.6.0). Thus, we will include this update in the spack-stack-1.8.0 release, but not in the already installed 1.7.0 release.
@climbfuji How will this impact the timing of the installs of the g2 and g2tmpl libraries on RDHPCS machines?
@Hang-Lei-NOAA have you contacted anyone from the SPA team at NCO yet regarding the WCOSS2 installation?
My concern is giving @WenMeng-NOAA enough time to update and properly test UPP prior to the GEFS code freeze.
If time for testing is a concern, then it's possible to request a test install on a single RDHPCS platform before the spack-stack-1.8.0 release. It's simply not feasible to amend existing spack-stack installations on all systems multiple times per week, and right now we get hammered with such requests from NOAA.
@AndrewBenjamin-NOAA Following our procedure, I will install them and let modelers test them on the WCOSS2 Acorn system, and then deliver them, because the NCO rule is that installations on WCOSS2 cannot be changed. Since Acorn has been down for days, I am working out a solution on Dogwood for individual testing with the UPP developers.
@climbfuji if the schedule to install on all R&D platforms is still Q3 2024, that should work for the UPP group. If that were to get pushed back, then we would probably need to do a test install.
@Hang-Lei-NOAA Thanks for the explanation. Let us know when the test area is staged on Dogwood.
@AndrewBenjamin-NOAA The plan is to release spack-stack-1.8.0 end of August/beginning of September and then roll it out. That would mean the new packages will be on all systems in the first 1-2 weeks of September. Thus, if you are referring to calendar years and not fiscal years, that would fit. Nonetheless, I would encourage a test install earlier on one platform so that we know that things work - last thing we want is to redo entire spack-stack installs.
@climbfuji That makes sense and I agree. I think the best course is to go ahead with the test install on Hera for UPP testing. Is that something you can set up or will the UPP group need to stage the testing area?
We have a spack-stack meeting today; I will get back to you after that. Thanks!
@WenMeng-NOAA are you sure you want this under 1.7.0 as opposed to 1.6.0?
@climbfuji A test installation on Hera alone is not sufficient for the UPP updates. It would break UPP support on the other R&D platforms.
@AlexanderRichert-NOAA, would you mind clarifying here?
We were under the impression that modification of an existing release is not possible.
Given UPP's need to support multiple R&D platforms, would the most likely solution be to have testing areas set up on all platforms UPP supports prior to 1.8.0's release?
@AndrewBenjamin-NOAA sure-- We can create "add-on" environments in each system on top of the unified environment (which is the piece that we don't want to go back and directly modify). So in this case, we could create another chained environment under whatever release is desired which would use g2@3.5.1 and g2tmpl@1.13.0 and rebuild their dependents, with the rest of the packages coming from that release's unified environment. So I'm assuming you'll want 1.6.0 since that's what UPP currently uses (though I don't know how big of a leap it would be to go to 1.7.0 in terms of how many UPP dependencies have changed versions).
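The chaining described above can be pictured as a lookup with a fallback. What follows is a toy Python model, not how Spack resolves packages internally: the add-on environment overrides g2 and g2tmpl, and every other package falls through to the release's unified environment. The unified-environment versions and the netcdf-c entry are placeholders chosen only for illustration, and the sketch ignores the rebuilding of dependents.

```python
from collections import ChainMap

# Placeholder contents of the release's unified environment (illustrative only).
unified_env = {"g2": "3.4.5", "g2tmpl": "1.10.2", "netcdf-c": "4.9.2"}

# The add-on environment defines only the packages it overrides.
addon_env = {"g2": "3.5.1", "g2tmpl": "1.13.0"}

# Lookups hit the add-on environment first, then fall back to the unified one.
chained = ChainMap(addon_env, unified_env)


def resolve(package: str) -> str:
    """Return the version a user of the chained environment would see."""
    return chained[package]


for name in ("g2", "g2tmpl", "netcdf-c"):
    print(name, resolve(name))
```

The unified environment is never modified in this model, which mirrors the constraint that existing release installations should not be changed directly.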
@AlexanderRichert-NOAA The installations in the "add-on" environment under 1.6.0 should work for the UPP standalone (offline post). Eventually, when ufs-weather-model is updated to 1.8.0, I will update the UPP submodule for inline post.
Once these new NCEPLIBS releases are installed, the UPP crew can test against them. If they find a problem, we repeat this whole process.
Imagine a world in which UPP runs unit tests on GitHub and confirms that new releases of NCEPLIBS work. In that case, all this work would not be needed. Tests would have proceeded within minutes of the NCEPLIBS releases, without involving Hang, Alex, or Andrew. Tests would have run on a computer in Bill Gates's closet, instead of NOAA machines.
Only after everything had been thoroughly tested would we ask for an update on NOAA machines. We would be much less likely to need to fix something and install again. This would be a significant savings for NOAA, the NCEPLIBS team, and the UPP team. Bugs that currently take more than a week to find could be found within minutes.
help([[]])
conflict("g2tmpl")
setenv("g2tmpl_ROOT","/lfs/h2/emc/eib/save/Hang.Lei/forgdit/nco_wcoss2/install/intel-19.1.3.304/g2tmpl/1.13.0")
setenv("g2tmpl_VERSION","1.13.0")
setenv("G2TMPL_INC","/lfs/h2/emc/eib/save/Hang.Lei/forgdit/nco_wcoss2/install/intel-19.1.3.304/g2tmpl/1.13.0/include")
setenv("G2TMPL_LIB","/lfs/h2/emc/eib/save/Hang.Lei/forgdit/nco_wcoss2/install/intel-19.1.3.304/g2tmpl/1.13.0/lib/libg2tmpl.a")
whatis("Name: g2tmpl")
whatis("Version: 1.13.0")
whatis("Category: library")
whatis("Description: g2tmpl library")
help([[]])
conflict("g2")
setenv("g2_ROOT","/lfs/h2/emc/eib/save/Hang.Lei/forgdit/nco_wcoss2/install/intel-19.1.3.304/g2/3.5.1")
setenv("g2_VERSION","3.5.1")
setenv("G2_INC4","/lfs/h2/emc/eib/save/Hang.Lei/forgdit/nco_wcoss2/install/intel-19.1.3.304/g2/3.5.1/include_4")
setenv("G2_INCd","/lfs/h2/emc/eib/save/Hang.Lei/forgdit/nco_wcoss2/install/intel-19.1.3.304/g2/3.5.1/include_d")
setenv("G2_LIB4","/lfs/h2/emc/eib/save/Hang.Lei/forgdit/nco_wcoss2/install/intel-19.1.3.304/g2/3.5.1/lib64/libg2_4.a")
setenv("G2_LIBd","/lfs/h2/emc/eib/save/Hang.Lei/forgdit/nco_wcoss2/install/intel-19.1.3.304/g2/3.5.1/lib64/libg2_d.a")
whatis("Name: g2")
whatis("Version: 3.5.1")
whatis("Category: library")
whatis("Description: g2 library")
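Before pointing a UPP build at test modulefiles like these, it may help to confirm that every path they set agrees with the declared version. The following is a minimal Python sketch under stated assumptions: the helper names parse_setenv and check_version are made up for illustration, the sample path is hypothetical, and the regular expression only handles the simple setenv("NAME","VALUE") form shown in this thread, not general Lua.

```python
import re

# Matches Lua calls of the form setenv("NAME","VALUE"), allowing optional spaces.
SETENV_RE = re.compile(r'setenv\(\s*"([^"]+)"\s*,\s*"([^"]*)"\s*\)')


def parse_setenv(modulefile_text: str) -> dict[str, str]:
    """Collect the environment variables a modulefile would set."""
    return dict(SETENV_RE.findall(modulefile_text))


def check_version(env: dict[str, str], name: str) -> bool:
    """True if every path variable contains /<name>/<version>."""
    version = env[f"{name}_VERSION"]
    marker = f"/{name}/{version}"
    paths = [v for k, v in env.items() if k != f"{name}_VERSION"]
    return all(marker in p for p in paths)


# Hypothetical modulefile text; the install prefix is a placeholder.
sample = '''
setenv("g2tmpl_ROOT","/hypothetical/install/g2tmpl/1.13.0")
setenv("g2tmpl_VERSION","1.13.0")
setenv("G2TMPL_LIB","/hypothetical/install/g2tmpl/1.13.0/lib/libg2tmpl.a")
'''

env = parse_setenv(sample)
print(check_version(env, "g2tmpl"))
```

The check only compares strings; it does not verify that the files exist on disk.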
@Hang-Lei-NOAA and @WenMeng-NOAA Didn't we agree that WCOSS2 specific communications, as long as WCOSS2 is not using spack-stack, will happen in its own repository? It's making it harder for us to track what needs to be done for spack-stack on the other systems. Apologies if I misunderstood previous conversations.
@climbfuji Sure. We will communicate with @Hang-Lei-NOAA for WCOSS2 testing for offline.
Sorry Dom, I will pay more attention to it.
No problem! Just wanted to make sure whether I remembered correctly (sometimes, no, often, my memory is weak ...)
Bringing this back to our attention: What is the status of installing "add-on" environments on the R&D machines for Wen to test UPP?
I've updated the upp-addon-env's under spack-stack-1.6.0 (per above discussion) on Hera, Jet, Gaea C5, Orion, and Hercules to include g2 3.4.5 and 3.5.1, and g2tmpl 1.12.0 and 1.13.0.
@AlexanderRichert-NOAA I conducted the UPP test on Hera and confirmed the expected changes with g2/3.5.1 and g2tmpl/1.13.0. Could you also install g2 3.4.5 and 3.5.1, and g2tmpl 1.12.0 and 1.13.0 under the upp-addon-env's of spack-stack-1.6.0 on s4 and noaacloud? Thanks!
Those are both JCSDA platforms (as far as spack-stack maintenance goes), and I don't have access to either. @srherbener @RatkoVasic-NOAA @natalie-perlin would you be able to assist?
We don't have access to S4. @natalie-perlin might do that on cloud when she's back from conference.
@AlexanderRichert-NOAA, I can help with S4, but I'm not sure I fully understand what needs to be done. Would it work for me to simply replicate what was done on Orion on S4? Would I be looking for spack-stack-1.6.0 as the upstream environment, and "upp-addon-env" as the chained environment? It looks like only Intel compiler is supported on S4, so will that be sufficient to do only Intel? Thanks!
Thanks @srherbener. That's correct, only Intel, and yes, we're updating the existing upp-addon-env under spack-stack-1.6.0, which chains to the unified env. I made the installation probably a bit overly elaborate, but here's the idea:
repos: [$env/envrepo]
setting.
Thanks @AlexanderRichert-NOAA! I'll let you know if/when questions come up.
@AlexanderRichert-NOAA I have updated S4 and I can see the new g2 and g2tmpl versions. I made it all the way through to the end of updating the chained environment (upp-addon-env) including the lmod refresh and the building of the setup-meta-modules steps.
I think S4 is done, but it would be great if someone who knows what the new environment should look like could test my updates. Thanks!
@RatkoVasic-NOAA can you make sure that g2tmpl-1.13.0 and g2-3.5.1 are available on Derecho under spack-stack 1.6.0? The new g2tmpl-1.13.0 version will be available on WCOSS2 sometime this week. We may update UPP directly with g2tmpl-1.13.0 for https://github.com/ufs-community/ufs-weather-model/pull/2326.
@srherbener I am testing PR ufs-community/ufs-weather-model#2326 on S4. It uses the upp-addon-env that you installed. Compiling ufs-weather-model fails with the following error message:
Could NOT find PIO (missing: C Fortran) (Required is at least version "2.5.3")
You can see error details in the pull request.
@srherbener The issue on S4 is solved.
Yes, I fixed it
was about to test but you were faster
The issue on S4 was that the spack module lmod refresh command was run without --upstream-modules, which is required for chained environments.
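One way to avoid repeating this mistake is to build the refresh command in a small wrapper that adds the flag whenever the environment is chained. The sketch below is only an illustration: the function names are hypothetical, it assumes the spack environment is already activated in the calling shell, and it defaults to a dry run so that nothing is executed unless requested.

```python
import subprocess


def lmod_refresh_command(chained: bool) -> list[str]:
    """Build the `spack module lmod refresh` command line.

    --upstream-modules is appended for chained environments, as noted above.
    """
    cmd = ["spack", "module", "lmod", "refresh"]
    if chained:
        cmd.append("--upstream-modules")
    return cmd


def run_refresh(chained: bool, dry_run: bool = True) -> list[str]:
    """Return the command, and execute it only when dry_run is False."""
    cmd = lmod_refresh_command(chained)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


print(" ".join(run_refresh(chained=True)))
```

For an add-on environment such as upp-addon-env, the call would use chained=True; a standalone unified environment would use chained=False.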
I am going to close this issue as completed.
@RatkoVasic-NOAA @AlexanderRichert-NOAA I noticed that configs/common/packages.yaml still lists the old g2/g2tmpl versions. I thought 1.8.0 should use 3.5.1 and 1.13.0.
@climbfuji Good catch, I'll open a PR.
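A stale pin like this could be caught with a quick scan of the file before a release. The following Python sketch makes several assumptions that should be checked against the real file: the sample text is hypothetical and is not the actual contents of configs/common/packages.yaml, versions are assumed to be pinned with an '@x.y.z' spec string, and the scan is line-based rather than a real YAML parse.

```python
import re

# Versions this thread expects for spack-stack-1.8.0.
EXPECTED = {"g2": "3.5.1", "g2tmpl": "1.13.0"}


def pinned_versions(yaml_text: str, names) -> dict[str, str]:
    """Map each package in `names` to the first '@version' found under it."""
    pins = {}
    current, current_indent = None, -1
    for line in yaml_text.splitlines():
        key = re.match(r"^(\s*)([A-Za-z0-9_-]+):", line)
        if key:
            indent, name = len(key.group(1)), key.group(2)
            if name in names:
                current, current_indent = name, indent
            elif indent <= current_indent:
                # A sibling or parent key ends the current package block.
                current = None
        spec = re.search(r"@([0-9][0-9A-Za-z.]*)", line)
        if spec and current and current not in pins:
            pins[current] = spec.group(1)
    return pins


# Hypothetical excerpt in which g2 still carries the old pin.
sample = """\
packages:
  g2:
    require: '@3.4.5'
  g2tmpl:
    require: '@1.13.0'
  hdf5:
    require: '@1.14.0'
"""

pins = pinned_versions(sample, EXPECTED)
stale = {n: pins.get(n) for n, v in EXPECTED.items() if pins.get(n) != v}
print(stale)
```

Any package reported in stale would need its pin updated in the PR.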
Package name
g2 and g2tmpl
Package version/tag
3.5.1 for g2; 1.13.0 for g2tmpl
Build options
None
Installation timeframe
For GEFS v13 development, the UPP updates require g2 v3.5.1 and g2tmpl v1.13.0 installations.
Other information
No response
WCOSS2
WCOSS2: General questions
No response
WCOSS2: Installation and testing
No response
WCOSS2: Technical & security review list
WCOSS2: Additional comments
No response