ufs-community / ufs-srweather-app

UFS Short-Range Weather Application

Build, workflow fails on Cheyenne #3

Closed benjamin-cash closed 4 years ago

benjamin-cash commented 4 years ago

I checked out the code following the instructions in the wiki, but I was unable to build the model. I'm not 100% sure whether that should be possible on Cheyenne at this point; if not, go ahead and ignore this. The log is here: /glade/work/bcash/ufs-srweather-app/src/logs/build_forecast.log

I also attempted to run the workflow independently of the build, and that failed as well with the following error message.

The system directory in which to look for the files generated by the external model specified by EXTRN_MDL_NAME_LBCS has not been specified for this machine and external model combination:
MACHINE = "CHEYENNE"
EXTRN_MDL_NAME_LBCS = "FV3GFS"
Exiting with nonzero status.

JulieSchramm commented 4 years ago

It seems to be failing in the ccpp_prebuild.py script. Can you check out a new copy and make sure you are in the src directory before running build_all.sh? The build on Cheyenne is working for this repository. Running the workflow may be another issue...

benjamin-cash commented 4 years ago

Hi Julie - I checked out the code again and tried again with the same result, following the directions in the wiki:

(base) bcash@cheyenne3:/glade/work/bcash/ufs-srweather-app/src> ./build_all.sh >& build.out &
[1] 13224
(base) bcash@cheyenne3:/glade/work/bcash/ufs-srweather-app/src>
[1]+ Exit 2 ./build_all.sh &> build.out
(base) bcash@cheyenne3:/glade/work/bcash/ufs-srweather-app/src> ls ../exec
(base) bcash@cheyenne3:/glade/work/bcash/ufs-srweather-app/src>
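
The session above shows the background job exiting with status 2 and `ls ../exec` printing nothing. A minimal sketch of a wrapper that runs a build in the foreground, reports the exit status, and confirms that the executables directory is not empty. The function name `run_build` and its arguments are hypothetical and are not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a build command, log its output, and verify
# that the executables directory ends up non-empty.
# Usage: run_build <log_file> <exec_dir> <command> [args...]
run_build() {
  local log_file=$1 exec_dir=$2
  shift 2
  # Capture stdout and stderr in the log; keep the command's exit status.
  "$@" > "$log_file" 2>&1
  local status=$?
  if [ "$status" -ne 0 ]; then
    echo "build failed with exit status $status; see $log_file" >&2
    return "$status"
  fi
  # A zero exit status with no executables is still treated as a failure.
  if [ -z "$(ls -A "$exec_dir" 2>/dev/null)" ]; then
    echo "build reported success but $exec_dir is empty" >&2
    return 1
  fi
  echo "build succeeded; executables in $exec_dir"
}
```

From the src directory this could be called as `run_build build.out ../exec ./build_all.sh`, which makes the failure visible immediately instead of through a later job notice.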

JulieSchramm commented 4 years ago

I had to try twice to get all executables to build. Cheyenne is not my favorite machine.

benjamin-cash commented 4 years ago

Following up - I commented out build_forecast from build_all and everything else ran, so it is just that one part that is failing.

mkavulich commented 4 years ago

The difference here may be between "Build forecast" and "Build forecast_ccpp". I have only ever tried the latter on Cheyenne; it is possible that the former does not work.

Michael Kavulich, Jr. Associate Scientist, National Center for Atmospheric Research (NCAR) Joint Numerical Testbed (JNT), Research Applications Laboratory (RAL) kavulich@ucar.edu

JulieSchramm commented 4 years ago

The ufs-srweather-app only uses the CCPP build when running build_forecast.sh. In this script, compile.sh is used:

./compile.sh "$FV3" "$target" "CCPP=Y STATIC=N 32BIT=Y REPRO=Y"

It also uses the dynamic build (STATIC=N), which is being removed this week from the ufs-mrweather-app.

Note that the srweather app uses the https://github.com/NCAR/ufs-weather-model repository and not the ufs-community/ufs-mrweather-app repository. compile.sh will be replaced soon with ./build.sh for the ufs-srweather-app. A quick thing to try would be to set STATIC=Y and see if it builds.

benjamin-cash commented 4 years ago

/glade/work/bcash/ufs-srweather-app/src/ufs_weather_model/NEMS/src/incmake/component_CCPP.mk:36: *** Option STATIC=Y requires suites argument as SUITES=xyz,abc,... (where suite xyz corresponds to file suite_xyz.xml). Stop.

JulieSchramm commented 4 years ago

This works:

./compile.sh "$FV3" "$target" "SUITES=FV3_GFS_2017_gfdlmp CCPP=Y STATIC=N 32BIT=Y REPRO=Y"

You need to specify one or more suites for the static build.
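
The make error quoted earlier says that each name in SUITES=xyz,abc,... corresponds to a file suite_xyz.xml. A minimal sketch of checking that mapping before starting a build. The function name `check_suites` and the directory argument are hypothetical; this is an illustration, not how the build system itself validates suites.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check that every suite in a comma-separated list
# has a matching suite_<name>.xml file in the given directory.
# Usage: check_suites <suites_dir> <suite1,suite2,...>
check_suites() {
  local suites_dir=$1 suite_list=$2
  local missing=0 suite
  local -a suites
  # Split the list on commas into an array.
  IFS=',' read -r -a suites <<< "$suite_list"
  for suite in "${suites[@]}"; do
    if [ ! -f "$suites_dir/suite_${suite}.xml" ]; then
      echo "missing: $suites_dir/suite_${suite}.xml" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

The list passed in should contain bare suite names only. Anything else that ends up in the name, such as a stray `SUITES=` prefix, becomes part of the file name that is looked up.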

benjamin-cash commented 4 years ago

Still no luck. I've uploaded the logs from the build here (assuming GitHub has recovered enough to do so...)

build_forecast.log

JulieSchramm commented 4 years ago

@climbfuji Can you take a look at this build on Cheyenne? It works for me, and it seems to be failing in ccpp_prebuild.py.

climbfuji commented 4 years ago

I guess you are using Python 3, but a version of CCPP that doesn't support it yet? The Python 3 compatibility was added a week or so ago to the master of ccpp-framework.

benjamin-cash commented 4 years ago

I am using Python 3, specifically a miniconda3 installation.

climbfuji commented 4 years ago

So, depending on which version of ccpp-framework/-physics you are using, it will not work. The UFS public release v4 works only with Python 2.7. We are planning to release an update to the medium-range weather app at about the same time as the srw app is released, which will contain the Python 3 compatibility updates.
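
Since this build only works with Python 2.7, it can help to check which interpreter is first on the PATH before building. A minimal sketch, with hypothetical function names `python_major` and `require_python2`; it assumes the interpreter reports its version in the usual "Python X.Y.Z" form.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the major version from the text printed
# by "python --version" (for example "Python 2.7.16").
python_major() {
  local version_text=$1
  # Drop the leading "Python " and everything after the first dot.
  local version=${version_text#Python }
  echo "${version%%.*}"
}

# Hypothetical check: succeed only when the given interpreter is Python 2.
# Python 2 prints its version on stderr, hence the 2>&1.
require_python2() {
  local interpreter=${1:-python}
  local major
  major=$(python_major "$("$interpreter" --version 2>&1)")
  [ "$major" = "2" ]
}
```

A build script could then start with something like `require_python2 || { echo "Python 2.7 is required for this build" >&2; exit 1; }`.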

These updates were already merged into master and the release branches:
master: https://github.com/NCAR/ccpp-framework/pull/283
release branch: https://github.com/NCAR/ccpp-framework/pull/271

benjamin-cash commented 4 years ago

I'm following the instructions here: https://github.com/ufs-community/ufs-srweather-app/wiki/Getting-Started

Loading the python/2 module got me further along, but it still ended up dying with a ton of errors. New log attached.

build_forecast.log

climbfuji commented 4 years ago

This is no longer my territory, as I have not been involved in any srw code development yet. But the first error I am seeing in your log is:

INFO: Logging level set to INFO
INFO: Found TYPEDEFS_NEW_METADATA dictionary in config, assume at least some data is in new metadata formet
INFO: Parsing suite definition files ...
INFO: Parsing suite definition file FV3/ccpp/suites/suite_SUITES=FV3_GFS_2017_gfdlmp.xml ...
CRITICAL: Suite definition file FV3/ccpp/suites/suite_SUITES=FV3_GFS_2017_gfdlmp.xml not found.
ERROR: Parsing suite definition file suite_SUITES=FV3_GFS_2017_gfdlmp.xml failed.

Someone from the srw developers will need to help you there, I don't know why that suite doesn't exist when you check out their code (it is the default suite in the top-level build.sh of ufs-weather-model, and it certainly exists in that repository).
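
Finding the first failure in a long build log, as done by hand above, can be sketched with grep. This assumes the log marks failures with a leading `CRITICAL:` or `ERROR:` as in the excerpt; the function name `first_error` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first line of a log that starts with
# CRITICAL: or ERROR:, prefixed with its line number.
# Usage: first_error <log_file>
first_error() {
  local log_file=$1
  # -n adds the line number, -m 1 stops after the first match.
  grep -n -m 1 -E '^(CRITICAL|ERROR):' "$log_file"
}
```

For example, `first_error build_forecast.log` would print the first matching line, and it returns a nonzero status when the log contains no such line.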

Sorry that I can't be of any further help here.

benjamin-cash commented 4 years ago

Aha! The SUITES was added as an earlier attempt at a fix, so I went ahead and took that out again. Once I took that out, set STATIC back to N, and went with python2 it looks like it built correctly. I'm going to leave this open until I can confirm but that seems to have done the trick. :)

climbfuji commented 4 years ago

You need to be careful. By removing STATIC=Y you are switching to a deprecated "dynamic CCPP build" that will probably not work with the workflow. Also, it has been removed from the development branches as of today and won't be in the SRW public release. You need to figure out with the SRW workflow people why that particular suite isn't there and, if there is a good reason for it, have them change the build system to use another suite by default.

benjamin-cash commented 4 years ago

The version that I checked out had STATIC=N, so I think that this is now above my pay grade as a friendly user seeing if I can run through the checkout procedure, quick start instructions, and workflow. :)

I looked and it did in fact build the executable, but if it is uncovering issues that should be addressed I will leave this open.

llpcarson commented 4 years ago

Dom, Ben - The regional workflow is currently using the dynamic build; updates to the build and run scripts to keep up with the model code are planned soon. They are currently using an older tag (hence the Python 3 issue is still there).

The SRW group is aware of the issues and is planning to update, but is definitely not up to date with develop/master day-to-day.

Laurie

benjamin-cash commented 4 years ago

Hi Laurie - Thanks! I will pause testing the workflow, then, until this build issue and the Cheyenne configuration issue in the workflow scripts that I mentioned in the first post are addressed.

llpcarson commented 4 years ago

If you use a Python 2.7 version and the tag in the ufs-srweather-app Externals.cfg, then the model (and other components) should compile OK now. However, some updates to these are coming soon, to stay up to date with ongoing development.

benjamin-cash commented 4 years ago

Good to know.

FWIW, the 'generate the workflow' step described in the wiki still fails, so that needs to be updated for Cheyenne. Specifically, there is no CHEYENNE case for EXTRN_MDL_FILES_SYSBASEDIR_LBCS for FV3GFS in set_extrn_mdl_params.sh.
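
To illustrate how a missing machine case leads to the error in the first post, here is a minimal sketch of a per-machine lookup. It is not the actual content of set_extrn_mdl_params.sh: the function name `set_lbcs_sysbasedir` is hypothetical and every directory value is a placeholder, not a real path.

```shell
#!/usr/bin/env bash
# Sketch only: the directory values below are placeholders, not real paths,
# and this is not the actual content of set_extrn_mdl_params.sh.
# Usage: set_lbcs_sysbasedir <MACHINE> <EXTRN_MDL_NAME_LBCS>
set_lbcs_sysbasedir() {
  local machine=$1 extrn_mdl_name=$2
  EXTRN_MDL_FILES_SYSBASEDIR_LBCS=""
  case "$extrn_mdl_name" in
    "FV3GFS")
      case "$machine" in
        "HERA")     EXTRN_MDL_FILES_SYSBASEDIR_LBCS="/placeholder/hera/fv3gfs" ;;
        "JET")      EXTRN_MDL_FILES_SYSBASEDIR_LBCS="/placeholder/jet/fv3gfs" ;;
        # Without a branch like this one, CHEYENNE falls through and the
        # variable stays empty.
        "CHEYENNE") EXTRN_MDL_FILES_SYSBASEDIR_LBCS="/placeholder/cheyenne/fv3gfs" ;;
      esac
      ;;
  esac
  if [ -z "$EXTRN_MDL_FILES_SYSBASEDIR_LBCS" ]; then
    echo "No system directory specified for MACHINE=\"$machine\"," \
         "EXTRN_MDL_NAME_LBCS=\"$extrn_mdl_name\"" >&2
    return 1
  fi
}
```

With this shape, supporting a new machine means adding one branch per external model, which matches the "necessary line" kind of fix described in this thread.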

benjamin-cash commented 4 years ago

Update - I added in the necessary line myself, and was able to get further along before I ran into a different issue that I don't think I will be able to solve:

Error message from "cd_vrfy" function's "cd" operation: /glade/work/bcash/ufs-srweather-app/regional_workflow/ush/bash_utils/filesys_cmds_vrfy.sh: line 119: cd: /glade/work/bcash/ufs-srweather-app/regional_workflow/modulefiles/tasks/cheyenne: No such file or directory

I looked in this directory and there are subdirectories for hera and jet, but not cheyenne. So I think I am stuck there until those files can be supplied.
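
A quick way to see which machines have task modulefiles is to check for the per-machine subdirectory up front. A minimal sketch, with a hypothetical function name `check_machine_dir`; the directory to inspect is passed in by the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a per-machine subdirectory exists
# under a tasks directory, and list the machines that are present.
# Usage: check_machine_dir <tasks_dir> <machine>
check_machine_dir() {
  local tasks_dir=$1 machine=$2
  if [ -d "$tasks_dir/$machine" ]; then
    echo "found $tasks_dir/$machine"
    return 0
  fi
  echo "no directory for '$machine' under $tasks_dir; available:" >&2
  # List only subdirectories, one name per line.
  local entry
  for entry in "$tasks_dir"/*/; do
    [ -d "$entry" ] && basename "$entry" >&2
  done
  return 1
}
```

Run from the top of the checkout, this could be called as `check_machine_dir regional_workflow/modulefiles/tasks cheyenne`.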

JulieSchramm commented 4 years ago

I have updated the regional_workflow repository of the ufs-srweather-app and also the wiki page with modifications to run on Cheyenne. Can you re-clone and try again?

benjamin-cash commented 4 years ago

Testing now...

Looks like I was able to build everything successfully and run the workflow!

benjamin-cash commented 4 years ago

Finally got back to this and everything ran to completion - closing.