NOAA-EMC / GLDAS

Port the build scripts to S4 #23

Closed DavidHuber-NOAA closed 3 years ago

DavidHuber-NOAA commented 3 years ago

The build scripts have been ported to S4 to support the global workflow. Note that S4 consists of two partitions, s4 and ivy, with the latter using an older architecture (Ivy Bridge). GLDAS needs to run on either partition, so the build tells the compiler to target the older architecture's instruction set.
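As a rough illustration of targeting the older partition, the sketch below picks one architecture flag for both S4 partitions. The helper name `arch_flag` is hypothetical, and the use of the Intel compiler option `-march=ivybridge` is an assumption; the flag actually used in the S4 build scripts is not quoted in this thread.

```shell
#!/bin/bash
# Hypothetical helper: return a compiler architecture flag for an S4 partition.
# Assumption: code built for Ivy Bridge also runs on the newer s4 partition,
# so both partitions map to the same (older) target.
arch_flag() {
    local partition="$1"
    case "$partition" in
        s4|ivy)
            # Assumed Intel compiler option; adjust for other compilers.
            echo "-march=ivybridge"
            ;;
        *)
            echo "unknown partition: $partition" >&2
            return 1
            ;;
    esac
}

# Append the flag to any existing FFLAGS without a leading space.
FFLAGS="${FFLAGS:+$FFLAGS }$(arch_flag ivy)"
echo "FFLAGS=$FFLAGS"
```

Mapping both partitions to the same flag keeps a single set of executables usable anywhere on the machine, at the cost of not using newer instructions on the s4 partition.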

Two test cases have been run: one at 192/96 from 2020073118 through 2020081000 and the other at 384/192 from 2020073118 through 2020080400. The logs and data have been pushed to Orion for review as detailed below.

192/96
Logs: /work/noaa/nesdis-rdo2/dhuber/s4_logs/test_dev
Comrot: /work/noaa/nesdis-rdo2/dhuber/s4_data/comrot/test_dev
Archive: /work/noaa/nesdis-rdo2/dhuber/S4_data/archive/test_dev

384/192
Logs: /work/noaa/nesdis-rdo2/dhuber/s4_logs/test_dev_384
Comrot (2020080400 only): /work/noaa/nesdis-rdo2/dhuber/s4_data/comrot/test_dev_384
Archive: /work/noaa/nesdis-rdo2/dhuber/S4_data/archive/test_dev_384

I expect the 384/192 comrot directory to finish transferring tomorrow (8/11).

Fixes #21

DavidHuber-NOAA commented 3 years ago

A note on the S4 build script for gdas2gldas: the global workflow loads a runtime environment module intended to be common among all of the components. Among the modules loaded is ESMF version 8.1.1. However, gdas2gldas appears to be built with 8.1.0 on most systems. On S4, this mismatch was causing a crash. Building against version 8.1.1 resolved that crash, which is why it was selected here. For more details, see this thread. FYI @KateFriedman-NOAA
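A mismatch like this could be caught before running by comparing the loaded ESMF version with the one used at build time. The following is a minimal sketch, not part of the PR: the function names are hypothetical, it assumes the module appears as `esmf/<version>` in `module list` output, and the sample module list is made up for the demonstration.

```shell
#!/bin/bash
# Extract the ESMF version from `module list`-style text read on stdin.
# Assumption: the module is listed as "esmf/<version>".
esmf_version() {
    grep -o 'esmf/[^ )]*' | head -n 1 | cut -d/ -f2
}

# Hypothetical check: fail if the loaded version differs from the expected one.
check_esmf() {
    local expected="$1"
    local loaded
    loaded="$(esmf_version)"
    if [[ "$loaded" != "$expected" ]]; then
        echo "ESMF mismatch: loaded '${loaded:-none}', expected '$expected'" >&2
        return 1
    fi
}

# Demonstration with made-up text; on a real system one might pipe
# `module list 2>&1` into check_esmf instead.
sample="  1) hpc/1.1.0   2) esmf/8.1.1   3) netcdf/4.7.4"
echo "$sample" | check_esmf 8.1.1 && echo "ESMF version OK"
```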

DavidHuber-NOAA commented 3 years ago

@HelinWei-NOAA Checking in on this PR, I noticed I have a conflict due to some changes to sorc/machine-setup.sh. Looking through the changes made, I'm a little confused by their contents. It looks like in addition to adding WCOSS2, the script adds $NCEPLIBS to the module search path for each machine. For instance, on Hera:

elif [[ -d /scratch1 ]] ; then
    # We are on NOAA Hera
    if ( ! eval module help > /dev/null 2>&1 ) ; then
        echo load the module command 1>&2
        source /apps/lmod/lmod/init/$__ms_shell
    fi
    target=hera
    module purge
    module load intel
    module load impi
    export NCEPLIBS=/scratch2/NCEPDEV/nwprod/NCEPLIBS
    module use $NCEPLIBS/modulefiles
    #export WRFPATH=$NCEPLIBS/wrf.shared.new/v1.1.1/src
    export myFC=mpiifort
    export FCOMP=mpiifort

However, as far as I can tell, none of the modules loaded in the build scripts are an NCEP library but instead hpc-stack. If one or more of the NCEPLIBS are required for GLDAS, please let me know and I will get it installed on S4 and update the entry in machine-setup.sh.

Also, for S4, I had to load ESMF version 8.1.1 to successfully run GLDAS via the global workflow. I mentioned this in the comment above. I suspect that it may be an issue on other machines as well, though I have not tested it. If you need more clarification on this issue, please let me know.

If you need any more information or data, please let me know. Otherwise, would it be possible to move on with this PR?

KateFriedman-NOAA commented 3 years ago

export NCEPLIBS=/scratch2/NCEPDEV/nwprod/NCEPLIBS
module use $NCEPLIBS/modulefiles

@HelinWei-NOAA Why is this still there for some platforms in machine-setup.sh? GLDAS should be hpc-stack everywhere now, so you shouldn't need this export: loading the initial hpc-stack modules grants access to the full set of hpc-stack libraries/modules. It should be similar to how you're handling WCOSS-Dell in machine-setup.sh.

none of the modules loaded in the build scripts are an NCEP library but instead hpc-stack.

@DavidHuber-NOAA The hpc-stack is a stack of NCEPLIBS, so if you have hpc-stack on S4 then you have access to the NCEPLIBS through hpc-stack. I don't think Helin needs that $NCEPLIBS export and the line after it.

HelinWei-NOAA commented 3 years ago

@KateFriedman-NOAA @DavidHuber-NOAA Okay. I will remove all lines related to NCEPLIBS in machine-setup.sh
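One quick way to confirm the cleanup is to search the script for any remaining NCEPLIBS references. This is only a sketch: the function name `nceplibs_refs` is hypothetical, and the demonstration uses a temporary file built from the Hera lines quoted earlier in this thread, not the real sorc/machine-setup.sh.

```shell
#!/bin/bash
# Hypothetical helper: print numbered lines that mention NCEPLIBS.
# Returns 0 if any are found and 1 if none remain (grep's exit status).
nceplibs_refs() {
    local script="$1"
    grep -n 'NCEPLIBS' "$script"
}

# Demonstration on a temporary file; in the repository the argument would
# be sorc/machine-setup.sh.
tmp="$(mktemp)"
printf '%s\n' \
    'module load intel' \
    'export NCEPLIBS=/scratch2/NCEPDEV/nwprod/NCEPLIBS' \
    'module use $NCEPLIBS/modulefiles' > "$tmp"
nceplibs_refs "$tmp" || echo "no NCEPLIBS references remain"
rm -f "$tmp"
```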

HelinWei-NOAA commented 3 years ago

All the current issues have been resolved. I created a new tag for the update. Thanks David and Kate. Sorry for the delay. I was very busy evaluating UFS prototype runs over the last couple of weeks.

DavidHuber-NOAA commented 3 years ago

Thank you @HelinWei-NOAA!

DavidHuber-NOAA commented 3 years ago

@HelinWei-NOAA It looks like the machine-setup.sh script has had the S4 changes stripped out. Can this PR be reopened to fix that or should I open another?