NOAA-EMC / global-workflow

Global Superstructure/Workflow supporting the Global Forecast System (GFS)
https://global-workflow.readthedocs.io/en/latest
GNU Lesser General Public License v3.0
74 stars 164 forks source link

Standard environment variables should not be set below the j-job level #293

Open WalterKolczynski-NOAA opened 3 years ago

WalterKolczynski-NOAA commented 3 years ago

Wojciech Cencek 2019-06-18 02:00:04 UTC Standard environment variables (WCOSS Implementation Standards, Table 1) must not be set or changed below the j-job level, for instance in scripts/exglobal_enkf_post_fv3gfs.sh.ecf:

HOMEgsi=${HOMEgsi:-$NWPROD} FIXgsi=${FIXgsi:-$HOMEgsi/fix} DATA=${DATA:-$pwd} COMIN=${COMIN:-$pwd} COMOUT=${COMOUT:-$COMIN}

Please remove all such lines from files in ./scripts and ./ush directories, even if they have no effect when the default is established in the j-job.

Refs: Bugzilla #1212

WalterKolczynski-NOAA commented 3 years ago

I'm checking to confirm if this is still an issue with the code delivered for v16

WalterKolczynski-NOAA commented 3 years ago

It is still an issue:

Jen Yang 2021-03-12 04:15:02 UTC The problems listed in "Bug 892 - Standard environment variables set below the j-job level" is not completely fixed in gfs.v16. Here I open a new ticket to summarize all the issues related to Standard environment variables setting in gfs package.

  1. As mentioned in bug 892: There are Standard environment variables defined in the scripts and ush scripts. Some of them are already defined in upstream (job scripts or config files), but the definition in ex-scripts and ush scripts can be cleaned up.

You can get a roughly list by "grep HOME *|grep =" in scripts/ and ush/ directory:

Here is some example: scripts: exemcsfc_global_sfc_prep.sh:export HOMEgfs=${HOMEgfs:-$NWROOT/gfs.${gfs_ver:?}} exemcsfc_global_sfc_prep.sh:export USHgfs=${USHgfs:-$HOMEgfs/ush} exgdas_atmos_chgres_forenkf.sh:export FIXgsm=${FIXgsm:-$HOMEgfs/fix/fix_am} exgdas_atmos_gldas.sh:export FIXgldas=${FIXgldas:-$HOMEgfs/fix}

exgdas_atmos_verfozn.sh:export HOMEgdas_ozn=${HOMEgdas_ozn:-${NWROOT}/gdas.${gdas_oznmon_ver}} export HOMEoznmon=${HOMEoznmon:-/${NWROOT}/oznmon_shared.v${shared_oznmon_ver}} export EXECoznmon=${EXECoznmon:-$HOMEoznmon/exec} export COM_IN=${COM_IN:-${COMROOT}/${NET}/${envir}} export COMIN=${COMIN:-$COM_IN/${RUN}.${PDY}/${cyc}/$COMPONENT}

exgfs_wave_nawips.sh:export GEMwave=${GEMwave:-$HOMEgfs/gempak} exgfs_wave_prdgen_bulls.sh: export EXECwave=${EXECwave:-$HOMEgfs/exec} exglobal_forecast.sh: FCSTEXECDIR=${FCSTEXECDIR:-$HOMEgfs/sorc/fv3gfs.fd/NEMS/exe}

ush: global_cycle.sh:HOMEgfs=${HOMEgfs:-$BASEDIR/gfs_ver.${gfs_ver}} syndat_qctropcy.sh:HOMENHC=${HOMENHC:-/gpfs/dell2/nhc/save/guidance/storm-data/ncep} syndat_qctropcy.sh:FIXSYND=${FIXSYND:-$HOMEgfs/fix/fix_am} tropcy_relocate.sh: HOMEALL=${HOMEALL:-/disk1/users/snake/prepobs} tropcy_relocate.sh:HOMERELO=${HOMERELO:-${shared_global_home}}

  1. Should consolidate the HOMEGlobal, HOMEwave, HOMEgsm and HOMERELO, etc to HOMEgfs, as it is in the same gfs package now. the model is gfs. Same as USH$model, EXEC$model, FIX$model...

  2. in most of the job scripts, it calls one or more config files. The Standard environment variables including COMIN/COMOUT are defined in the config files. They should be defined in job script level. Defining in config files create more difficulty during the production debug or com path update. Plus the config files are usally shared by a couple jobs. If EMC wants to use config files to set the variables for development run, then you can set a block in config file for EMC enviroment use only, for production, the variables should not be set there. and always set in the job script or job card following the implementation standard table 1.

  3. The $ROTDIR (ROTDIR=${COMROOT:?}/$NET/$envir) is sometime set in config files and some time set in job script, then the COMIN and COMOUT is defined through ROTDIR. It should get rid of the $ROTDIR and use "${COMROOT:?}/$NET/$envir" directly, this way also bring the flexibilty to change $envir for COMIN/COMOUT if neccessary.

  4. Please use COMIN and COMOUT in each jobs to separate the files/directory for which is input and which is output, that way it is more easy to test the job by itself. For example: in exglobal_forecast.sh "memdir=$ROTDIR/${prefix}.$PDY/$cyc/atmos/$memchar gmemdir=$ROTDIR/${rprefix}.$gPDY/$gcyc/atmos/$memchar increment_file=$memdir/${CDUMP}.t${cyc}z.${PREFIX_ATMINC}atminc.nc increment_file=$memdir/${CDUMP}.t${cyc}z.${PREFIX_ATMINC}atmi${incfhr}. " hard to tell the memdir is for COMIN or COMOUT from those lines.

The above issues can be improved in the next major upgrade.

WalterKolczynski-NOAA commented 3 years ago

We may want to discuss some of these with NCO. Defining default values in a lower-level script should be allowed as long as existing values are not overwritten. Otherwise it can mess with compartmentalization, especially if the script is shared between multiple systems.

WalterKolczynski-NOAA commented 3 years ago

Yet more:

Jen Yang 2021-03-12 07:14:58 UTC It is required that hardcoded paths needed to remove from ex-scripts, ush scripts, parm files, fix files, etc. except the J-job and initialization files?

In gfs.v16.0, the following files has hardcode modules:

exgdas_atmos_gldas.sh:export FINDDATE=${FINDDATE:-/gpfs/dell1/nco/ops/nwprod/prod_util.v1.1.4/ush/finddate.sh} exgdas_atmos_gldas.sh:export utilexec=${utilexec:-/gpfs/dell1/nco/ops/nwprod/grib_util.v1.1.0/exec}

The following three still refer to phase1/2 location for default value. exgdas_atmos_verfozn.sh:export NDATE=${NDATE:-/nwprod/util/exec/ndate} exgdas_atmos_verfrad.sh:export NDATE=${NDATE:-/nwprod/util/exec/ndate} exgdas_atmos_vminmon.sh:export NDATE=${NDATE:-/nwprod/util/exec/ndate} tropcy_relocate.sh:export NWROOT=${NWROOT:-/nwprod2} fv3gfs_downstream_nems.sh:export CNVGRIB=${CNVGRIB:-${NWPROD:-/nwprod}/util/exec/cnvgrib21} fv3gfs_downstream_nems.sh:export COPYGB2=${COPYGB2:-${NWPROD:-/nwprod}/util/exec/copygb2} mod_icec.sh:export WGRIB2=${WGRIB2:-${NWPROD:-/nwprod}/util/exec/wgrib2}

Please fix/clean up the above by next major upgrade.

yangfanglin commented 3 years ago

I had discussions with NCO before on this requirement and emphasized that we need to keep these local definitions of directory paths for running the same system on other platforms for RD. NCO seems to be not fully appreciating the RD requirements. Yes, this needs to be better communicated with NCO. -- Fanglin

On Fri, Mar 12, 2021 at 12:00 PM Walter Kolczynski - NOAA < @.***> wrote:

Yet more:

Jen Yang 2021-03-12 07:14:58 UTC It is required that hardcoded paths needed to remove from ex-scripts, ush scripts, parm files, fix files, etc. except the J-job and initialization files?

In gfs.v16.0, the following files has hardcode modules:

exgdas_atmos_gldas.sh:export FINDDATE=${FINDDATE:-/gpfs/dell1/nco/ops/nwprod/prod_util.v1.1.4/ush/finddate.sh} exgdas_atmos_gldas.sh:export utilexec=${utilexec:-/gpfs/dell1/nco/ops/nwprod/grib_util.v1.1.0/exec}

The following three still refer to phase1/2 location for default value. exgdas_atmos_verfozn.sh:export NDATE=${NDATE:-/nwprod/util/exec/ndate} exgdas_atmos_verfrad.sh:export NDATE=${NDATE:-/nwprod/util/exec/ndate} exgdas_atmos_vminmon.sh:export NDATE=${NDATE:-/nwprod/util/exec/ndate} tropcy_relocate.sh:export NWROOT=${NWROOT:-/nwprod2} fv3gfs_downstream_nems.sh:export CNVGRIB=${CNVGRIB:-${NWPROD:-/nwprod}/util/exec/cnvgrib21} fv3gfs_downstream_nems.sh:export COPYGB2=${COPYGB2:-${NWPROD:-/nwprod}/util/exec/copygb2} mod_icec.sh:export WGRIB2=${WGRIB2:-${NWPROD:-/nwprod}/util/exec/wgrib2}

Please fix/clean up the above by next major upgrade.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/NOAA-EMC/global-workflow/issues/293#issuecomment-797623060, or unsubscribe https://github.com/notifications/unsubscribe-auth/AKY5N2NYRR45RAAGMTG43IDTDI3CDANCNFSM4ZBNKZXA .

-- Fanglin Yang, Ph.D. Chief, Model Physics Group Modeling and Data Assimilation Branch

NOAA/NWS/NCEP Environmental Modeling Center

https://www.emc.ncep.noaa.gov/gmb/wx24fy/fyang/ https://www.emc.ncep.noaa.gov/gmb/wx24fy/fyang/

emilyhcliu commented 2 years ago

@WalterKolczynski-NOAA @yangfanglin @RussTreadon-NOAA We need to resolve this ticket for the v16.3.0 code delivery this Friday. Are we at the point to mark this ticket as resolved?

WalterKolczynski-NOAA commented 2 years ago

This is really a question for @KateFriedman-NOAA, since she has been the one handling all the v16 handoffs. I haven't touched any v16 code, which is now substantially different than what is in develop. Unfortunately, she is on leave until Thursday.

Further, I don't believe the issues raised above have been resolved, and I don't see them being sorted out in just a few days.

aerorahul commented 2 years ago

@emilyhcliu These issues have not been worked on and will not be in time for Friday's code delivery. While I agree these may be "bugs" in the eyes of NCO, they are necessary to allow testing and execution on R&D platforms. Having said that, their definition and use can be improved, something that we don't allocate time or resources and expect they will fix themself. Since v16.3.0 is an upgrade in the same family as v16, we can at best resolve to address some for the next major upgrade of v17.

WalterKolczynski-NOAA commented 1 year ago

For all these variables that shouldn't be modified, we should make them read-only in the j-job as an additional safeguard against being modified.