NOAA-EMC / global-workflow

Global Superstructure/Workflow supporting the Global Forecast System (GFS)
https://global-workflow.readthedocs.io/en/latest
GNU Lesser General Public License v3.0

v16.3 DA pre-implementation parallel #776

Closed lgannoaa closed 2 years ago

lgannoaa commented 2 years ago

Description

This issue is to document the v16.3 DA pre-implementation parallel.

The initial tasking email sent by @arunchawla-NOAA relayed what Emily and Andrew summarized on May 9th, 2022:

  1. All v16.3 implementation code, script, and fix file updates have been merged into the GSI master except the IR bug fix from Jim and Andrew (a simple change, already in a PR)
  2. The WCOSS-2 porting branch was merged into the GSI master about 3 hours ago!
  3. We will ask Mike to create a gfsda.v16.3 branch as soon as (1) is merged --- this should be done by the end of this week

First full cycle starting CDATE is retro 2021101600.

HOMEgfs: /lfs/h2/emc/global/noscrub/lin.gan/git/gfsda.v16.3.0
pslot: da-dev16-ecf
EXPDIR: /lfs/h2/emc/global/noscrub/lin.gan/git/gfsda.v16.3.0/parm/config
COM: /lfs/h2/emc/ptmp/Lin.Gan/da-dev16-ecf/para/com/gfs/v16.3
log: /lfs/h2/emc/ptmp/Lin.Gan/da-dev16-ecf/para/com/output/prod/today
on-line archive: /lfs/h2/emc/global/noscrub/lin.gan/archive/da-dev16-ecf
METplus stat files: /lfs/h2/emc/global/noscrub/lin.gan/archive/metplus_data
FIT2OBS: /lfs/h2/emc/global/noscrub/lin.gan/archive/da-dev16-ecf/fits
Verification Web site: https://www.emc.ncep.noaa.gov/gmb/Lin.Gan/metplus/da-dev16-ecf (updated daily at 14:00 UTC on PDY-1)
HPSS archive: /NCEPDEV/emc-global/5year/lin.gan/WCOSS2/scratch/da-dev16-ecf

FIT2OBS: /lfs/h2/emc/global/save/emc.global/git/Fit2Obs/newm.1.5 df1827cb (HEAD, tag: newm.1.5, origin/newmaster, origin/HEAD)

obsproc: /lfs/h2/emc/global/save/emc.global/git/obsproc/v1.0.2 83992615 (HEAD, tag: OT.obsproc.v1.0.2_20220628, origin/develop, origin/HEAD)

prepobs /lfs/h2/emc/global/save/emc.global/git/prepobs/v1.0.1 5d0b36fba (HEAD, tag: OT.prepobs.v1.0.1_20220628, origin/develop, origin/HEAD)

HOMEMET /apps/ops/para/libs/intel/19.1.3.304/met/9.1.3

METplus /apps/ops/para/libs/intel/19.1.3.304/metplus/3.1.1

verif_global /lfs/h2/emc/global/noscrub/lin.gan/para/packages/gfs.v16.3.0/sorc/verif-global.fd 1aabae3aa (HEAD, tag: verif_global_v2.9.4)

Requirements

A meeting has been set up to discuss the action summary for package preparation.

Acceptance Criteria (Definition of Done)

Dependencies

lgannoaa commented 2 years ago

A meeting hosted by @aerorahul and joined by @arunchawla-NOAA on May 10th outlined the following actions regarding this issue:

  1. @KateFriedman-NOAA will arrange the dump archive transfer from WCOSS1 to WCOSS2 and enable parallels to run in retro mode (e.g., making the change to switch input between the EMC dump archive and NCO realtime dumps).
  2. @KateFriedman-NOAA will try to set up and test an end-to-end retro parallel using the rocoto workflow on WCOSS2.
  3. Lin will test retro mode in ecflow using the dump archive transfer after step 1 above is completed.
  4. We will get more information from DA at a meeting scheduled for May 11th on the science change details of the DA package.
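The retro-versus-realtime input switch in item 1 could look something like the sketch below. This is a minimal illustration only: the `RUN_MODE` variable and both dump paths are assumptions for the example, not the actual workflow settings.

```shell
# Hedged sketch of switching input dumps between retro and realtime mode.
# RUN_MODE and both DMPDIR paths are illustrative, not the real configuration.
RUN_MODE="retro"   # "retro" reads the EMC dump archive; "realtime" reads NCO dumps

if [ "${RUN_MODE}" = "retro" ]; then
  DMPDIR="/lfs/h2/emc/dump/noscrub/dump"   # EMC dump archive (assumed path)
else
  DMPDIR="/lfs/h1/ops/prod/dump"           # NCO realtime dump (assumed path)
fi

echo "Using dump directory: ${DMPDIR}"
```

A single variable like this keeps the rest of the job scripts identical between retro and realtime runs.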
lgannoaa commented 2 years ago

Setting up the DA retro ecflow workflow parallel on WCOSS2 started on 5/12. This activity was delayed from 5/12 to 5/23 due to WCOSS2 RFCs, switch and system issues, etc. A few issues have been resolved or had workarounds provided:

lgannoaa commented 2 years ago

The new GSI package as of June 1st is 047b5da (submodule 99f147c). This version has three files and one directory changed in the build process:

global_gsi.x -> gsi.x
global_enkf.x -> enkf.x
ncdiag_cat.x -> nc_diag_cat.x
$LINK ../sorc/gsi.fd/exec/$gsiexe . -> $LINK ../sorc/gsi.fd/install/bin/$gsiexe .

The above GSI package and modifications are now obsolete.

As of June 10th we have the following on Dogwood for gsi: release/gfsda.v16.3.0 at 42cc83, git submodule 99f147cc7a55032b329bcd4f738cabd28b129295, fix (fv3da.v1.0.1-159-g99f147c)

MichaelLueken commented 2 years ago

The new GSI package as of June 1st is 047b5da (submodule 99f147c). This version has three files and one directory changed in the build process:

global_gsi.x -> gsi.x
global_enkf.x -> enkf.x
ncdiag_cat.x -> nc_diag_cat.x
$LINK ../sorc/gsi.fd/exec/$gsiexe . -> $LINK ../sorc/gsi.fd/install/bin/$gsiexe .

Hi @lgannoaa - There is one minor correction here: instead of ncdiag_cat.x -> nc_diag_cat.x, it should be ncdiag_cat.x -> ncdiag_cat_serial.x

Additionally, given changes in the develop branch that have never made it into the release branch for GFS DA components previously, the .v1.0.0, .v2.0.0, and .v3.0.0 entries for Minimization_Monitor, Ozone_Monitor, and Radiance_Monitor, respectively, need to be removed.

Also, since fv3gfs_ncio has been removed from the GSI project, the ncio stack module will need to be added to versions/build.ver:

export ncio_ver=1.0.0

and ncio will need to be added to modulefiles/fv3gfs/enkf_chgres_recenter_nc.wcoss2.lua:

load(pathJoin("ncio", os.getenv("ncio_ver")))

Using the stack-built ncio module will also require changes to:

sorc/build_enkf_chgres_recenter_nc.sh - removal of the following lines:

export FV3GFS_NCIO_LIB="${cwd}/gsi.fd/build/lib/libfv3gfs_ncio.a"
export FV3GFS_NCIO_INC="${cwd}/gsi.fd/build/include"

if [ ! -f $FV3GFS_NCIO_LIB ]; then
  echo "BUILD ERROR: missing GSI library file"
  echo "Missing file: $FV3GFS_NCIO_LIB"
  echo "Please build the GSI first (build_gsi.sh)"
  echo "EXITING..."
  exit 1
fi

sorc/enkf_chgres_recenter_nc.fd/input_data.f90 - replace module_fv3gfs_ncio with module_ncio

sorc/enkf_chgres_recenter_nc.fd/output_data.f90 - replace module_fv3gfs_ncio with module_ncio

sorc/enkf_chgres_recenter_nc.fd/makefile - replace FV3GFS_NCIO_INC entries with NCIO_INC
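The source-level renames above are simple text substitutions. The sketch below demonstrates them with sed on a scratch copy, so it is safe to run anywhere; in the real tree the same substitutions would target sorc/enkf_chgres_recenter_nc.fd/input_data.f90, output_data.f90, and makefile. The scratch file contents are invented stand-ins, not excerpts from the actual sources.

```shell
# Demonstrate the module/variable renames on throwaway files.
demo=$(mktemp -d)

# Stand-in file contents (illustrative only):
printf 'use module_fv3gfs_ncio\n' > "${demo}/input_data.f90"
printf 'INCS = $(FV3GFS_NCIO_INC)\n' > "${demo}/makefile"

# The substitutions described in the comment above:
sed -i 's/module_fv3gfs_ncio/module_ncio/g' "${demo}/input_data.f90"
sed -i 's/FV3GFS_NCIO_INC/NCIO_INC/g' "${demo}/makefile"

cat "${demo}/input_data.f90"   # now reads: use module_ncio
```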

Finally, for building the GSI, I'd recommend the following changes in sorc/build_gsi.sh:

export GSI_MODE="GFS"
export UTIL_OPTS="-DBUILD_UTIL_ENKF_GFS=ON -DBUILD_UTIL_MON=ON -DBUILD_UTIL_NCIO=ON"

This will build the GSI in global mode (the default is regional, which adds WRF to the build) and will limit the utilities built from all utilities (the default) to just those required by the GFS.
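As a rough illustration of how the two exports might feed the build: the internals of sorc/build_gsi.sh are not shown in this thread, so the cmake invocation below is an assumption for the sketch, not the script's actual contents.

```shell
# Hedged sketch: environment variables recommended above, and a hypothetical
# cmake line showing how a build script might consume them.
export GSI_MODE="GFS"
export UTIL_OPTS="-DBUILD_UTIL_ENKF_GFS=ON -DBUILD_UTIL_MON=ON -DBUILD_UTIL_NCIO=ON"

# Illustrative only; build_gsi.sh's real cmake call may differ:
echo cmake -DGSI_MODE="${GSI_MODE}" ${UTIL_OPTS} -S . -B build
```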

MichaelLueken commented 2 years ago

@lgannoaa I missed an update that is required for making sorc/enkf_chgres_recenter_nc.fd/makefile work with the stack's ncio module:

Replacing FV3GFS_NCIO_LIB with NCIO_LIB.

Many thanks to @RussTreadon-NOAA for bringing this to my attention.

MichaelLueken commented 2 years ago

Hi @lgannoaa @KateFriedman-NOAA @emilyhcliu, I wanted to ask a question about how you would like to proceed with respect to the renaming of the gsi, enkf, and ncdiag_cat executables. Should I make changes to the GSI/scripts to use the new executables, or will sorc/link_fv3gfs.sh be updated to link these new executable names with the old naming convention? Please let me know your preference so that I can make the necessary changes for release/gfsda.v16.3.0.

KateFriedman-NOAA commented 2 years ago

I wanted to ask a question about how you would like to proceed with respect to the renaming of the gsi, enkf, and ncdiag_cat executables. Should I make changes to the GSI/scripts to use the new executables, or will sorc/link_fv3gfs.sh be updated to link these new executable names with the old naming convention? Please let me know your preference so that I can make the necessary changes for release/gfsda.v16.3.0.

@MichaelLueken-NOAA Let's move this discussion back to the main issue for this upgrade: issue #744. This issue is just for documenting the parallel. @lgannoaa Please keep workflow changes and non-parallel setup discussions in issue #744. Thanks!

@MichaelLueken-NOAA Please summarize the GSI executable name changes that are occurring in a new comment in #744 and then tag Emily, Rahul, Lin, and myself to discuss. Thanks!

lgannoaa commented 2 years ago

@arunchawla-NOAA @aerorahul @emilyhcliu @MichaelLueken-NOAA @RussTreadon-NOAA @KateFriedman-NOAA This parallel is set up on Dogwood. Please note the following configuration information. I will have a meeting with Emily to review the package before starting the run. This meeting is scheduled for June 10th.

HOMEgfs: /lfs/h2/emc/global/noscrub/lin.gan/git/gfsda.v16.3.0
pslot: da-dev16-ecf
EXPDIR: /lfs/h2/emc/global/noscrub/lin.gan/git/gfsda.v16.3.0/parm/config
COM: /lfs/h2/emc/ptmp/Lin.Gan/da-dev16-ecf/para/com/gfs/v16.3
log: /lfs/h2/emc/ptmp/Lin.Gan/da-dev16-ecf/para/com/output/prod/today

lgannoaa commented 2 years ago

A meeting with Emily on June 10th outlined the following actions:

yangfanglin commented 2 years ago

@emilyhcliu Emily, could you please check with Jun Wang and Helin Wei to make sure the updated forecast model is used in this cycled experiment? Model updates include changes in the LSM for improving snow forecasts and in UPP for fixing the cloud ceiling calculation. (@junwang-noaa @HelinWei-NOAA @WenMeng-NOAA)

KateFriedman-NOAA commented 2 years ago

@lgannoaa I'm working with Helin to get a new GLDAS tag ready (it includes a small update needed for the "atmos" subfolder added to the GDA; the PRCP CPC gauge file path needed it added too). The current GLDAS tag in the release/gfs.v16.3.0 branch will work for GDA dates prior to WCOSS2 go-live but not for dates after go-live (when the new "atmos" subfolder is added).

I'm also working to wrap up Fit2Obs testing and try to get a new tag for your use on WCOSS2 ASAP.

junwang-noaa commented 2 years ago

@HelinWei-NOAA I do not see the snow updates PR to the ufs-weather-model production/GFS.v16 branch. Would you please make one if the code updates are ready? Thanks

HelinWei-NOAA commented 2 years ago

@junwang-noaa I created one on fv3atm

lgannoaa commented 2 years ago

HOMEgfs/parm/config config.resources.nco.static and config.fv3.nco.static have been used to fix the eupd job card issue that caused it to fail. The Global Workflow emc.dyn versions of those configs aren't yet updated to run high-res on WCOSS2.

lgannoaa commented 2 years ago

@emilyhcliu Please review the first three full cycles of output from this parallel. There are seven and a half cycles of a test run available for review, located in /lfs/h2/emc/ptmp/lin.gan/da-dev16-ecf/para/com/gfs/v16.3. The first half cycle is 20211015 18Z, followed by 7 completed cycles.

lgannoaa commented 2 years ago

A meeting with @emilyhcliu on June 14th indicated that a few more short cycled tests are required for package adjustment before starting the official implementation parallel.
This meeting also outlined the following action items, which need to happen before the next cycled test run:

  1. CRTM v2.4.0 - We are waiting for NCO to approve and install the new CRTM v2.4.0 on WCOSS2. This version is required to start the official parallel.
  2. Namelist change - A namelist change in the configuration file is required before the official parallel can start.
  3. Script changes - The GSI branch will have more updates/commits prior to starting the official parallel.
  4. Modify three ICS files - Three files in the initial conditions need to be modified for this parallel.
lgannoaa commented 2 years ago

A meeting with @emilyhcliu @aerorahul @KateFriedman-NOAA on June 15th outlined the following: A new gdas/gfs prep job will be installed in ecflow, similar to the one running in the realtime rocoto workflow, to run the EMC gdas/gfs prep jobs that create prepbufr files. The ecflow workflow will run internal METplus and gplots jobs to create the verification Web page. Emily will check with the DA team on the UPP, UFS model, and VPPGB points of contact, how many post hour outputs are needed for this parallel, and whether any external partners or downstream users will be evaluating this parallel. A change to the wired TCVITL path assignment will be made so that this parallel uses the TCVITL file from the EMC dump archive.

lgannoaa commented 2 years ago

Information received from Daryl on June 16th outlined the following:

  1. Helin Wei is the POC for physics changes added to V16.3. We do not need hourly output but would like to have 3-hourly 1-deg grib2 files from the 00Z cycle for an extended period of time for evaluation.
  2. Hui-Ya and Yali Mao are the UPP contacts for GFS v16.3. Yali is making several changes to WAFS aviation products, so I assume we'll need hourly data, but I'm not sure for how long.
  3. I think there is still an open question regarding stakeholder engagement and evaluation.
lgannoaa commented 2 years ago

Ali Abdolali is now assigned as WAVE point of contact.

lgannoaa commented 2 years ago

As of noon on June 24th, here is the state of this parallel:

lgannoaa commented 2 years ago

A meeting was held with CYCLONE_TRACKER code manager Jiayi on June 24th. We checked the jobs and output in $COM/$RUN.$PDY/$cyc/atmos/epac and natl. These jobs run successfully.

lgannoaa commented 2 years ago

Emily checked the new gdas/gfs prep jobs and the jobs using their output on June 24th. They are working.

lgannoaa commented 2 years ago

As of EOB June 28th, there are three incoming changes:

A new cycled test run started on June 29th to test a few days with DELETE_COM_IN_ARCHIVE_JOB="YES". This test will be used to review the EMC developer verification jobs, FIT2OBS, METplus, gplots, and EMC para-check jobs.

lgannoaa commented 2 years ago

Switched verif-global.fd to the verif_global_v2.9.5 tag on July 5th. Added the METplus off-line driver into HOMEgfs/ecf/scripts/workflow_manager/scripts.

lgannoaa commented 2 years ago

A meeting with the code managers for verif-global and FIT2OBS on July 5th indicated no issues found in a 10-day test run.

emilyhcliu commented 2 years ago

Status of DA package

lgannoaa commented 2 years ago

A single-cycle test with GFS downstream, GFS gempak, and GDAS gempak was completed on July 6th. Code managers certified the jobs are working. WAFS is still under debugging: a silent failure was found in post, where it failed to generate the files required for running WAFS.

lgannoaa commented 2 years ago

A confirmed bug in CRTM 2.4.0 has been reported. This bug also exists in the current operational GFS using CRTM 2.3.0: the fix file mhs_metop-c.SpcCoeff.bin size is incorrect, which results in changed impacts. The DA team noted this issue in https://github.com/JCSDA-internal/crtm/issues/338. A workaround for this ecflow parallel is in place, using a local copy of the fix file until the issue is fixed.

lgannoaa commented 2 years ago

The parallel started on July 8th at 11:20 AM EST.

lgannoaa commented 2 years ago

A file in the GSI tag needs to be modified. A local fix is in place to keep the ecflow workflow running; a new GSI tag will need to be created.

lgannoaa commented 2 years ago

gfs/gdas downstream and gempak jobs are not being run at this time. Post was discovered to fail silently when generating files for WAFS. Management has decided not to debug post/WAFS until Yali is back from vacation.

lgannoaa commented 2 years ago

The ecflow workflow has been modified to wire in CRTM 2.4.0 for the eobs and gdas/gfs analysis jobs, which use the GSI and load the crtm module. This ensures the runtime library is correct for those jobs. The rest of the workflow stays with CRTM 2.3.0 for runtime module loading.
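The per-job version split could be expressed as a small helper like the one below. This is a minimal sketch: the function name and the job-name spellings are assumptions for illustration, not identifiers from the actual ecflow suite.

```shell
# Hedged sketch: pick the CRTM version per job, as described in the comment above.
# Job names here are illustrative; the real suite's task names may differ.
crtm_version_for_job() {
  case "$1" in
    eobs|gdas_atmos_analysis|gfs_atmos_analysis)
      echo "2.4.0" ;;   # GSI-linked jobs need the new CRTM at runtime
    *)
      echo "2.3.0" ;;   # everything else stays on the operational CRTM
  esac
}

crtm_version_for_job eobs       # prints 2.4.0
crtm_version_for_job gfs_post   # prints 2.3.0
```

In the real job scripts, the returned version would feed a `module load crtm/<version>` line before the executable runs.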

lgannoaa commented 2 years ago

HPSS failed overnight on July 8th, putting the parallel on hold. Archive jobs were rerun.

lgannoaa commented 2 years ago

The FIT2OBS package in emc.global is missing. I requested help from the helpdesk to transfer it from Dogwood to Cactus. The location is /lfs/h2/emc/global/save/emc.global/git/Fit2Obs/newm.1.5. This issue impacts the gdas verification job, which is unable to run the FIT2OBS package. @arunchawla-NOAA @dtkleist @aerorahul @KateFriedman-NOAA

RussTreadon-NOAA commented 2 years ago

@lgannoaa , Dogwood /lfs/h2/emc/global/save/emc.global/git/Fit2Obs/newm.1.5 has been rsync'd to Cactus /lfs/h2/emc/global/save/emc.global/git/Fit2Obs/newm.1.5. You may close your helpdesk ticket.

Add @arunchawla-NOAA , @dtkleist , @aerorahul , and @KateFriedman-NOAA for awareness.

lgannoaa commented 2 years ago

@emilyhcliu Please take a look at the completed cycles. As discussed, you will check the parallel output to be sure the modified initial conditions and the GSI change are there after three completed cycles. I am waiting for your feedback to turn on the COM auto-clean function in the archive jobs, which removes old files from previous runs to save disk space. PTMP is now at 45 TB usage, with only 66% of the group quota used; it is not at a critical level.

@emilyhcliu PTMP usage is now 71%. I have seen some system issues with ecflow connections failing and sending jobs into a zombie state. I plan to turn on auto COM cleanup in the archive job by midnight on July 9th.

DELETE_COM_IN_ARCHIVE_JOB="YES" is in place as of the 2021101718 GFS archive job.
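Inside an archive job, the switch's effect might look like the guard below. The variable name is the one used in this thread; the directory layout and cycle name are illustrative only (demonstrated on a scratch directory so the sketch is safe to run).

```shell
# Hedged sketch: COM cleanup gated on the archive-job switch from this thread.
DELETE_COM_IN_ARCHIVE_JOB="YES"

# Scratch stand-in for an already-archived cycle's COM directory:
old_com="$(mktemp -d)/gdas.20211015"
mkdir -p "${old_com}"

if [ "${DELETE_COM_IN_ARCHIVE_JOB}" = "YES" ]; then
  rm -rf "${old_com}"   # scrub COM for cycles that finished archiving
fi
```

Keeping the switch as a plain environment variable lets it be flipped to "NO" quickly when system instability makes reruns likely, as happened repeatedly later in this parallel.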

lgannoaa commented 2 years ago

Starting 20211021 00Z the following jobs are under memory adjustment: All EMC archive jobs, enkfgdas_diag, gdas_atmos_verfrad, gdas_atmos_emcsfc_sfc_prep, gfs_atmos_emcsfc_sfc_prep, gfs_wave_prdgen_gridded, gfs_wave_prep.

emilyhcliu commented 2 years ago

The gfs.v16.3.0 parallel starts from 20211015 18Z and is now up to 20211020 12Z. As a sanity check, I looked at the statistics of the difference between the 6-hour forecast and collocated radiosonde data (O-F) and compared them to those from the operational run.
Please see the plots in the slides.

The parallel run is still spinning up and the data samples for O-F statistics are still small. This is just a sanity check.

HelinWei-NOAA commented 2 years ago

@emilyhcliu Do you have the corresponding operational GFS results on disk? I would like to make some snow depth plots to see the impact of our fix. If not, I can grab them from HPSS. Thanks.

emilyhcliu commented 2 years ago

@HelinWei-NOAA I only pulled DA-related data (statistics files) to disk. I did not get forecast-related files to disk.

HelinWei-NOAA commented 2 years ago

@emilyhcliu Thanks. Where can I find the latest test results? /lfs/h2/emc/ptmp/lin.gan/da-dev16-ecf/para/com/gfs/v16.3 only contains some old tests. @HelinWei-NOAA This parallel started on July 8th 2022 (CDATE=2021101600). It is currently still in retro mode (PDY=20211022).

emilyhcliu commented 2 years ago

@emilyhcliu Thanks. Where can I find the latest test results? /lfs/h2/emc/ptmp/lin.gan/da-dev16-ecf/para/com/gfs/v16.3 only contains some old tests. @HelinWei-NOAA This parallel started on July 8th 2022 (CDATE=2021101600). It is currently still on retro mode (PDY=20211022).

@HelinWei-NOAA Here is the on-line archive on WCOSS-2: /lfs/h2/emc/global/noscrub/lin.gan/archive/da-dev16-ecf

I will find the location of HPSS archive and let you know.

emilyhcliu commented 2 years ago


@HelinWei-NOAA Here is the HPSS archive of the parallel: /NCEPDEV/emc-global/5year/lin.gan/WCOSS2/scratch/da-dev16-ecf

emilyhcliu commented 2 years ago

Observation monitoring pages are up and running for the v16.3.0 parallel experiments:

lgannoaa commented 2 years ago

July 14th: Cactus has a system degradation issue. NCO halted the Cactus system. This put the parallel on hold and left incomplete transfer jobs. Action: rerun jobs as needed. DELETE_COM_IN_ARCHIVE_JOB="NO" is in place until the parallel runs smoothly.

2021122412 gfs arch incomplete; rerun as needed.
2021122500 gfs arch incomplete; rerun as needed.
2021102512 efcs jobs were found unstable due to system error; jobs have been rerun. DELETE_COM_IN_ARCHIVE_JOB="YES" was set after the reruns completed.
2021102518 eupd was impacted by a system error and became a zombie; a rerun is in place. Cactus is still unstable, therefore DELETE_COM_IN_ARCHIVE_JOB="NO" is set until the parallel runs smoothly.

Example of a failed eupd (zombie) job log from the system issue, enkfgdas_update_06.o8587654:

sed: can't read /var/spool/pbs/aux/8587654.cbqs01: No such file or directory
sed: can't read /var/spool/pbs/aux/8587654.cbqs01: No such file or directory
grep: /tmp/qstat.8587654: No such file or directory
grep: /tmp/qstat.8587654: No such file or directory
grep: /tmp/qstat.8587654: No such file or directory
grep: /tmp/qstat.8587654: No such file or directory
grep: /tmp/qstat.8587654: No such file or directory
grep: /tmp/qstat.8587654: No such file or directory
grep: /tmp/qstat.8587654: No such file or directory
000 - nid001054 : Job 8587654.cbqs01 - DEBUG-DMESG: Unable to find NFS stats file: /tmp/nfsstats.8587654.cbqs01
000 - nid001054 Job 8587654.cbqs01 - DEBUG-DMESG: Unable to find Mount stats file: /tmp/mntstats.begin.8587654.cbqs01

@arunchawla-NOAA @dtkleist @emilyhcliu @aerorahul The Cactus system degradation issue remains. This parallel is proceeding very slowly. Many jobs need to be rerun; many jobs become zombies. I will execute a touch command by noon on July 15th to renew all files in ptmp and ensure files in COM are not deleted.

Execute touch on COM.

lgannoaa commented 2 years ago

Set DELETE_COM_IN_ARCHIVE_JOB="YES" after the 20211025 reruns completed. HPSS transfers remain slow, causing archive jobs to hit the wall clock limit and need reruns. The parallel is still proceeding slowly.

lgannoaa commented 2 years ago

Due to the extreme slowness of the HPSS transfer rate, which caused archive job failures, a redesign of the archive job is in progress. The prototype was tested on July 15th for a single cycle; it provided much higher performance compared to the original archive job design. Also, due to the system degradation issue on July 15th, METplus jobs (part of the verification jobs) took too long to finish, and jobs that hit the wallclock need to be rerun. It may be better to run METplus in offline mode. These two modifications will be in testing starting Monday, July 18th, and put in place after testing completes satisfactorily.

Cactus performance over the weekend of July 16th and 17th improved a little. HPSS transfers still remain slow. The plan to include the above two changes is still ongoing.

Starting 2021103000, a two-cycle test of the newly designed archive jobs and the offline METplus cron task is in place. DELETE_COM_IN_ARCHIVE_JOB="NO" is set for the duration of this test. Performance improvement status: the original runtime for 20211029 18Z and 20211028 18Z was 17 hours.

-rw-r--r-- 1 lin.gan emc 2.5M Jul 17 18:35 gfs_emc_arch_18.o8694471
-rw-r--r-- 1 lin.gan emc 2.6M Jul 18 11:18 gfs_emc_arch_18.o8711811

As of July 19th 10:00 AM, the test of the above two changes was in place for CDATE 2021103000 ~ 2021103106. The initial test results were good; however, the WCOSS2 HPSS transfer slowness in the afternoon caused many test jobs to fail. The parallel is on halt at CDATE 2021103106. COM size is 91 TB, waiting for the archive jobs to succeed so that COM cleanup and parallel jobs can resume.

As of July 20th, the newly designed archive jobs have been running without issue. The HPSS transfer speed became slow again in the afternoon and into the evening. This time the parallel ran too far ahead and had to be halted to wait for the archive jobs to finish. COM usage was too high, and the archive jobs need to finish so the cleanup jobs can clear COM.

At 3:00 PM EST on July 21st, a check on the archive jobs showed they had all caught up to the parallel. Therefore, the next time the parallel resumes (at CDATE 2021110212), the COM cleanup job will be enabled.

Tag: @arunchawla-NOAA @dtkleist @emilyhcliu @aerorahul for your awareness.

lgannoaa commented 2 years ago

@emilyhcliu HPSS has had system errors in the past few hours. Many archive jobs failed with status=141. The parallel is on halt as of July 18th 10:00 PM until the HPSS system issue is resolved.

lgannoaa commented 2 years ago

We did not have access to Cactus from July 19th 11-15Z, and again from 20-00Z, due to system upgrades and tests. NCO updated the notice to extend the outage until 18Z on July 19th.