NOAA-EMC / GSI

Gridpoint Statistical Interpolation
GNU Lesser General Public License v3.0

Preparation for turning on CrIS NPP radiances and include IASI Metop-C in operational GFS system #186

Closed emilyhcliu closed 2 years ago

emilyhcliu commented 2 years ago

Timeline of Issues and Actions related to CrIS NPP

Official messages from NESDIS regarding the status of CrIS NPP data

Preparation for CrIS NPP (switch back to side-1 for LW + SW)

Plan-A
This plan covers the scenario in which we have to re-estimate the
correlated observation errors because of a perceptible change in data quality.

In this case, we should run low-resolution cycled experiments to get
the new estimate for CrIS NPP. (We previously verified that the
difference between the high-resolution and low-resolution error estimates
is not significant, so we can use the low-resolution estimate in the
high-resolution run.)

Plan-B
For the best-case scenario, where the data quality from side-1 is good and
has characteristics similar to those from side-2, we can do the following:
(1) turn off the water vapor channels (set the iuse flag to -2; do not use the data at all)
(2) use the existing obs error matrix and remove the rows/columns that
    correspond to the H2O channels
(3) run a single-cycle experiment with the changes from (1) and (2)
    together and validate
(4) run a cycled experiment if time and resources allow

To do list:

Notes for abias and abias_pc (bias and pc files)

Background:

What to do with the bias and pc values for CrIS NPP? --- we are going for Method#3

Method #1
Give NCO updated bias and pc files with changes in CrIS NPP only.
Use the bias and pc values for CrIS NPP channels from the cycle (20200521 12Z) right before the CrIS NPP side-2 issue.

Method #2
Give NCO updated bias and pc files with changes in CrIS NPP only.
(1) For LW channels: copy the bias coefficients and pre-conditioning values from the cycle (20200521 12Z) right before the CrIS NPP side-2 issue
(2) For MW and SW channels: set both bias coefficients and pc to zero
(3) Remove the CrIS NPP diagnostic file from the radstat (for the first cycle)

Method #3
Give NCO updated bias and pc files with changes in CrIS NPP only.
(1) Start bias and pc estimation from zero for CrIS NPP, if this is allowed in the operational system
(2) Remove the CrIS NPP diagnostic file from the radstat (for the first cycle)

We had better do a drill run for a few cycles before we hand these changes to NCO.

emilyhcliu commented 2 years ago

@RussTreadon-NOAA Please take a look at this issue when you have time. We can talk about it when you come back from your leave.

emilyhcliu commented 2 years ago

Update CrIS NPP iuse_rad flags in global_satinfo.txt:
LW channels (CO2 channels 1 - 713): turn on (same iuse_rad flag as CrIS N20)
MW channels (H2O channels 714 - 1578): set to -2 (do not use)
SW channels (Solar-affected channels 1579 - 2211): set to -1 (monitoring)
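The band boundaries listed here can be written down as a small lookup, which may help when checking an edited satinfo by hand. This is only a sketch: `cris_npp_iuse` and its `n20_iuse` argument are hypothetical names, and the code does not read or write global_satinfo.txt.

```python
# Band boundaries for the 2211 CrIS FSR channels, taken from the comment above.
LW = range(1, 714)      # CO2 channels 1-713
MW = range(714, 1579)   # H2O channels 714-1578
SW = range(1579, 2212)  # solar-affected channels 1579-2211


def cris_npp_iuse(channel, n20_iuse):
    """Return the proposed iuse_rad flag for one CrIS NPP channel.

    n20_iuse is the flag the same channel carries for CrIS N20
    (a hypothetical input; look it up in the satinfo).
    """
    if channel in LW:
        return n20_iuse   # LW: same flag as CrIS N20
    if channel in MW:
        return -2         # MW: do not use
    if channel in SW:
        return -1         # SW: monitoring only
    raise ValueError(f"channel {channel} is outside 1-2211")


# The three bands cover every channel exactly once.
assert len(LW) + len(MW) + len(SW) == 2211
```

For example, `cris_npp_iuse(714, 1)` returns -2 regardless of the N20 flag, while an LW channel simply inherits whatever N20 uses.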

Updated global_satinfo.txt can be found in the following ProdGSI branch: ProdGSI Branch: rev2_crisnpp

emilyhcliu commented 2 years ago

@KristenBathmann-NOAA performed single-cycle tests to check whether the correlated observation errors work properly with the LW+SW CrIS NPP data from side-1. Here is the summary of the results from @KristenBathmann-NOAA:

I ran two single-cycle tests, the 2020121618 gdas analysis, and the 2020122700 gfs analysis. 
I set iuse to -2 for all CrIS npp MW channels (channels 714-1570). 
This reduces the number of active channels from 100 to 92. 
I've attached plots of the gdas cycle obs locations, and the gdas and gfs cycles OMA statistics. 
These results are compared to my control. 
Everything looks as it should.
However, updates to the CrIS npp covariance file are necessary to run with correlated error.
[Plots: lwswlocs, origlocs]

This is the result from the GFS cycle. Here is the result from the GDAS cycle:

[Plot: origlocs]

emilyhcliu commented 2 years ago

@MichaelLueken-NOAA Is it possible to add multiple assignees to this issue? If so, please add the following people: @KristenBathmann-NOAA @HaixiaLiu-NOAA @RussTreadon-NOAA

emilyhcliu commented 2 years ago

Messages from NESDIS

Administrative: Update #4 - CrIS on SNPP switch to Side 1 on July 12, 2021 - Issued July 21, 2021 1900 UTC

Update #4: SNPP CrIS SDR products have been declared Provisional following the SDR Science review team approval on July 21, 2021. SNPP CrIS SDR data flow through ESPC will resume on July 22, 2021 at 1400 UTC. The following three IDPS products will resume: CrIS-FS-SDR, CrIS-SDR-GEO, and CRIS-SCIENCE-RDR. Additionally, the following BUFR products will resume: CrIS_C0431_BUFR and CrIS_C2211_BUFR.

Update #3: The SNPP CrIS Side Switch Activities have been successfully completed and the Side-1 LWIR and SWIR bands are functional, while the MWIR band is non-operational. The SNPP CrIS SDR products will be declared Provisional on Wednesday July 21, 2021 following the SDR Science review team approval. EDRs will be available once validated by the science team.

Update #2: Day 2 activities have been completed as planned. Remaining activities are on schedule to be completed on July 14th.

Update #1: Day 1 commanding activities have been completed as planned. The remaining activities are on schedule to be conducted on July 13-14, 2021 as outlined in the original notification.

KristenBathmann commented 2 years ago

Emily added the stats plot from the gfs cycle that I ran. Here is the plot from the gdas cycle:

[Plot: crisnpp_stats_gdas_2020121618]

Removing the MW channels requires updates to the binary covariance file $FIXgsi/Rcov_crisnpp. See the end of $GSI/util/Correlated_Obs/cov_calc.f90 to understand the format of this file. It contains nch_active (the number of actively assimilated channels, as specified in the satinfo) and nctot (the total number of channels in the satinfo), both of which should be updated. It also contains indR, the indices of the active channels, and Rcov, the covariance matrix. The MW channels need to be deleted from indR and Rcov. I have a Fortran program that can make these changes.
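To illustrate what deleting channels from indR and Rcov amounts to, here is a minimal sketch on a toy matrix. It is not the Fortran program mentioned above and it does not read the binary Rcov_crisnpp file; `drop_channels`, the toy indices, and the toy covariance values are all made up for illustration.

```python
def drop_channels(ind_r, rcov, dropped):
    """Delete the channels in `dropped` from ind_r and the matching rows/columns of rcov.

    ind_r   : list of channel indices, one per row/column of rcov
    rcov    : square covariance matrix as a list of lists
    dropped : set of entries of ind_r to delete
    """
    n = len(ind_r)
    if len(rcov) != n or any(len(row) != n for row in rcov):
        raise ValueError("rcov must be square with one row/column per entry of ind_r")
    # Positions (not channel indices) of the rows/columns that survive.
    keep = [k for k, idx in enumerate(ind_r) if idx not in dropped]
    new_ind = [ind_r[k] for k in keep]
    new_cov = [[rcov[i][j] for j in keep] for i in keep]
    return new_ind, new_cov


# Toy 4-channel example with made-up numbers; pretend entries 20 and 30 are MW.
ind_r = [10, 20, 30, 40]
rcov = [
    [1.0, 0.2, 0.1, 0.0],
    [0.2, 2.0, 0.3, 0.1],
    [0.1, 0.3, 3.0, 0.4],
    [0.0, 0.1, 0.4, 4.0],
]
new_ind, new_cov = drop_channels(ind_r, rcov, {20, 30})
nch_active = len(new_ind)  # the active-channel count shrinks with the matrix
```

The same row and column positions are removed together, so the result stays square and symmetric, and the updated nch_active is just the length of the reduced index list.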

emilyhcliu commented 2 years ago

@KristenBathmann-NOAA Thanks for performing the single-cycle test and for explaining the required change to the R matrix. I am setting up a drill parallel run to extend your single-cycle test. I am using v16x_sept_ctl as the control (reference) and v16x_sept_cris as the experiment to test the following:
(1) LW+SW (no MW) --- updated satinfo
(2) Updated correlated observation error file for CrIS --- updated Rcov_crisnpp

The parallel experiment setup can be found at the following location on HERA: /scratch1/NCEPDEV/da/Emily.Liu/para/v16x/v16x_sept_cris

The updated satinfo and Rcov_crisnpp files are collected under the following directory: /scratch1/NCEPDEV/da/Emily.Liu/para/v16x/v16x_sept_cris/fix_gsi

Please let me know when you have the updated Rcov_crisnpp. I will kick off the run once we have it. We do not have to run long (maybe a week) before running a GSI statistics check against the control.

emilyhcliu commented 2 years ago

@MichaelLueken-NOAA Thank you!

KristenBathmann commented 2 years ago

The new file is: /scratch1/NCEPDEV/da/Kristen.Bathmann/archive/v16rt2/Rcov_crisnpp_lwsw
You should rename it to Rcov_crisnpp when you copy it to the fix directory. After your first analysis run, please pause the experiment so that I can confirm it is set up correctly. Do this for the first gdas analysis and the first gfs analysis. I will just need to look at the log files.

emilyhcliu commented 2 years ago

@KristenBathmann-NOAA The drill run is running now with updated satinfo and your revised R file for CrIS NPP.
EXPDIR: /scratch1/NCEPDEV/da/Emily.Liu/para/v16x/v16x_sept_cris
ROTDIR: /scratch1/NCEPDEV/stmp4/Emily.Liu/ROTDIRS/v16x_sept_cris

The first GDAS analysis is 2020083006.
The first GFS analysis is 2020083100.

RussTreadon-NOAA commented 2 years ago

Thank you, Emily, Kristen, and Mike, for working together on this. My first day back at work is 7/29. What is the timeline for getting these changes into operations? Does this implementation only change fix files or are there other changes, too? Have we reached out to NCO?

The Hera parallel is C384/C192, right? Is lo-res testing sufficient before we pass these changes to NCO? We may be able to squeeze in a short-duration, operational-resolution parallel on the production Dell. We don't want any surprises when these changes are implemented in operations.

KristenBathmann commented 2 years ago

Correlated error seems to be configured correctly for the 2020083006 gdas analysis

emilyhcliu commented 2 years ago

@KristenBathmann-NOAA The drill run is running the first GDAS analysis.
Here is the gdas log file for you to check: /scratch1/NCEPDEV/stmp4/Emily.Liu/ROTDIRS/v16x_sept_cris/logs/2020083006/gdasanal.log

emilyhcliu commented 2 years ago

Hello, @RussTreadon-NOAA. First, about the timeline: NESDIS declared the CrIS NPP data (switching to side-1) provisional yesterday and began to release data today. The data is provisional pending user feedback, so we do have some time to work on this.

The single-cycle test done by Kristen was to find out whether there is any issue in using the existing correlated observation error with the updated satinfo file (turning on LW channels, setting the missing MW channels to -2, and setting SW channels to -1) and the revised CrIS NPP data (LW+SW only). The test indicated that the MW channels should be removed from the Rcov file. The OMF and OMA are comparable with the single-cycle control run.

The drill parallel experiment (low-resolution) is to find out whether the updated satinfo and Rcov for CrIS NPP work properly in the cycling run. We have the control experiment (v16x_sept_ctl) as a reference for verification. We will keep this drill experiment running for a few days, and then do a statistics check against the control experiment.

A few weeks ago, Kristen made an assessment of the impact of model/analysis resolution on the Rcov estimation. The conclusion is that the impact is small, and we could use the Rcov estimation from a high-resolution run in the low-resolution experiment and vice versa.

We need to think about the abias and pc files for turning on the revised CrIS NPP data. Please see the Notes I wrote in the description section of this issue. It would be great if we could do a high-resolution run for the changes we will hand over to NCO:
(1) updated satinfo for CrIS NPP
(2) updated Rcov for CrIS NPP
(3) abias and pc files

The single-cycle run and the drill parallel run are to make sure that items (1) and (2) work without problems.

emilyhcliu commented 2 years ago

Yeh! Thanks for checking, @KristenBathmann-NOAA. I will let you know when we have the first GFS analysis (2020083100).

I will add your revised Rcov_crisnpp to the ProdGSI branch rev2_crisnpp. So, we will have two pieces of required changes in one branch.

RussTreadon-NOAA commented 2 years ago

Got it. Thank you, Emily.

KristenBathmann commented 2 years ago

The gfsanal also appears to be set up correctly.

emilyhcliu commented 2 years ago

@KristenBathmann-NOAA Thank you for checking. I will let the drill run continue for a few cycles more and then I will do the gsistat plots for a sanity check.

emilyhcliu commented 2 years ago

These are the GSI statistics plots based on gsistat output. There is a degradation in the RMSE for winds. The RMSE gap should become smaller as we cycle the run longer. We will continue the parallel experiment.

[Plots: gsistat_uvtq_Bias, gsistat_uvtq_RMSE, gsistat_uvtq_Count]

emilyhcliu commented 2 years ago

The LW+SW data flow is on now.
To do list:

HaixiaLiu-NOAA commented 2 years ago

I generated the RadMon plots for the period from 20210510 to 20210731, covering the time before and after the CrIS_NPP side switches. This longer time series shows different OmFnbc stats. Here are example plots for a surface channel, 194.

OmFnbc bias plot: image

OmFnbc standard deviation: image

HaixiaLiu-NOAA commented 2 years ago

Here is the link to the RadMon plots https://www.emc.ncep.noaa.gov/gmb/wx20hl/radmon.opr.CrIS_NPP/

emilyhcliu commented 2 years ago

@KristenBathmann-NOAA @HaixiaLiu-NOAA @RussTreadon-NOAA I know that we will combine the IASI-C into the CrIS NPP work.
During our Friday meeting, the decisions and open questions for IASI-C were the following:
(1) Use the IASI-B correlated obs error estimation for IASI-C
(2) Do we want to keep IASI-C in monitoring mode, or will we turn it on?

@Kristen what is the status of your parallel experiment for replacing IASI-AB with IASI-BC?

HaixiaLiu-NOAA commented 2 years ago

@emilyhcliu @KristenBathmann-NOAA

I plan to turn IASI-C on in the parallel. My understanding is this high-reso parallel is going to be used to estimate the Rcov for IASI-C.

I have 2 questions about setting up the parallel.

1, I have zeroed out the bias correction coeff for CrIS_NPP. Do we want to zero out the bias correction coeff for IASI-C as well? I plan not to zero out those coeff for IASI-C.
2, Do I need to zero out abias_int?

KristenBathmann commented 2 years ago

My low resolution experiment turns on IASI-C and turns off IASI-A. IASI-B and both CrIS use updated covariance matrices computed from recent operations. IASI-C uses the new IASI-B matrices. The experiment is still in the beginning of September, so it has a while to go.

emilyhcliu commented 2 years ago

@HaixiaLiu-NOAA @KristenBathmann-NOAA
(1) If we are confident about the quality of the IASI-C bias coefficients in the operational run, we can just use them. @HaixiaLiu-NOAA, could you compare the number of observations passing QC between IASI-C and IASI-A in our operational run (check the gsistat)? The numbers of obs passing QC should be similar. If they are, then the current IASI-C bias coefficients should be OK.

(2) The abias_int will be updated automatically if the bias coefficients were initialized or went through the initialization routine. So, there is no need to update abias_int.

@HaixiaLiu-NOAA Could you point me to the experiment and rotating/running directories when you are ready to run? I will also check the output.

emilyhcliu commented 2 years ago

@HaixiaLiu-NOAA You can find the updated satinfo and Rcov for cris npp in my home directory on HERA: /home/Emily.Liu/4Haixia

The IASI-C is still passive. Need to change the use flag for IASI-C.

HaixiaLiu-NOAA commented 2 years ago

I have a single-cycle test script on WCOSS: /gpfs/dell2/emc/modeling/save/Haixia.Liu/stand_alone/4IasiCris/rungsi.sh (based on Emily's rungsi.sh). Modifications I made include new files satinfo, anavinfo, abias, abias_pc and Rcov_crisnpp.

Here is the run directory: /gpfs/dell2/ptmp/Haixia.Liu/tmp766/4iasic.highres.2021080206

I realized that I forgot to remove the diag_cris-fsr_npp file from the radstat file, so I have submitted another single-cycle test to correct this.

HaixiaLiu-NOAA commented 2 years ago

@emilyhcliu Here are the data counts from gsistat:

o-g 01 rad metop-a iasi 262447416 6212744 0 0.0000 0.0000 0.0000 0.0000
o-g 01 rad metop-b iasi 323473920 6283344 960926 0.36117E+06 0.36117E+06 0.37586 0.37586
o-g 01 rad metop-c iasi 265410376 6281950 966844 0.35007E+06 0.35007E+06 0.36207 0.36207

IASI-B and IASI-C have comparable data counts and penalties
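The comparison above can be reproduced with a few lines of parsing. This is a rough sketch, assuming the numeric columns of an `o-g ... rad` line are, in order, the read, kept, and assimilated counts followed by penalty, qcpnlty, cpen, and qccpen; those column names are my reading of the gsistat summary, and `RadStat` and `parse_rad_line` are hypothetical helpers, not part of GSI.

```python
from collections import namedtuple

# Assumed column layout of an "o-g NN rad <sat> <inst> ..." summary line.
RadStat = namedtuple(
    "RadStat", "sat inst nread nkeep nassim penalty qcpnlty cpen qccpen"
)


def parse_rad_line(line):
    """Split one radiance summary line into named fields."""
    fields = line.split()
    if len(fields) != 12 or fields[0] != "o-g" or fields[2] != "rad":
        raise ValueError(f"not an o-g rad summary line: {line!r}")
    counts = [int(x) for x in fields[5:8]]
    values = [float(x) for x in fields[8:12]]
    return RadStat(fields[3], fields[4], *counts, *values)


# The three lines quoted in the comment above.
lines = [
    "o-g 01 rad metop-a iasi 262447416 6212744 0 0.0000 0.0000 0.0000 0.0000",
    "o-g 01 rad metop-b iasi 323473920 6283344 960926 0.36117E+06 0.36117E+06 0.37586 0.37586",
    "o-g 01 rad metop-c iasi 265410376 6281950 966844 0.35007E+06 0.35007E+06 0.36207 0.36207",
]
stats = {s.sat: s for s in map(parse_rad_line, lines)}

b, c = stats["metop-b"], stats["metop-c"]
rel_diff_assim = (c.nassim - b.nassim) / b.nassim  # IASI-C relative to IASI-B
print(f"IASI-C vs IASI-B assimilated count: {rel_diff_assim:+.2%}")
print(f"IASI-B penalty per assimilated obs: {b.penalty / b.nassim:.5f}")
```

With these numbers, the penalty divided by the assimilated count matches the reported cpen column to within rounding, which is at least consistent with the assumed column layout.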

HaixiaLiu-NOAA commented 2 years ago

Here is the data count from the control analysis which has IASI-A on and IASI-C monitored.

o-g 01 rad metop-a iasi 262447416 6212744 984052 0.39023E+06 0.39023E+06 0.39655 0.39655

It is comparable to IASI-C as well.

I just asked Ed Safford to help add IASI-metop-c into the radiance monitoring instrument list. After he adds that, I can take a look at the IASI-C summary plot and compare with other IASI summary plots to see if the bias correction performs well.

KristenBathmann commented 2 years ago

Correlated error appears to be configured correctly in this run.

HaixiaLiu-NOAA commented 2 years ago

Thank you @KristenBathmann-NOAA for checking and confirming.

HaixiaLiu-NOAA commented 2 years ago

Ed helped add the iasi_metop-c into the radiance monitoring instrument list. Here is the summary plot for iasi-c

image

HaixiaLiu-NOAA commented 2 years ago

here is the same plot for iasi-b

image

HaixiaLiu-NOAA commented 2 years ago

OmF after BC for IASI-C:

image

IASI-B:

image

emilyhcliu commented 2 years ago

Looks like IASI-C bias correction is behaving like IASI-B.

HaixiaLiu-NOAA commented 2 years ago

Update on CrIS_NPP data status from ESPC:

The Environmental Satellite Processing Center (ESPC) will resume distributing CrIS products through the Product Distribution and Access (PDA) at 2130 UTC August 9, 2021. Direct Broadcast (DB) users can use S-NPP ATMS, VIIRS, and CrIS data received through the High Rate Data (HRD) downlink for operations.

RussTreadon-NOAA commented 2 years ago

global-workflow

release/gfs.v16.1.3 has been renamed release/gfs.v16.1.2.1. The v16rad config.base was updated accordingly. Turning on CrIS NPP and assimilating IASI Metop-C in the operational GFS should not require workflow changes apart from an updated DA tag. This implementation will increment the operational gfs_ver. Given this, it will likely be necessary to pass NCO a new workflow tag containing the updated DA tag.

Adding @KateFriedman-NOAA for her awareness and to get her input regarding which workflow branch to use for this pre-implementation DA development.

KateFriedman-NOAA commented 2 years ago

@RussTreadon-NOAA The global-workflow release/gfsv16.1.2.1 branch is just for the TAC2BUFR obsproc package version updates in config.base and will be merged into the operations branch next week after the Monday TAC2BUFR implementation. Since it doesn't appear that NCO is incrementing the GFS version with the TAC2BUFR update, the version number I've given it is just for us (EMC) so we can note that the upstream component was updated. Sounds like this issue is a different upgrade and will increment the version in ops...therefore I can initiate a new workflow branch to update the DA tag within and prep for hand-off to NCO.

Are you able to confirm what the new gfs_ver will be for this update yet? I'll make the corresponding release branch. Please also open a new global-workflow issue with whatever details you have right now and feel free to assign me to it. I'll take it from there, thanks!

HaixiaLiu-NOAA commented 2 years ago

I got a parallel to run the analysis step successfully. The purpose of this parallel is a sanity check. It was warm started from Russ's v16rad experiment, and I let it run for only one cycle. With no modifications relative to v16rad, I got an identical gsistat, as expected. Then I made the following modifications to test IASI-C and CrIS-fsr_NPP:
1, updated Rcov for IASI-B, IASI-C, CrIS_N20 and CrIS_NPP from Kristen
2, new satinfo (iasi-c on and iasi-a off, cris-fsr_npp on for LW only)
3, new anavinfo (to use the correct Rcov for iasi-c)
4, zeroed out bias correction for cris-fsr_npp: abias and abias_pc files
5, removed diag_cris-fsr_npp_ges.cdate.nc from the radstat file to properly initialize bias correction for cris-fsr_npp

I reran the anal step and the results are in the following directories on WCOSS.
original anal: /gpfs/dell2/ptmp/Haixia.Liu/ROTDIRS/v16_iasicris/gdas.20210810/06.orig/atmos/gdas.t06z.gsistat
new anal: /gpfs/dell2/ptmp/Haixia.Liu/ROTDIRS/v16_iasicris/gdas.20210810/06/atmos/gdas.t06z.gsistat

@RussTreadon-NOAA @emilyhcliu @KristenBathmann-NOAA would you please take a look at these two gsistat files? I have one question for Kristen on these results. Fewer cris_n20 and iasi_metop-b data are assimilated, and the penalty is larger, in the new run compared to the old run. I think this is caused by the updated Rcov. Is this what you expect?

old:
o-g 01 rad metop-b iasi 294064848 6156412 948654 0.33703E+06 0.33703E+06 0.35527 0.35527
o-g 01 rad n20 cris-fsr 415096962 5376122 462235 0.16569E+06 0.16569E+06 0.35845 0.35845

new:
o-g 01 rad metop-b iasi 294064848 6156412 940661 0.36270E+06 0.36270E+06 0.38558 0.38558
o-g 01 rad n20 cris-fsr 415096962 5376122 461173 0.22097E+06 0.22097E+06 0.47914 0.47914
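To put numbers on "less data and larger penalty", the relative changes can be worked out directly from the lines above. This is a small worked calculation, assuming the third count column is the assimilated count and the next column is the penalty; `pct_change` is a hypothetical helper.

```python
def pct_change(old, new):
    """Percent change from old to new."""
    return 100.0 * (new - old) / old


# (assimilated count, penalty) copied from the old and new gsistat lines above.
old = {"metop-b iasi": (948654, 0.33703e6), "n20 cris-fsr": (462235, 0.16569e6)}
new = {"metop-b iasi": (940661, 0.36270e6), "n20 cris-fsr": (461173, 0.22097e6)}

changes = {
    name: (pct_change(old[name][0], new[name][0]),
           pct_change(old[name][1], new[name][1]))
    for name in old
}
for name, (d_count, d_penalty) in changes.items():
    print(f"{name}: assimilated {d_count:+.2f}%, penalty {d_penalty:+.2f}%")
```

Under that reading of the columns, the assimilated counts drop by less than one percent for both instruments, while the penalty increase is much larger for cris-fsr N20 than for IASI Metop-B.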

Thank you in advance for checking.

I am going to start a real parallel after tomorrow's production switch.

KristenBathmann commented 2 years ago

The changes in obs counts are expected, and reasonable. It is because the cloud detection depends on observation errors, which got updated with the covariances. However, correlated error did not get initialized in the new run for CrIS NPP so something is not right. I'll look into it.

HaixiaLiu-NOAA commented 2 years ago

@KristenBathmann-NOAA The CrIS NPP is not assimilated for this cycle. This was done on purpose because I took out the diag_cris-fsr_npp_ges.2021081000.nc file from the radstat. This is to properly initialize the CrIS NPP data with zeroed out bias correction coeff. If the diag for CrIS NPP existed, the CrIS NPP would be assimilated with zero bias correction which would be wrong.

RussTreadon-NOAA commented 2 years ago

We need to get changes for a commercial gpsro upgrade into operations in addition to the CrIS NPP & Metop-C IASI changes. The order of these various implementations isn't clear.

TAC2BUFR does not require a new DA tag. TAC2BUFR is an ObsProc implementation. ObsProc exists outside the GFS.

There is a disconnect between how the operational gfs and developer parallels specify the global ObsProc version to use. GFS jobs obtain version numbers from /gpfs/dell1/nco/ops/nwprod/versions/gfs.ver. This file does not specify ObsProc versions. Global ObsProc versions are specified in /gpfs/dell1/nco/ops/nwprod/versions/obsproc_global.ver. In contrast, developer parallels specify the global ObsProc versions in config.base (specifically, config.base.emc.dyn).

TAC2BUFR, an ObsProc implementation, requires EMC parallels to update ObsProc versions in config.base. That is, there is a change in the global workflow. NCO's implementation of the same package will update obsproc_global.ver. There is no change in the global (gfs) workflow.

HaixiaLiu-NOAA commented 2 years ago

@KristenBathmann-NOAA Is it true that the correlated error is not initialized if iuse=-1? That is the case for CrIS NPP now, since its iuse is reset to -1. I will let the parallel run for another cycle so that you can check the correlated error in the 2nd cycle.

HaixiaLiu-NOAA commented 2 years ago

FYI, a few updates on this.

1, RadMon shows the CrIS NPP data is stable after the data came back on 2021081000. image

image

2, Kristen helped me check the analysis log for the 2nd cycle on VENUS before the production switch and confirmed correlated error is configured correctly.

3, I am now preparing to start a real-time parallel test for IASI-C and CrIS NPP data on Mars. I am going to start 2 parallels. The 1st one is my control, with nothing changed, and the 2nd one is my experiment, with changes to several fix files. I will compare the control and experiment to confirm the results are as expected.

Please let me know if you have comments on this. I will update the status once the results come out.

HaixiaLiu-NOAA commented 2 years ago

I have started the real-time parallels (control and experiment) on Mars. The control only runs for 1 cycle and is used to compare with the experiment. Here are the key directories for the experiment.

EXPDIR:  /gpfs/dell2/emc/modeling/save/Haixia.Liu/para_v16/v16iasicris2
HOMEgfs:  /gpfs/dell2/emc/modeling/noscrub/emc.glopara/git/global-workflow/release_gfsv16.1.2.1
ROTDIR:  /gpfs/dell2/ptmp/Haixia.Liu/ROTDIRS/v16iasicris2
ARCDIR:  /gpfs/dell2/emc/modeling/noscrub/Haixia.Liu/archive/v16iasicris2
ATARDIR:  /NCEPDEV/emc-da/5year/Haixia.Liu/WCOSS_D/gfsv161/v16iasicris2

The control has the name v16_iasicris (sorry for the confusing names).

I compared the 2 following gsistat files: /gpfs/dell2/ptmp/Haixia.Liu/ROTDIRS/v16iasicris2/gdas.20210812/12/atmos/gdas.t12z.gsistat and /gpfs/dell2/ptmp/Haixia.Liu/ROTDIRS/v16_iasicris/gdas.20210812/12/atmos/gdas.t12z.gsistat

The differences between the 2 gsistat files are as expected.

The parallel with changes for IASI-C and CrIS will be kept running on Mars.

Please take a look at the results and let me know if you find anything strange. Thank you.

RussTreadon-NOAA commented 2 years ago

RFC 8521

ObsProc upgrade (RFC 8521) was to be implemented 2021081012. This was postponed and rescheduled to 2021081612 due to Critical Weather Day. A new CWD was declared for the period 2300Z Thu Aug 12 2021 to 1200Z Tue Aug 17 2021. This will likely further postpone implementation of RFC 8521.

v16iasicris2 $EXPDIR/config.base contains logic to switch to RFC 8554 ObsProc starting 2021081612.

if [[ "$CDATE" -ge "2021081612" ]]; then
    export HOMEobsproc_prep="$BASE_GIT/obsproc/obsproc_prep_RB-5.5.0"
    export HOMEobsproc_network="$BASE_GIT/obsproc/obsproc_global_RB-3.4.2"
fi

2021081612 needs to be updated once the new implementation date for RFC 8521 is known. Care should be taken not to use the RFC 8521 ObsProc package before it is implemented in operations. This ObsProc package changes global DA results.

v16iasicris2 must switch to the RFC 8521 ObsProc package in the same cycle that operations exercises the RFC 8521 ObsProc package.
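To make the switch-over rule concrete, here is a minimal sketch that lists which 6-hourly cycles would pick up the new ObsProc package for a given switch date. It mirrors the `CDATE >= switch` test in config.base but is not part of the workflow; `uses_new_obsproc` and `cycles` are hypothetical helpers, and the switch date below is only the placeholder value discussed in this comment.

```python
from datetime import datetime, timedelta

CYCLE_FMT = "%Y%m%d%H"  # CDATE layout used above: YYYYMMDDHH


def uses_new_obsproc(cdate, switch_cdate):
    """True if the cycle is at or after the switch cycle."""
    return datetime.strptime(cdate, CYCLE_FMT) >= datetime.strptime(switch_cdate, CYCLE_FMT)


def cycles(first, last, step_hours=6):
    """Yield CDATE strings from first to last, inclusive."""
    t = datetime.strptime(first, CYCLE_FMT)
    end = datetime.strptime(last, CYCLE_FMT)
    while t <= end:
        yield t.strftime(CYCLE_FMT)
        t += timedelta(hours=step_hours)


switch = "2021081612"  # placeholder until the RFC 8521 date is known
for cdate in cycles("2021081600", "2021081700"):
    label = "new ObsProc" if uses_new_obsproc(cdate, switch) else "current ObsProc"
    print(cdate, label)
```

Parsing the strings as dates also rejects malformed values such as an hour of 25, which a plain integer comparison would let through.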

HaixiaLiu-NOAA commented 2 years ago

@RussTreadon-NOAA Thank you for checking. I will stop the parallel at the 2021081606 cycle and wait for the decision on the RFC 8521 implementation date.

HaixiaLiu-NOAA commented 2 years ago

I encountered a segmentation fault at the gdasprep step at 2021081306 in my parallel experiment v16iasicris2 on Mars. I warm started this parallel from the 2021081212 gdasprep step. The first 3 cycles (the 12z and 18z cycles on 8/12 and the 00z cycle on 8/13) ran to completion, except for metplus at 2021081300. Then the gdasprep step had a segmentation fault at the 06z 8/13 cycle. I checked the log file /gpfs/dell2/ptmp/Haixia.Liu/ROTDIRS/v16iasicris2/logs/2021081306/gdasprep.log and found the following error, which may be related to the segmentation fault: "The foreground exit status for PREPOBS_PREPACQC is  174"

I compared this log file with the 2021081218 gdasprep log file but could not figure out why I got the err=174 for this 06z cycle but not the 18z cycle.

@RussTreadon-NOAA do you have an idea of what happened at the 2021081306 gdasprep step (segmentation fault or err=174 in the log file)? Thank you

RussTreadon-NOAA commented 2 years ago

Numerous operational prep jobs failed 2021081306. This was traced to an increase in the number of aircraft observations, most likely AMDAR from China. See item 1 of the 8/13/2021 SDM log for details.

A check of operational GFS atmos_prep log files shows a version change for obsproc_prep from v5.4.0 to v5.4.0a. Comparison of $NWROOTp3/obsproc_prep.v5.4.0 and $NWROOTp3/obsproc_prep.v5.4.0a shows a change in compiler flags in sorc/prepobs_prepacqc.fd/makefile.

v5.4.0 has

FFLAGS = -O2 -convert big_endian -list -assume noold_ldout_format $(DEBUG) $(DEBUG2)

whereas v5.4.0a has

FFLAGS = -O0 -heap-arrays -convert big_endian -list -assume noold_ldout_format $(DEBUG) $(DEBUG2)

As a test, I ran 2021081306 gdasprep in v16rad using

HOMEobsproc_prep="$BASE_GIT/obsproc/gfsv16b/obsproc_prep_RB-5.4.0"

prepobs_prepacqc seg faulted as it did in operations.

Do the following:

  1. copy $BASE_GIT/obsproc/gfsv16b/obsproc_prep_RB-5.4.0 to $BASE_GIT/obsproc/gfsv16b/obsproc_prep_RB-5.4.0a
  2. modify prepobs_prepacqc.fd/makefile as done in $NWROOTp3/obsproc_prep.v5.4.0a
  3. recompile prepobs_prepacqc
  4. set export HOMEobsproc_prep="$BASE_GIT/obsproc/gfsv16b/obsproc_prep_RB-5.4.0a" in v16rad config.base

Rerun 2021081306 gdasprep. Job ran to completion.

It is not clear from the SDM log if the above compiler change is the final solution or a patch.

@ShelleyMelchior-NOAA is added to the issue for guidance. Shelley, three questions:

@KateFriedman-NOAA added to keep her in the loop since other developers may encounter the same problem. The final solution, update HOMEobsproc_prep, will be implemented in the global workflow, not DA.