NOAA-EMC / JEDI-T2O

JEDI Transition to Operations activities.
GNU Lesser General Public License v2.1
5 stars 4 forks source link

Sonde UFO validation #91

Open ADCollard opened 1 year ago

ADCollard commented 1 year ago

This issue is splitting from #60 so the sonde and aircraft validation does not get confused.

The starting point for this is the GMAO YAML, but there are many issues with these it appears.

The test YAMLs used here are in this branch.

This issue concerns temperature, humidity and wind only as surface pressure is being addressed here.

ADCollard commented 1 year ago

Differences for sondes

HofX is very close. ~1.e-9 for spec humidity and 1.e-5K for temperature. Some residual error for winds near the surface. Some weird plots for errors. Need to dig further.

hofxdiff_vs_pressure_diag_sondes_2021080100_Radiosondes_windNorthward hofxdiff_vs_pressure_diag_sondes_2021080100_Radiosondes_windEastward hofxdiff_vs_pressure_diag_sondes_2021080100_Radiosondes_specificHumidity errordiff_vs_pressure_diag_sondes_2021080100_Radiosondes_windNorthward errordiff_vs_pressure_diag_sondes_2021080100_Radiosondes_windEastward errordiff_vs_pressure_diag_sondes_2021080100_Radiosondes_specificHumidity errordiff_vs_pressure_diag_sondes_2021080100_Radiosondes_airTemperature

ADCollard commented 1 year ago

Interestingly the discrepancies in the wind hofx do not show up in the histogram plots (too few points?)

hofxdiff_histogram_diag_sondes_2021080100_Radiosondes_windEastward hofxdiff_histogram_diag_sondes_2021080100_Radiosondes_windNorthward hofxdiff_vs_pressure_diag_sondes_2021080100_Radiosondes_windEastward hofxdiff_vs_pressure_diag_sondes_2021080100_Radiosondes_windNorthward

ADCollard commented 1 year ago

Map of hofx differences hofxdiff_map_diag_sondes_2021080100_Radiosondes_windEastward hofxdiff_map_diag_sondes_2021080100_Radiosondes_windNorthward

ADCollard commented 1 year ago

Regenerated the input data set from GSI. The agreement is a lot better now for winds but there are still some small discrepancies: hofxdiff_vs_pressure_diag_sondes_2021080100_Radiosondes_windEastward hofxdiff_vs_pressure_diag_sondes_2021080100_Radiosondes_windNorthward

ADCollard commented 1 year ago

Observation error differences remain (not surprisingly).

errordiff_vs_pressure_diag_sondes_2021080100_Radiosondes_airTemperature errordiff_vs_pressure_diag_sondes_2021080100_Radiosondes_specificHumidity errordiff_vs_pressure_diag_sondes_2021080100_Radiosondes_windEastward errordiff_vs_pressure_diag_sondes_2021080100_Radiosondes_windNorthward

ADCollard commented 1 year ago

Start by focussing on the first couple of tests as a sanity check.

A good start is to understand exactly why this section of the sondes yaml is needed:

## a tempporary solution: Replace error by GsiAdjustObsError
## overwite above error assignments and inflation.
- filter: Perform Action
  filter variables:
  - name: airTemperature
  action:
    name: assign error
    error function: GsiAdjustObsError/airTemperature
  where:
  - variable:
      name: GsiAdjustObsError/airTemperature
    is_defined:
  defer to post: true
PraveenKumar-NOAA commented 1 year ago

@ADCollard while running tests using the new GMAO/YAML, I am getting following errors:

All variables / channels are bias-corrected. Error: Variable saturation_specific_humidity not found in sondes_geoval_2021080100.nc4 OOPS Ending 2023-11-02 14:02:29 (UTC-0500) *** Running UFO failed ****

Can you point me to your data files?

ADCollard commented 1 year ago

@PraveenKumar-NOAA Sorry just saw this. I am using /work2/noaa/da/acollard/UFO_eval/data/gsi_geovals_l127/nofgat_aug2021/20231030

ADCollard commented 1 year ago

FYI, the relevant issue from GMAO on the sonde.yaml is: https://github.com/GEOS-ESM/swell/issues/78

ADCollard commented 1 year ago

Looking at the ErrorAdjust level (i.e., after read_prepbufr in GSI), there is a discrepancy with the winds

Screenshot 2023-11-03 at 6 06 25 PM

This appears to be because the ObsErrorFactorConventional obsfunction is not being applied to winds (or anything other than temperature) in this yaml.

`- filter: Perform Action
  filter variables:
  - name: airTemperature
  action:
    name: inflate error
    inflation variable:
      name: ObsFunction/ObsErrorFactorConventional
      options:
        test QCflag: PreQC
        test QCthreshold: 3
        inflate variables: [airTemperature]
        pressure: MetaData/pressure
        distance threshold: -1.
  defer to post: true

Trying to work how if it can work with winds.

ADCollard commented 1 year ago

Adding the following to the YAML improves things:

# Adjusted Error after Initial Assignment
- filter: Perform Action
  filter variables:
  - name: windEastward
  action:
    name: inflate error
    inflation variable:
      name: ObsFunction/ObsErrorFactorConventional
      options:
        test QCflag: PreQC
        test QCthreshold: 3
        inflate variables: [windEastward]
        pressure: MetaData/pressure
        distance threshold: -1.
  defer to post: true

- filter: Perform Action
  filter variables:
  - name: windNorthward
  action:
    name: inflate error
    inflation variable:
      name: ObsFunction/ObsErrorFactorConventional
      options:
        test QCflag: PreQC
        test QCthreshold: 3
        inflate variables: [windNorthward]
        pressure: MetaData/pressure
        distance threshold: -1.
  defer to post: true
Screenshot 2023-11-03 at 6 28 28 PM
ADCollard commented 1 year ago

But differences are still significant:

Screenshot 2023-11-03 at 6 36 41 PM
ADCollard commented 1 year ago

To try to understand this a little better let's compare the errorfactor derived in ObsFunction/ObsErrorFactorConventional with the equivalent in GSI. There are substantial differences:

Screenshot 2023-11-08 at 1 19 11 PM Screenshot 2023-11-08 at 1 15 00 PM

We need to dig deeper to find out why this is.

ADCollard commented 1 year ago

The 'vmag' calculation definitely appears to differ.

    vmag=min(max(half*dpres(ilev),r0_02*presl(ilev)),con*plevs(k))
Screenshot 2023-11-08 at 1 59 47 PM
ADCollard commented 1 year ago

So vmag appears to agree between GSI and JEDI up to a value of ~0.2. Above that the JEDI value is up to three times larger.

ADCollard commented 1 year ago

Digging in a little closer. I have chosen a point where vmag is 0.75 for JEDI and 1.54 for GSI.

This is 35.27285N, 248.181152E, 37070Pa. The sonde ID is 72376.

The observed pressures for this sonde are:

In [90]: p_sonde
Out[90]: 
array([  880.    ,   900.    ,  1000.    ,  1000.    ,  1050.    ,
        1170.    ,  1310.    ,  1330.    ,  1350.    ,  1470.    ,
        1540.    ,  1850.    ,  1890.    ,  1940.    ,  2000.    ,
        2000.    ,  2029.9999,  2220.    ,  2340.    ,  2520.    ,
        2680.    ,  2890.    ,  2950.    ,  3000.    ,  3000.    ,
        3240.0002,  3400.    ,  3730.    ,  3770.    ,  3909.9998,
        3950.    ,  4100.    ,  4310.    ,  4520.    ,  4540.    ,
        4670.    ,  4740.    ,  5000.    ,  5000.    ,  5230.    ,
        5390.    ,  5490.    ,  5760.    ,  6000.    ,  6050.    ,
        6680.0005,  7000.    ,  7000.    ,  7019.9995,  7100.    ,
        7300.    ,  7380.0005,  7760.    ,  7890.    ,  8160.    ,
        8290.    , 46200.    , 45700.    , 44700.    , 47600.    ,
       44200.    , 43520.    , 43500.    , 48600.    , 48960.    ,
       42800.    , 49900.    , 50000.    , 50000.    , 41840.    ,
        9020.    , 50900.    ,  9260.    , 40210.    , 40000.    ,
       40000.    ,  9640.    , 10000.    , 10000.    ,  9980.    ,
       39300.    , 52930.    , 53200.    , 10600.    , 38300.    ,
       79000.    , 79000.    , 78600.    , 10700.    , 53900.    ,
       76660.    , 37070.    , 75000.    , 54600.    , 36700.    ,
       54990.004 , 73980.    , 35900.    , 11050.    , 56400.    ,
       35400.    , 70000.    , 70000.    , 57130.    , 68700.    ,
       11300.    , 66350.    , 11600.    , 11620.    , 19300.    ,
       61200.    , 18700.    , 61609.996 , 19800.    , 20000.    ,
       20000.    , 33600.    , 18160.    , 18000.    , 12200.    ,
       20950.    , 17300.    , 32800.    , 32710.    , 21900.    ,
       21960.    , 22200.    , 22500.    , 22800.    , 12860.001 ,
       22990.    , 12900.    , 31800.    , 16100.    , 31600.    ,
       31350.    , 15689.999 , 31200.    , 24600.    , 25000.    ,
       25000.    , 25300.    , 15000.    , 15000.    , 14939.999 ,
       25700.    , 14220.    , 14500.    , 26100.    , 30200.    ,
       30000.    , 30000.    , 29600.    , 27300.    , 27520.002 ,
       28000.    ], dtype=float32)

The geoval pressure levels are:

<xarray.DataArray 'air_pressure_levels' (nlocs: 1, ninterfaces: 128)>
array([[9.990000e-01, 1.605000e+00, 2.532000e+00, 3.924000e+00, 5.976000e+00,
        8.947000e+00, 1.317700e+01, 1.909600e+01, 2.724300e+01, 3.827600e+01,
        5.298400e+01, 7.229300e+01, 9.726900e+01, 1.291100e+02, 1.691350e+02,
        2.187670e+02, 2.795060e+02, 3.528940e+02, 4.404810e+02, 5.437820e+02,
        6.642360e+02, 8.031640e+02, 9.617340e+02, 1.140931e+03, 1.341538e+03,
        1.564119e+03, 1.809028e+03, 2.076415e+03, 2.366252e+03, 2.678372e+03,
        3.012510e+03, 3.368363e+03, 3.745646e+03, 4.144164e+03, 4.563881e+03,
        5.004995e+03, 5.468017e+03, 5.953848e+03, 6.463864e+03, 7.000000e+03,
        7.564284e+03, 8.156979e+03, 8.777850e+03, 9.426650e+03, 1.010313e+04,
        1.080702e+04, 1.153803e+04, 1.229586e+04, 1.308015e+04, 1.389054e+04,
        1.472659e+04, 1.558783e+04, 1.647375e+04, 1.738373e+04, 1.831713e+04,
        1.927319e+04, 2.025108e+04, 2.124989e+04, 2.226859e+04, 2.330608e+04,
        2.436125e+04, 2.543350e+04, 2.652221e+04, 2.762667e+04, 2.874602e+04,
        2.987929e+04, 3.102541e+04, 3.218315e+04, 3.335123e+04, 3.452821e+04,
        3.571259e+04, 3.690276e+04, 3.809706e+04, 3.929373e+04, 4.049100e+04,
        4.168702e+04, 4.287996e+04, 4.406793e+04, 4.524909e+04, 4.642161e+04,
        4.758369e+04, 4.873357e+04, 4.986957e+04, 5.099007e+04, 5.209356e+04,
        5.317859e+04, 5.424385e+04, 5.528810e+04, 5.631025e+04, 5.730931e+04,
        5.828441e+04, 5.923481e+04, 6.015988e+04, 6.105911e+04, 6.193211e+04,
        6.277858e+04, 6.359836e+04, 6.439136e+04, 6.515761e+04, 6.589719e+04,
        6.661028e+04, 6.729715e+04, 6.795809e+04, 6.859351e+04, 6.920380e+04,
        6.978947e+04, 7.035099e+04, 7.088892e+04, 7.140383e+04, 7.189629e+04,
        7.236691e+04, 7.281630e+04, 7.324509e+04, 7.365388e+04, 7.404330e+04,
        7.441397e+04, 7.476649e+04, 7.510148e+04, 7.541951e+04, 7.572116e+04,
        7.600699e+04, 7.627755e+04, 7.653341e+04, 7.677520e+04, 7.700355e+04,
        7.721912e+04, 7.742250e+04, 7.761435e+04]], dtype=float32)
ADCollard commented 1 year ago

So with the geoval order fix the agreement is much closer.

The outliers are an artifact of the GSI output - they are single-level obs and all have an error_factor of 1.0 both for GSI and

Screenshot 2023-11-16 at 5 32 22 PM

JEDI.

ADCollard commented 1 year ago

The error_factor that comes out of errmod is still different for some obs

Screenshot 2023-11-16 at 5 46 01 PM

This is for sondes that are duplicated between input sources. GSI treats them separately, but JEDI lumps them all together. It is not clear which is right.

ADCollard commented 12 months ago

The code changes to make ObsErrorFactorConventional behave correctly for both orders of Geovals has been included in UFO PR 3126.

There is apparently a way to reverse the geoval order in YAML, but this (I think) is better as filter was giving wrong results with no clue how to fix it.

ADCollard commented 8 months ago

So an update on the error_factor discrepancy. This is explained by JEDI considering profiles on a BUFR message by BUFR message basis while JEDI considers all observations by the same sonde to be part of the same profile. A number of sonde ascents appear to be reported twice (confusingly often with inconsistent launch times) in the prepbufr file.

The UFO approach is probably the correct one here, but neither is ideal.

ADCollard commented 8 months ago

There appears to be a discrepancy between ObsErrorFactorPressureCheck.cc and the equivalent code in the GSI (e.g., setupw.f90.

In the UFO the pertinent line is:

obserr[iv][iloc] = (currentObserr[iloc]+drpx+1.e6*rhgh+infl_coeff*rlow)
                                /currentObserr[iloc];

in GSI it's

     ratio_errors=error/(data(ier,i)+drpx+1.0e6_r_kind*rhgh+four*rlow)

Note that the UFO has the same error (currentObserr[iloc]) in the numerator and denominator (so if rhgh and rlow are zero, ratio_errors will be 1.0). In GSI data(ier,i) is the original obs error while error is that error after being scaled by errmod. It is not clear which is actually right (I would say the UFO makes most sense) but it is another place where we will not get agreement.

Tagging @BrettHoover-NOAA and @nicholasesposito as this will be relevent to aircraft too.

ADCollard commented 8 months ago

Maybe the best solution here is to re-run the GSI using the UFO methodology so we can get consistency without having to change the UFO codebase?

ADCollard commented 8 months ago

In the sonde YAML from GMAO, they supposedly input the GSI adjusted ob error into the ObsErrorFactorPressureCheck, but on inspection it appears the wrong option is being used. It should be test_obserr not adjusted_error_name (test_obserr maps onto currentObserr[iloc]):

- filter: Perform Action
  filter variables:
  - name: windNorthward
  where:
    - variable:
        name: ObsType/windNorthward
      is_in: 220,221
  action:
    name: inflate error
    inflation variable:
      name: ObsFunction/ObsErrorFactorPressureCheck
      options:
        variable: windNorthward
        inflation factor: 4.0
        #adjusted_error_name: GsiAdjustObsError   # ADC
        test_obserr: GsiAdjustObsError                      # ADC
        geovar_sfc_geomz: surface_geometric_height