NOAA-EMC / JEDI-T2O

JEDI Transition to Operations activities.
GNU Lesser General Public License v2.1
5 stars 4 forks source link

Surface Pressure Obs Validation #36

Open CoryMartin-NOAA opened 1 year ago

CoryMartin-NOAA commented 1 year ago

Need to validate surface pressure observations from radiosondes, sfc, and sfcships (others?)

This issue will track progress on validation of surface pressure observations in UFO for GDAS in four stages:

Note this will need to include any pressure correction applied to the observation/hofx value.

CoryMartin-NOAA commented 1 year ago

GeoVaLs from GSI will need to be produced. Starting with a fresh version of develop, some changes to write out GeoVaLs have been made in this branch: https://github.com/CoryMartin-NOAA/GSI/tree/ufo_geovals

CoryMartin-NOAA commented 1 year ago

Tagging @frolovsa for awareness as this will be one of the first ob types we will want to attempt to include in prototype cycling.

CoryMartin-NOAA commented 1 year ago
Screen Shot 2022-08-08 at 5 06 14 PM

First look at one cycle of Ps H(x). The points in the upper left where UFO computes something closer to 1000 hPa and GSI says a much lower value were investigated. These points are where the station elevation/height is reported to be 9999. UFO interprets that as missing properly, GSI does not, so it applies a pressure correction of 10 km, making these obs ~300 hPa instead of 1000 hPa. This is wrong. So these points will need filtered before further evaluation can happen.

CoryMartin-NOAA commented 1 year ago

Using the identity operator and looking at the uncorrected GSI values we get this:

Screen Shot 2022-08-09 at 4 18 14 PM

I believe this to be ~ machine precision, it fits into three lines, and the largest are +- 0.008 Pa difference, which is good enough for me. This suggests that there is in fact something wrong with the GSI Ps correction scheme.

CoryMartin-NOAA commented 1 year ago

The combine GSI ncdiag script was not properly combining surface T and Ps obs. This is/was because T is at 2m and Ps was assumed to be at the surface. Once fixing that, here is the result of UFO vs GSI after making some changes to the GSI surface P correction scheme (will be added to a future UFO PR)

Screen Shot 2022-08-11 at 2 30 06 PM

The differences are within 1 hPa, which, while decent, should be better when using the same GeoVaLs.

Investigation of the GSI code found that in setupps, "geometric" height is used for the surface, and not "geopotential" height, but the output was called "geopotential height". The SfcPCorrected operator will convert to geometric from geopotential if need be, but this caused differences then in the height of the model surfaces. Simply renaming the geoval file variable to "surface_geometric_height" and changing the input YAML to geovar_sfc_geomz: surface_geometric_height we now get this:

Screen Shot 2022-08-11 at 2 30 40 PM

All but a handful of points are within 1 Pa, and the largest is ~30 Pa (just a few points). I think this can be considered validated once the converter is modified to write surface_geometric_height and the SfcPCorrected operator is updated for the GSI scheme. (Note I will double check 'develop' with the updated files to ensure changes are warranted after this investigation)

frolovsa commented 1 year ago

Thank you @CoryMartin-NOAA for working on this. I am very encouraged by this results (even if i don't follow all the details). As the final check. Can we also plot jedi-omb vs gsi-omb. i find that omb/omb plots provide more resolution with measurements that otherwise might have a very large dynamic range (like surface pressure).

CoryMartin-NOAA commented 1 year ago

Here you go @frolovsa

Screen Shot 2022-08-11 at 3 41 37 PM

This is with GSI "GeoVaLs", so the fact that it is basically perfect, is to be expected. But this confirms the Ps correction is working as intended.

frolovsa commented 1 year ago

I known. totally looks perfect. the best one i seen so far :) Do you plan to produce one with JEDI interpolation? I am curious to see at least one plot where we can see how much omb changes if introduce JEDI interpolation/native grid to the mix. I guess that will be difficult to do until jedi interpolation is fixed? Sorry for the segway. What is the status of the JEDI interpolation at the moment?

CoryMartin-NOAA commented 1 year ago

@frolovsa yes that is the plan, but it is on hold until Francois can update/fix the JEDI interpolation. In the meantime, we can verify the QC/operartors are working as intended.

frolovsa commented 1 year ago

can we still plot it with the existing (bad) interpolation. I wonder how much does the existing problems with interpolation will affect ombs.

CoryMartin-NOAA commented 1 year ago

Yeah I can probably do that tomorrow. I'll make a mental note.

CoryMartin-NOAA commented 1 year ago

Here is a comparison of results from UFO develop vs a fix to the GSI correction method: Develop

Screen Shot 2022-08-11 at 5 12 44 PM

Fix (not yet in branch)

Screen Shot 2022-08-11 at 2 30 40 PM

Even with the few outliers, the largest difference in the fix is 30 Pa and almost all are within 1 Pa, whereas the develop branch has a lot of spread within 100 Pa. I'm not 100% sure that the GSI method is the 'best', but I think we need to reproduce it faithfully. If @ADCollard and @emilyhcliu agree, I will issue a UFO PR to make some changes to Fan's code to reproduce GSI results.

CoryMartin-NOAA commented 1 year ago

@frolovsa here are the figures you requested. I think any difference in Ps beyond 0.5 hPa should be fairly rare.

Screen Shot 2022-08-12 at 1 40 24 PM Screen Shot 2022-08-12 at 1 41 20 PM
frolovsa commented 1 year ago

@CoryMartin-NOAA thank you. A couple of clarifying questions.

CoryMartin-NOAA commented 1 year ago

Both figures in the latest post are using FV3 backgrounds, not geovals. Given that, I think differences of almost all within 0.5 hPa is reasonable. Both are the same run with the bug fix included.

ADCollard commented 1 year ago

Thanks for doing this, @CoryMartin-NOAA. Please go ahead with the PR.

frolovsa commented 1 year ago

Very cool @CoryMartin-NOAA . Let us know when you think we should re-run surface pressure experiment. Either with the feature branch or with develop (once merged) your call.

CoryMartin-NOAA commented 1 year ago

@frolovsa I have not done any QC or obs error analysis yet. Do you want to wait for that? Or do you want to just try it and see what happens?

kevindougherty-noaa commented 1 year ago

I have began applying QC filters to the sfc.yaml and was able to filter out a majority of the bad data. Here is the JEDI h(x) vs. GSI h(x) scatter plot similar to the first image Cory posted:

image

Some of the filters were applied from the existing UFO sfc.yaml here. Some other filters were applied by adding domain filters to the height data providing a min and max threshold and only using defined data. Another domain filter was applied for the ObsError to have a max value of 10,000.

Here are the filters:

obs filters:
  - filter: Bounds Check
    filter variables:
    - name: surface_pressure
    minvalue: 37499
    maxvalue: 106999
    action:
      name: reject

  - filter: Background Check
    filter variables:
    - name: surface_pressure
    threshold: 3.6
    absolute threshold: 990.0
    action:
      name: reject

  - filter: Domain Check
    where:
    - variable:
        name: height@MetaData
      minvalue: -400
      maxvalue: 8000
      action:
        name: reject
    - variable:
        name: height@MetaData
      is_defined:

  - filter: Domain Check
    where:
    - variable:
        name: surface_pressure@ObsError
      maxvalue: 10000

There are ~1800 values in this specific file that have an EffectiveQC set to 0 (passed QC) where GsiEffectiveQC equals 1. The breakdown of the PreQC values breaks down as follows:

PreQC 8: 275
PreQC 14: 306
PreQC 15: 1230

The description of each PreQC can be found here.

Something we have discussed, but has not yet been resolved is what do with the values with PreQC equal to 8. We have identified that most of these values have BC differences in the GSI >100, however the JEDI value is close to the ObsValue. GSI would reject these values, however the question to be answered is should JEDI be rejecting these as well.

PreQC 14 comes from blacklisted values so should be easy to apply a filter to those.

PreQC 15 is another tricky one as it is described as "flagged for non-use by the analysis", so we will need to determine how to work with this.

kevindougherty-noaa commented 1 year ago

Below is the progress on surface ships surface pressure QC filters. The existing QC filters in UFO were included and covered most of the QC. An additional filter was added to reject all data where ObsType is 183. The data are rejected in GSI and a majority had very large EffectiveErrors. Here are the filters used:

obs filters:
  # Observation Range Sanity Check
  - filter: Bounds Check
    filter variables:
    - name: surface_pressure
    minvalue: 37499
    maxvalue: 106999
    action:
      name: reject

  # Gross error check with (O - B) / ObsError greater than threshold.
  - filter: Background Check
    filter variables:
    - name: surface_pressure
    threshold: 3.6
    absolute threshold: 990.0
    action:
      name: reject
    defer to post: true

  # Reject all obs types that are 183
  - filter: BlackList
    where:
    - variable:
        name: surface_pressure@ObsType
      is_in: 183

After applying these filters, only 248 values remain that have an EffectiveQC set to 0 (passed QC) where GsiEffectiveQC equals 1. The breakdown of the PreQC values breaks down as follows:

PreQC 14: 188
PreQC 15: 60

Similar to the above comment, PreQC 14 should be easy to apply a filter and PreQC 15 might get a little tricky.

Here is the JEDI h(x) vs. GSI h(x): image

kevindougherty-noaa commented 1 year ago

Here is the update on surface ships air temperature QC filters. Similar to the comment above, The existing QC filters in UFO were included and covered most of the QC and a third filter was added to reject all ObsType equal to 183 as these are all rejected by GSI. The yaml and filters used are here:

obs space:
  name: sfcship
  obsdatain:
    engine:
      type: H5File
      obsfile: sfcship_obs_$(OBS_DATE).nc4
  obsdataout:
    engine:
      type: H5File
      obsfile: sfcship_diag_$(OBS_DATE).nc4
      overwrite: true
  _source: ldm
  simulated variables: [surface_pressure, air_temperature]
geovals:
  filename: sfcship_geoval_$(OBS_DATE).nc4
obs operator:
  name: Composite
  components:
    - name: VertInterp
      variables:
      - name: air_temperature
    - name: SfcPCorrected
      variables:
      - name: surface_pressure
      da_psfc_scheme: GSI
      geovar_sfc_geomz: surface_geometric_height
      geovar_geomz: geopotential_height
obs filters:
  # Bounds check for sanity
  - filter: Bounds Check
    filter variables:
    - name: air_temperature
    minvalue: 195
    maxvalue: 327
    action:
      name: reject

  # Gross error check
  - filter: Background Check
    filter variables:
    - name: air_temperature
    threshold: 7.0
    absolute threshold: 9.0
    action:
      name: reject
    defer to post: true
  # Reject all ObsType 183
  - filter: BlackList
    where:
    - variable:
        name: surface_pressure@ObsType
      is_in: 183

After the above filters were applied, 355 values remain that have an EffectiveQC set to 0 (passed QC) where GsiEffectiveQC equals 1. The breakdown of the PreQC values breaks down as follows:

PreQC 8: 1
PreQC 14: 327
PreQC 15: 27

The single PreQC 8 finding is an observation from a single station off the coast of Crete with an air_pressure@MetaData value of 87500 for multiple cycles. The station id is 6101009.

Here is the JEDI h(x) vs. the GSI h(x): image

kevindougherty-noaa commented 1 year ago

Here is the update for surface ships for specific humidity. All that was added was a simple bounds check and reject where ObsType is 183 and it nearly removed all data that was not being assimilated by GSI. Here is the yaml:

obs space:
  name: sfcship
  obsdatain:
    engine:
      type: H5File
      obsfile: sfcship_obs_$(OBS_DATE).nc4
  obsdataout:
    engine:
      type: H5File
      obsfile: sfcship_diag_$(OBS_DATE).nc4
      overwrite: true
  _source: ldm
  simulated variables: [surface_pressure, air_temperature, specific_humidity]
geovals:
  filename: sfcship_geoval_$(OBS_DATE).nc4
obs operator:
  name: Composite
  components:
    - name: VertInterp
      variables:
      - name: air_temperature
      - name: specific_humidity
    - name: SfcPCorrected
      variables:
      - name: surface_pressure
      da_psfc_scheme: GSI
      geovar_sfc_geomz: surface_geometric_height
      geovar_geomz: geopotential_height

obs filters:
  # Observation range sanity check
  - filter: Bounds Check
    filter variables:
    - name: specific_humidity
    minvalue: 0
    maxvalue: 0.03499
    action:
      name: reject

  # Reject all ObsType 183
  - filter: BlackList
    where:
    - variable:
        name: surface_pressure@ObsType
      is_in: 183

Only 9 values were still being accepted by JEDI where PreQC was 15.

Here is the JEDI h(x) vs. GSI h(x): image

CoryMartin-NOAA commented 1 year ago

@kevindougherty-noaa hofx differences look non-trivial here for humidity. Is this due to vertical interpolation? Perhaps there is something we need to add to make this closer? Can you look at a plot of the differences?

kevindougherty-noaa commented 1 year ago

@CoryMartin-NOAA here is the JEDI omb vs. GSI omb:

image

kevindougherty-noaa commented 1 year ago

@CoryMartin-NOAA My apologies, I misunderstood your request. Here is the h(x) differences:

image

CoryMartin-NOAA commented 1 year ago

@kevindougherty-noaa thanks, I am a bit concerned about these points that pass QC (red) but are relatively large differences (the ones that are ~ -0.001). Can you take a look and see if there is anything in common with these points besides GSI h(x) value?

CoryMartin-NOAA commented 1 year ago

After turning off the duplicate error inflation in GSI and nvqc, I am able to reproduce the errors in UFO compared to GSI (save for one point).:

Image

This YAML will be committed to GDASApp soon, and a PR will be opened/updated in UFO to reflect these changes.

CoryMartin-NOAA commented 1 year ago

QC counts are identical for this cycle between GSI and UFO with the QC filters in YAML

Image