Open CoryMartin-NOAA opened 1 year ago
GeoVaLs from GSI will need to be produced. Starting with a fresh version of develop
, some changes to write out GeoVaLs have been made in this branch: https://github.com/CoryMartin-NOAA/GSI/tree/ufo_geovals
Tagging @frolovsa for awareness as this will be one of the first ob types we will want to attempt to include in prototype cycling.
First look at one cycle of Ps H(x). The points in the upper left where UFO computes something closer to 1000 hPa and GSI says a much lower value were investigated. These points are where the station elevation/height is reported to be 9999. UFO interprets that as missing properly, GSI does not, so it applies a pressure correction of 10 km, making these obs ~300 hPa instead of 1000 hPa. This is wrong. So these points will need filtered before further evaluation can happen.
Using the identity operator and looking at the uncorrected GSI values we get this:
I believe this to be ~ machine precision, it fits into three lines, and the largest are +- 0.008 Pa difference, which is good enough for me. This suggests that there is in fact something wrong with the GSI Ps correction scheme.
The combine GSI ncdiag script was not properly combining surface T and Ps obs. This is/was because T is at 2m and Ps was assumed to be at the surface. Once fixing that, here is the result of UFO vs GSI after making some changes to the GSI surface P correction scheme (will be added to a future UFO PR)
The differences are within 1 hPa, which, while decent, should be better when using the same GeoVaLs.
Investigation of the GSI code found that in setupps, "geometric" height is used for the surface, and not "geopotential" height, but the output was called "geopotential height". The SfcPCorrected operator will convert to geometric from geopotential if need be, but this caused differences then in the height of the model surfaces. Simply renaming the geoval file variable to "surface_geometric_height" and changing the input YAML to geovar_sfc_geomz: surface_geometric_height
we now get this:
All but a handful of points are within 1 Pa, and the largest is ~30 Pa (just a few points). I think this can be considered validated once the converter is modified to write surface_geometric_height
and the SfcPCorrected
operator is updated for the GSI scheme. (Note I will double check 'develop' with the updated files to ensure changes are warranted after this investigation)
Thank you @CoryMartin-NOAA for working on this. I am very encouraged by this results (even if i don't follow all the details). As the final check. Can we also plot jedi-omb vs gsi-omb. i find that omb/omb plots provide more resolution with measurements that otherwise might have a very large dynamic range (like surface pressure).
Here you go @frolovsa
This is with GSI "GeoVaLs", so the fact that it is basically perfect, is to be expected. But this confirms the Ps correction is working as intended.
I known. totally looks perfect. the best one i seen so far :) Do you plan to produce one with JEDI interpolation? I am curious to see at least one plot where we can see how much omb changes if introduce JEDI interpolation/native grid to the mix. I guess that will be difficult to do until jedi interpolation is fixed? Sorry for the segway. What is the status of the JEDI interpolation at the moment?
@frolovsa yes that is the plan, but it is on hold until Francois can update/fix the JEDI interpolation. In the meantime, we can verify the QC/operartors are working as intended.
can we still plot it with the existing (bad) interpolation. I wonder how much does the existing problems with interpolation will affect ombs.
Yeah I can probably do that tomorrow. I'll make a mental note.
Here is a comparison of results from UFO develop vs a fix to the GSI correction method: Develop
Fix (not yet in branch)
Even with the few outliers, the largest difference in the fix is 30 Pa and almost all are within 1 Pa, whereas the develop branch has a lot of spread within 100 Pa. I'm not 100% sure that the GSI method is the 'best', but I think we need to reproduce it faithfully. If @ADCollard and @emilyhcliu agree, I will issue a UFO PR to make some changes to Fan's code to reproduce GSI results.
@frolovsa here are the figures you requested. I think any difference in Ps beyond 0.5 hPa should be fairly rare.
@CoryMartin-NOAA thank you. A couple of clarifying questions.
Both figures in the latest post are using FV3 backgrounds, not geovals. Given that, I think differences of almost all within 0.5 hPa is reasonable. Both are the same run with the bug fix included.
Thanks for doing this, @CoryMartin-NOAA. Please go ahead with the PR.
Very cool @CoryMartin-NOAA . Let us know when you think we should re-run surface pressure experiment. Either with the feature branch or with develop (once merged) your call.
@frolovsa I have not done any QC or obs error analysis yet. Do you want to wait for that? Or do you want to just try it and see what happens?
I have began applying QC filters to the sfc.yaml
and was able to filter out a majority of the bad data. Here is the JEDI h(x) vs. GSI h(x) scatter plot similar to the first image Cory posted:
Some of the filters were applied from the existing UFO sfc.yaml here. Some other filters were applied by adding domain filters to the height data providing a min and max threshold and only using defined data. Another domain filter was applied for the ObsError to have a max value of 10,000.
Here are the filters:
obs filters:
- filter: Bounds Check
filter variables:
- name: surface_pressure
minvalue: 37499
maxvalue: 106999
action:
name: reject
- filter: Background Check
filter variables:
- name: surface_pressure
threshold: 3.6
absolute threshold: 990.0
action:
name: reject
- filter: Domain Check
where:
- variable:
name: height@MetaData
minvalue: -400
maxvalue: 8000
action:
name: reject
- variable:
name: height@MetaData
is_defined:
- filter: Domain Check
where:
- variable:
name: surface_pressure@ObsError
maxvalue: 10000
There are ~1800 values in this specific file that have an EffectiveQC
set to 0 (passed QC) where GsiEffectiveQC
equals 1. The breakdown of the PreQC values breaks down as follows:
PreQC 8: 275
PreQC 14: 306
PreQC 15: 1230
The description of each PreQC can be found here.
Something we have discussed, but has not yet been resolved is what do with the values with PreQC equal to 8. We have identified that most of these values have BC differences in the GSI >100, however the JEDI value is close to the ObsValue
. GSI would reject these values, however the question to be answered is should JEDI be rejecting these as well.
PreQC 14 comes from blacklisted values so should be easy to apply a filter to those.
PreQC 15 is another tricky one as it is described as "flagged for non-use by the analysis", so we will need to determine how to work with this.
Below is the progress on surface ships surface pressure QC filters. The existing QC filters in UFO were included and covered most of the QC. An additional filter was added to reject all data where ObsType
is 183. The data are rejected in GSI and a majority had very large EffectiveErrors
. Here are the filters used:
obs filters:
# Observation Range Sanity Check
- filter: Bounds Check
filter variables:
- name: surface_pressure
minvalue: 37499
maxvalue: 106999
action:
name: reject
# Gross error check with (O - B) / ObsError greater than threshold.
- filter: Background Check
filter variables:
- name: surface_pressure
threshold: 3.6
absolute threshold: 990.0
action:
name: reject
defer to post: true
# Reject all obs types that are 183
- filter: BlackList
where:
- variable:
name: surface_pressure@ObsType
is_in: 183
After applying these filters, only 248 values remain that have an EffectiveQC
set to 0 (passed QC) where GsiEffectiveQC
equals 1. The breakdown of the PreQC values breaks down as follows:
PreQC 14: 188
PreQC 15: 60
Similar to the above comment, PreQC
14 should be easy to apply a filter and PreQC
15 might get a little tricky.
Here is the JEDI h(x) vs. GSI h(x):
Here is the update on surface ships air temperature QC filters. Similar to the comment above, The existing QC filters in UFO were included and covered most of the QC and a third filter was added to reject all ObsType
equal to 183 as these are all rejected by GSI. The yaml and filters used are here:
obs space:
name: sfcship
obsdatain:
engine:
type: H5File
obsfile: sfcship_obs_$(OBS_DATE).nc4
obsdataout:
engine:
type: H5File
obsfile: sfcship_diag_$(OBS_DATE).nc4
overwrite: true
_source: ldm
simulated variables: [surface_pressure, air_temperature]
geovals:
filename: sfcship_geoval_$(OBS_DATE).nc4
obs operator:
name: Composite
components:
- name: VertInterp
variables:
- name: air_temperature
- name: SfcPCorrected
variables:
- name: surface_pressure
da_psfc_scheme: GSI
geovar_sfc_geomz: surface_geometric_height
geovar_geomz: geopotential_height
obs filters:
# Bounds check for sanity
- filter: Bounds Check
filter variables:
- name: air_temperature
minvalue: 195
maxvalue: 327
action:
name: reject
# Gross error check
- filter: Background Check
filter variables:
- name: air_temperature
threshold: 7.0
absolute threshold: 9.0
action:
name: reject
defer to post: true
# Reject all ObsType 183
- filter: BlackList
where:
- variable:
name: surface_pressure@ObsType
is_in: 183
After the above filters were applied, 355 values remain that have an EffectiveQC
set to 0 (passed QC) where GsiEffectiveQC
equals 1. The breakdown of the PreQC
values breaks down as follows:
PreQC 8: 1
PreQC 14: 327
PreQC 15: 27
The single PreQC
8 finding is an observation from a single station off the coast of Crete with an air_pressure@MetaData
value of 87500 for multiple cycles. The station id is 6101009.
Here is the JEDI h(x) vs. the GSI h(x):
Here is the update for surface ships for specific humidity. All that was added was a simple bounds check and reject where ObsType
is 183 and it nearly removed all data that was not being assimilated by GSI. Here is the yaml:
obs space:
name: sfcship
obsdatain:
engine:
type: H5File
obsfile: sfcship_obs_$(OBS_DATE).nc4
obsdataout:
engine:
type: H5File
obsfile: sfcship_diag_$(OBS_DATE).nc4
overwrite: true
_source: ldm
simulated variables: [surface_pressure, air_temperature, specific_humidity]
geovals:
filename: sfcship_geoval_$(OBS_DATE).nc4
obs operator:
name: Composite
components:
- name: VertInterp
variables:
- name: air_temperature
- name: specific_humidity
- name: SfcPCorrected
variables:
- name: surface_pressure
da_psfc_scheme: GSI
geovar_sfc_geomz: surface_geometric_height
geovar_geomz: geopotential_height
obs filters:
# Observation range sanity check
- filter: Bounds Check
filter variables:
- name: specific_humidity
minvalue: 0
maxvalue: 0.03499
action:
name: reject
# Reject all ObsType 183
- filter: BlackList
where:
- variable:
name: surface_pressure@ObsType
is_in: 183
Only 9 values were still being accepted by JEDI where PreQC
was 15.
Here is the JEDI h(x) vs. GSI h(x):
@kevindougherty-noaa hofx differences look non-trivial here for humidity. Is this due to vertical interpolation? Perhaps there is something we need to add to make this closer? Can you look at a plot of the differences?
@CoryMartin-NOAA here is the JEDI omb vs. GSI omb:
@CoryMartin-NOAA My apologies, I misunderstood your request. Here is the h(x) differences:
@kevindougherty-noaa thanks, I am a bit concerned about these points that pass QC (red) but are relatively large differences (the ones that are ~ -0.001). Can you take a look and see if there is anything in common with these points besides GSI h(x) value?
After turning off the duplicate error inflation in GSI and nvqc, I am able to reproduce the errors in UFO compared to GSI (save for one point).:
This YAML will be committed to GDASApp soon, and a PR will be opened/updated in UFO to reflect these changes.
QC counts are identical for this cycle between GSI and UFO with the QC filters in YAML
Need to validate surface pressure observations from radiosondes, sfc, and sfcships (others?)
This issue will track progress on validation of surface pressure observations in UFO for GDAS in four stages:
Note this will need to include any pressure correction applied to the observation/hofx value.