Open ADCollard opened 1 year ago
Differences for sondes
HofX is very close. ~1.e-9 for spec humidity and 1.e-5K for temperature. Some residual error for winds near the surface. Some weird plots for errors. Need to dig further.
Interestingly the discrepancies in the wind hofx do not show up in the histogram plots (too few points?)
Map of hofx differences
Regenerated the input data set from GSI. The agreement is a lot better now for winds but there are still some small discrepancies:
Observation error differences remain (not surprisingly).
Start by focussing on the first couple of tests as a sanity check.
A good start is to understand exactly why this section of the sondes yaml is needed:
## a tempporary solution: Replace error by GsiAdjustObsError
## overwite above error assignments and inflation.
- filter: Perform Action
filter variables:
- name: airTemperature
action:
name: assign error
error function: GsiAdjustObsError/airTemperature
where:
- variable:
name: GsiAdjustObsError/airTemperature
is_defined:
defer to post: true
@ADCollard while running tests using the new GMAO/YAML, I am getting following errors:
All variables / channels are bias-corrected. Error: Variable saturation_specific_humidity not found in sondes_geoval_2021080100.nc4 OOPS Ending 2023-11-02 14:02:29 (UTC-0500) *** Running UFO failed ****
Can you point me to your data files?
@PraveenKumar-NOAA Sorry just saw this. I am using /work2/noaa/da/acollard/UFO_eval/data/gsi_geovals_l127/nofgat_aug2021/20231030
FYI, the relevant issue from GMAO on the sonde.yaml is: https://github.com/GEOS-ESM/swell/issues/78
Looking at the ErrorAdjust level (i.e., after read_prepbufr in GSI), there is a discrepancy with the winds
This appears to be because the ObsErrorFactorConventional obsfunction is not being applied to winds (or anything other than temperature) in this yaml.
`- filter: Perform Action
filter variables:
- name: airTemperature
action:
name: inflate error
inflation variable:
name: ObsFunction/ObsErrorFactorConventional
options:
test QCflag: PreQC
test QCthreshold: 3
inflate variables: [airTemperature]
pressure: MetaData/pressure
distance threshold: -1.
defer to post: true
Trying to work how if it can work with winds.
Adding the following to the YAML improves things:
# Adjusted Error after Initial Assignment
- filter: Perform Action
filter variables:
- name: windEastward
action:
name: inflate error
inflation variable:
name: ObsFunction/ObsErrorFactorConventional
options:
test QCflag: PreQC
test QCthreshold: 3
inflate variables: [windEastward]
pressure: MetaData/pressure
distance threshold: -1.
defer to post: true
- filter: Perform Action
filter variables:
- name: windNorthward
action:
name: inflate error
inflation variable:
name: ObsFunction/ObsErrorFactorConventional
options:
test QCflag: PreQC
test QCthreshold: 3
inflate variables: [windNorthward]
pressure: MetaData/pressure
distance threshold: -1.
defer to post: true
But differences are still significant:
To try to understand this a little better let's compare the errorfactor derived in ObsFunction/ObsErrorFactorConventional
with the equivalent in GSI. There are substantial differences:
We need to dig deeper to find out why this is.
The 'vmag' calculation definitely appears to differ.
vmag=min(max(half*dpres(ilev),r0_02*presl(ilev)),con*plevs(k))
So vmag appears to agree between GSI and JEDI up to a value of ~0.2. Above that the JEDI value is up to three times larger.
Digging in a little closer. I have chosen a point where vmag is 0.75 for JEDI and 1.54 for GSI.
This is 35.27285N, 248.181152E, 37070Pa. The sonde ID is 72376.
The observed pressures for this sonde are:
In [90]: p_sonde
Out[90]:
array([ 880. , 900. , 1000. , 1000. , 1050. ,
1170. , 1310. , 1330. , 1350. , 1470. ,
1540. , 1850. , 1890. , 1940. , 2000. ,
2000. , 2029.9999, 2220. , 2340. , 2520. ,
2680. , 2890. , 2950. , 3000. , 3000. ,
3240.0002, 3400. , 3730. , 3770. , 3909.9998,
3950. , 4100. , 4310. , 4520. , 4540. ,
4670. , 4740. , 5000. , 5000. , 5230. ,
5390. , 5490. , 5760. , 6000. , 6050. ,
6680.0005, 7000. , 7000. , 7019.9995, 7100. ,
7300. , 7380.0005, 7760. , 7890. , 8160. ,
8290. , 46200. , 45700. , 44700. , 47600. ,
44200. , 43520. , 43500. , 48600. , 48960. ,
42800. , 49900. , 50000. , 50000. , 41840. ,
9020. , 50900. , 9260. , 40210. , 40000. ,
40000. , 9640. , 10000. , 10000. , 9980. ,
39300. , 52930. , 53200. , 10600. , 38300. ,
79000. , 79000. , 78600. , 10700. , 53900. ,
76660. , 37070. , 75000. , 54600. , 36700. ,
54990.004 , 73980. , 35900. , 11050. , 56400. ,
35400. , 70000. , 70000. , 57130. , 68700. ,
11300. , 66350. , 11600. , 11620. , 19300. ,
61200. , 18700. , 61609.996 , 19800. , 20000. ,
20000. , 33600. , 18160. , 18000. , 12200. ,
20950. , 17300. , 32800. , 32710. , 21900. ,
21960. , 22200. , 22500. , 22800. , 12860.001 ,
22990. , 12900. , 31800. , 16100. , 31600. ,
31350. , 15689.999 , 31200. , 24600. , 25000. ,
25000. , 25300. , 15000. , 15000. , 14939.999 ,
25700. , 14220. , 14500. , 26100. , 30200. ,
30000. , 30000. , 29600. , 27300. , 27520.002 ,
28000. ], dtype=float32)
The geoval pressure levels are:
<xarray.DataArray 'air_pressure_levels' (nlocs: 1, ninterfaces: 128)>
array([[9.990000e-01, 1.605000e+00, 2.532000e+00, 3.924000e+00, 5.976000e+00,
8.947000e+00, 1.317700e+01, 1.909600e+01, 2.724300e+01, 3.827600e+01,
5.298400e+01, 7.229300e+01, 9.726900e+01, 1.291100e+02, 1.691350e+02,
2.187670e+02, 2.795060e+02, 3.528940e+02, 4.404810e+02, 5.437820e+02,
6.642360e+02, 8.031640e+02, 9.617340e+02, 1.140931e+03, 1.341538e+03,
1.564119e+03, 1.809028e+03, 2.076415e+03, 2.366252e+03, 2.678372e+03,
3.012510e+03, 3.368363e+03, 3.745646e+03, 4.144164e+03, 4.563881e+03,
5.004995e+03, 5.468017e+03, 5.953848e+03, 6.463864e+03, 7.000000e+03,
7.564284e+03, 8.156979e+03, 8.777850e+03, 9.426650e+03, 1.010313e+04,
1.080702e+04, 1.153803e+04, 1.229586e+04, 1.308015e+04, 1.389054e+04,
1.472659e+04, 1.558783e+04, 1.647375e+04, 1.738373e+04, 1.831713e+04,
1.927319e+04, 2.025108e+04, 2.124989e+04, 2.226859e+04, 2.330608e+04,
2.436125e+04, 2.543350e+04, 2.652221e+04, 2.762667e+04, 2.874602e+04,
2.987929e+04, 3.102541e+04, 3.218315e+04, 3.335123e+04, 3.452821e+04,
3.571259e+04, 3.690276e+04, 3.809706e+04, 3.929373e+04, 4.049100e+04,
4.168702e+04, 4.287996e+04, 4.406793e+04, 4.524909e+04, 4.642161e+04,
4.758369e+04, 4.873357e+04, 4.986957e+04, 5.099007e+04, 5.209356e+04,
5.317859e+04, 5.424385e+04, 5.528810e+04, 5.631025e+04, 5.730931e+04,
5.828441e+04, 5.923481e+04, 6.015988e+04, 6.105911e+04, 6.193211e+04,
6.277858e+04, 6.359836e+04, 6.439136e+04, 6.515761e+04, 6.589719e+04,
6.661028e+04, 6.729715e+04, 6.795809e+04, 6.859351e+04, 6.920380e+04,
6.978947e+04, 7.035099e+04, 7.088892e+04, 7.140383e+04, 7.189629e+04,
7.236691e+04, 7.281630e+04, 7.324509e+04, 7.365388e+04, 7.404330e+04,
7.441397e+04, 7.476649e+04, 7.510148e+04, 7.541951e+04, 7.572116e+04,
7.600699e+04, 7.627755e+04, 7.653341e+04, 7.677520e+04, 7.700355e+04,
7.721912e+04, 7.742250e+04, 7.761435e+04]], dtype=float32)
So with the geoval order fix the agreement is much closer.
The outliers are an artifact of the GSI output - they are single-level obs and all have an error_factor of 1.0 both for GSI and
JEDI.
The error_factor that comes out of errmod is still different for some obs
This is for sondes that are duplicated between input sources. GSI treats them separately, but JEDI lumps them all together. It is not clear which is right.
The code changes to make ObsErrorFactorConventional behave correctly for both orders of Geovals has been included in UFO PR 3126.
There is apparently a way to reverse the geoval order in YAML, but this (I think) is better as filter was giving wrong results with no clue how to fix it.
So an update on the error_factor
discrepancy. This is explained by JEDI considering profiles on a BUFR message by BUFR message basis while JEDI considers all observations by the same sonde to be part of the same profile. A number of sonde ascents appear to be reported twice (confusingly often with inconsistent launch times) in the prepbufr
file.
The UFO approach is probably the correct one here, but neither is ideal.
There appears to be a discrepancy between ObsErrorFactorPressureCheck.cc and the equivalent code in the GSI (e.g., setupw.f90.
In the UFO the pertinent line is:
obserr[iv][iloc] = (currentObserr[iloc]+drpx+1.e6*rhgh+infl_coeff*rlow)
/currentObserr[iloc];
in GSI it's
ratio_errors=error/(data(ier,i)+drpx+1.0e6_r_kind*rhgh+four*rlow)
Note that the UFO has the same error (currentObserr[iloc]
) in the numerator and denominator (so if rhgh
and rlow
are zero, ratio_errors
will be 1.0). In GSI data(ier,i)
is the original obs error while error
is that error after being scaled by errmod
. It is not clear which is actually right (I would say the UFO makes most sense) but it is another place where we will not get agreement.
Tagging @BrettHoover-NOAA and @nicholasesposito as this will be relevent to aircraft too.
Maybe the best solution here is to re-run the GSI using the UFO methodology so we can get consistency without having to change the UFO codebase?
In the sonde YAML from GMAO, they supposedly input the GSI adjusted ob error into the ObsErrorFactorPressureCheck
, but on inspection it appears the wrong option is being used. It should be test_obserr
not adjusted_error_name
(test_obserr
maps onto currentObserr[iloc]
):
- filter: Perform Action
filter variables:
- name: windNorthward
where:
- variable:
name: ObsType/windNorthward
is_in: 220,221
action:
name: inflate error
inflation variable:
name: ObsFunction/ObsErrorFactorPressureCheck
options:
variable: windNorthward
inflation factor: 4.0
#adjusted_error_name: GsiAdjustObsError # ADC
test_obserr: GsiAdjustObsError # ADC
geovar_sfc_geomz: surface_geometric_height
This issue is splitting from #60 so the sonde and aircraft validation does not get confused.
The starting point for this is the GMAO YAML, but there are many issues with these it appears.
The test YAMLs used here are in this branch.
This issue concerns temperature, humidity and wind only as surface pressure is being addressed here.