Open delippi opened 4 months ago
I've finished hofx validation for mesonet airTemperature. In this example, I'm using a version of GSI (from Andrew) that writes out GSI geovals and a modified version of the ufo-vertInterp code (based on a branch from Dan who modified the gnssro operator to be able to read in GSI geovals using the fv3jedi executable--I did the same thing but for the vertInterp operator). I also had to write a short python script to update my GSI-geovals files to have saturation_specific_humidity and surface_geometric_height. The former was calculated from Rogers and Yau eq 2.19. The latter just needed to grab only the lowest model level for surface_geometric_height from the geometric_height field. The following is the result for airTemperature hofx validation. More work will be done on other variables. Then once all hofx validation is done I will move to observation error and other QC checks.
If the GSI-geovals are not read in this is the following result. I'm thinking that this could be considered as "close enough" when NOT using GSI-geovals. I believe the differences are due to how GSI and JEDI compute geovals (i.e., 2D interpolation using 3 vs 4 points for example). I have an issue looking into this.
Finished the hofx validation for remaining variables (using vertInterp) specificHumidity, windEastward, and windNorthward. Once again, these results are using GSI-geovals feed into vertInterp operator and differences will still remain due to how GSI and JEDI calculate geovals.
I've started looking into ob error inflation. Here is the result from a full mesonet observation test. The right plot shows the GSI vs JEDI final oberror ratios. There are clear differences that need to be understood. The hofx plot on the left shows one-to-one comparison since the geovals are fed directly into ufo's vertical interpolation operator. The final ob errors do not match because the ufo obfilters such as ObsFunction/ObsErrorFactorPressureCheck
is not also being fed the geovals from GSI. I checked this via another single-ob test using a different observation than what I used before.
Here is the ob I chose and the resulting initial ob error comparison.
gsi_errf=39.8781
jedi_errf=34.974
Here are the corresponding GSI-geovals for this single ob.
netcdf sfc_tsen_geoval_2022052619 {
dimensions:
nlocs = 1 ;
nlevs = 60 ;
ninterfaces = 61 ;
Station_ID_maxstrlen = 8 ;
Observation_Class_maxstrlen = 7 ;
variables:
float latitude(nlocs) ;
float longitude(nlocs) ;
float time(nlocs) ;
float surface_pressure(nlocs) ;
float air_pressure(nlocs, nlevs) ;
float air_pressure_levels(nlocs, ninterfaces) ;
float air_temperature(nlocs, nlevs) ;
float virtual_temperature(nlocs, nlevs) ;
float specific_humidity(nlocs, nlevs) ;
float eastward_wind(nlocs, nlevs) ;
float northward_wind(nlocs, nlevs) ;
float geopotential_height(nlocs, nlevs) ;
float geometric_height(nlocs, nlevs) ;
float surface_geometric_height(nlocs) ;
float saturation_specific_humidity(nlocs, nlevs) ;
// global attributes:
:date_time = 2022052619 ;
data:
latitude = 29.15 ;
longitude = 256.47 ;
time = 0 ;
surface_pressure = 91893.05 ;
air_pressure =
755.7465, 1868.049, 2983.181, 4102.358, 5226.591, 6361.438, 7513.978,
8686.738, 9889.828, 11128.3, 12402.16, 13726.57, 15106.59, 16562.43,
18134.53, 19868.4, 21799.41, 23937.67, 26313.52, 28925.12, 31707.87,
34570.54, 37438.94, 40291.35, 43141.63, 45998.63, 48856.96, 51705.39,
54530.09, 57326.04, 60082.09, 62777.85, 65405.58, 67935.79, 70321.66,
72533.61, 74554.91, 76385.96, 78018.52, 79461.47, 80736.48, 81847.89,
82821.54, 83662, 84386.59, 85038.16, 85638.18, 86208.16, 86752.34,
87270.73, 87767.65, 88247.36, 88709.94, 89159.62, 89600.63, 90033.09,
90456.94, 90872.23, 91282.16, 91689.8 ;
air_pressure_levels =
200, 1311.493, 2424.604, 3541.759, 4662.958, 5790.223, 6932.653, 8095.303,
9278.173, 10501.48, 11755.12, 13049.2, 14403.94, 15809.23, 17315.62,
18953.44, 20783.35, 22815.46, 25059.88, 27567.16, 30283.08, 33132.66,
36008.42, 38869.46, 41713.24, 44570.03, 47427.24, 50286.68, 53124.12,
55936.06, 58716.03, 61448.16, 64107.56, 66703.62, 69167.98, 71475.37,
73591.88, 75517.96, 77253.98, 78783.09, 80139.88, 81333.11, 82362.7,
83280.41, 84043.62, 84729.59, 85346.77, 85929.63, 86486.71, 87017.99,
87523.51, 88011.82, 88482.95, 88936.98, 89382.3, 89819.02, 90247.2,
90666.71, 91077.78, 91486.59, 91893.05 ;
air_temperature =
235.6375, 224.4905, 218.1675, 213.6949, 213.5903, 207.6426, 206.0895,
206.2977, 203.9101, 204.1167, 205.3741, 207.642, 209.8925, 212.0992,
214.099, 216.5952, 220.5284, 224.5905, 228.8093, 233.3199, 237.7421,
242.3809, 246.8677, 251.4358, 255.7671, 259.6447, 262.6363, 265.6216,
268.7856, 271.9664, 274.8885, 277.6014, 280.2744, 282.9365, 285.4703,
287.8022, 289.9375, 291.8065, 293.3411, 294.455, 295.3679, 296.2541,
296.9786, 297.6199, 298.2021, 298.7821, 299.3857, 299.9604, 300.5107,
301.0414, 301.5591, 302.0586, 302.5493, 303.0456, 303.5226, 304.0048,
304.5028, 304.9882, 305.4874, 306.0442 ;
specific_humidity =
2.752339e-06, 1.904394e-06, 1.394001e-06, 1.379674e-06, 1.379306e-06,
1.377589e-06, 1.40844e-06, 1.582917e-06, 2.397518e-06, 2.761124e-06,
3.063641e-06, 4.004941e-06, 4.907784e-06, 8.561648e-06, 1.400903e-05,
1.929839e-05, 2.306499e-05, 3.07411e-05, 4.232386e-05, 7.573712e-05,
0.0001071723, 0.0001724478, 0.0002271507, 0.0003030314, 0.0004094222,
0.000541823, 0.0007843238, 0.001047134, 0.001294734, 0.00150681,
0.001731838, 0.001989752, 0.002196813, 0.002296577, 0.002383559,
0.002425554, 0.002479532, 0.002559666, 0.002792936, 0.003182035,
0.003523703, 0.003710348, 0.00390177, 0.004104795, 0.00441022,
0.004679783, 0.004720944, 0.004746933, 0.004768888, 0.004791364,
0.004813664, 0.00483504, 0.00485517, 0.004874029, 0.00489172,
0.004908095, 0.004923194, 0.004936983, 0.004949572, 0.004962991 ;
eastward_wind =
-4.228357, -6.544117, -12.40301, -9.521055, -5.278649, 1.093383,
-0.5375162, 1.721064, 10.03384, 10.48232, 10.83364, 10.69437, 10.9804,
13.03079, 13.88336, 14.23912, 12.26383, 11.20725, 10.65507, 12.56708,
14.49977, 14.68172, 14.83068, 14.79433, 13.5991, 12.19977, 10.78136,
9.972265, 9.590845, 8.495372, 6.849806, 4.988925, 3.511991, 2.999388,
3.143361, 3.677207, 4.232866, 4.222813, 3.626519, 2.296938, 1.214826,
-0.06948268, -0.4343772, -0.8267059, -1.111862, -1.280351, -1.400046,
-1.501276, -1.597076, -1.693949, -1.787644, -1.875156, -1.954497,
-2.024871, -2.085754, -2.135643, -2.173452, -2.197542, -2.205784,
-2.191741 ;
northward_wind =
4.067732, 3.094453, -0.6086304, -0.1323414, -4.160838, 3.078696, -1.005045,
-3.330433, -1.419531, 0.4915086, 0.8171092, 1.352675, 1.592034, 1.097219,
-0.9191248, -2.726938, -3.137551, -2.894401, -3.198701, -3.485875,
-3.607549, -3.811813, -4.590881, -5.260763, -6.225089, -6.378666,
-5.142704, -5.012124, -5.243576, -4.273788, -2.753191, -1.622213,
-0.8835961, -0.4733567, -0.291191, -0.1172846, 0.2097148, 0.4525892,
0.6018959, 0.7836246, 0.8502474, 0.7726977, 0.7246647, 0.6420889,
0.5781096, 0.5359891, 0.5019259, 0.4698849, 0.4376185, 0.4039004,
0.3701546, 0.337393, 0.3064236, 0.2775157, 0.2508291, 0.2268822,
0.2059942, 0.1886475, 0.1752818, 0.1653261 ;
geopotential_height =
32284.62, 26192.84, 23161.41, 21148.63, 19634.61, 18423.68, 17415.81,
16540.8, 15762.37, 15058.07, 14408.79, 13795.71, 13210.52, 12642.5,
12077.08, 11501.71, 10908.54, 10299.19, 9671.478, 9031.684, 8398.622,
7791.421, 7220.816, 6685.446, 6178.142, 5694.484, 5233.662, 4795.486,
4379.327, 3983.411, 3607.365, 3252.2, 2917.101, 2603.931, 2316.487,
2056.362, 1823.697, 1616.88, 1435.549, 1277.67, 1140.03, 1021.432,
918.5604, 830.5157, 755.1499, 687.7866, 626.0804, 567.7474, 512.3081,
459.7226, 409.521, 361.2435, 314.8624, 269.9324, 226.0159, 183.0937,
141.157, 100.1908, 59.8703, 19.88624 ;
geometric_height =
32507.72, 26348.37, 23287.75, 21257.21, 19730.69, 18510.29, 17494.88,
16613.6, 15829.79, 15120.79, 14467.32, 13850.4, 13261.67, 12690.31,
12121.66, 11543.12, 10946.79, 10334.31, 9703.49, 9060.663, 8424.727,
7814.888, 7241.911, 6704.41, 6195.17, 5709.742, 5247.304, 4807.652,
4390.149, 3993.005, 3615.839, 3259.656, 2923.635, 2609.633, 2321.455,
2060.687, 1827.466, 1620.168, 1438.428, 1280.2, 1142.262, 1023.414,
920.3271, 832.1015, 756.5828, 689.0844, 627.2556, 568.8079, 513.2605,
460.5734, 410.2756, 361.9065, 315.4379, 270.4239, 226.4259, 183.4246,
141.4111, 100.3705, 59.97732, 19.92166 ;
surface_geometric_height = 19.92166 ;
saturation_specific_humidity =
0.02014038, 0.002466431, 0.0007363904, 0.0003075175, 0.0002381839,
8.956026e-05, 6.128983e-05, 5.456117e-05, 3.432543e-05, 3.14106e-05,
3.362379e-05, 4.150242e-05, 5.099787e-05, 6.208447e-05, 7.322237e-05,
9.125465e-05, 0.0001336413, 0.0001946684, 0.0002826511, 0.0004149006,
0.000592959, 0.0008536862, 0.001197073, 0.001672704, 0.002265279,
0.002928171, 0.003505401, 0.004184011, 0.005049723, 0.006083087,
0.007171612, 0.008316158, 0.00960482, 0.01107564, 0.01266047, 0.01428834,
0.0159377, 0.01750122, 0.01885235, 0.01982511, 0.02063227, 0.02147753,
0.02217406, 0.02281329, 0.02341854, 0.02405554, 0.02475703, 0.02544167,
0.02611313, 0.02677693, 0.02744124, 0.02809555, 0.02875415, 0.02943988,
0.03010899, 0.03080327, 0.03154269, 0.03227707, 0.0330539, 0.03395744 ;
and here are the JEDI-geovals:
netcdf msonet_geovals_rrfs_2022052619_0000 {
dimensions:
nlocs = 1 ;
air_temperature_nval = 60 ;
air_pressure_nval = 60 ;
geopotential_height_nval = 60 ;
saturation_specific_humidity_nval = 60 ;
surface_pressure_nval = 1 ;
surface_geometric_height_nval = 1 ;
variables:
float air_temperature(nlocs, air_temperature_nval) ;
float air_pressure(nlocs, air_pressure_nval) ;
float geopotential_height(nlocs, geopotential_height_nval) ;
float saturation_specific_humidity(nlocs, saturation_specific_humidity_nval) ;
float surface_pressure(nlocs, surface_pressure_nval) ;
float surface_geometric_height(nlocs, surface_geometric_height_nval) ;
data:
air_temperature =
235.6378, 224.4909, 218.1918, 213.7335, 213.6359, 207.7357, 206.1277,
206.3607, 203.9762, 204.1579, 205.3943, 207.6636, 209.9156, 212.1392,
214.1317, 216.6082, 220.5322, 224.6023, 228.813, 233.3283, 237.7528,
242.3918, 246.8813, 251.4621, 255.8076, 259.6915, 262.6952, 265.6933,
268.8654, 272.048, 274.9818, 277.7316, 280.4175, 283.087, 285.6224,
287.9653, 290.1075, 291.9906, 293.5446, 294.6653, 295.5721, 296.4574,
297.18, 297.8099, 298.3815, 298.9362, 299.5392, 300.1133, 300.6627,
301.1918, 301.7127, 302.2111, 302.7009, 303.2041, 303.6779, 304.1591,
304.6568, 305.1499, 305.6472, 306.2061 ;
air_pressure =
700.493, 1847.963, 2970.646, 4093.206, 5219.338, 6355.322, 7508.618,
8681.94, 9885.321, 11124.1, 12398.14, 13722.59, 15102.7, 16558.35,
18130.13, 19863.38, 21793.77, 23931.41, 26306.41, 28917.7, 31701.63,
34568.21, 37443.17, 40304.14, 43164.25, 46032.07, 48901.81, 51762.08,
54598.77, 57406.79, 60174.9, 62882.62, 65522.05, 68063.74, 70460.52,
72682.67, 74713.31, 76552.84, 78192.98, 79642.59, 80923.48, 82040.03,
83018.16, 83862.51, 84590.41, 85244.96, 85847.72, 86420.3, 86966.96,
87487.73, 87986.91, 88468.83, 88933.52, 89385.25, 89828.29, 90262.72,
90688.51, 91105.7, 91517.52, 91927.03 ;
geopotential_height =
33616.32, 27083.71, 24008.19, 21981.85, 20461.66, 19247.24, 18237.16,
17360.63, 16581.07, 15875.85, 15225.9, 14612.32, 14026.65, 13458.26,
12892.5, 12316.91, 11723.53, 11113.95, 10486.03, 9845.886, 9212.146,
8603.776, 8031.633, 7494.526, 6985.426, 6499.972, 6037.421, 5597.591,
5179.88, 4782.513, 4405.11, 4048.662, 3712.362, 3398.068, 3109.607,
2848.569, 2615.098, 2407.571, 2225.616, 2067.198, 1929.095, 1810.105,
1706.9, 1618.571, 1542.97, 1475.401, 1413.509, 1355.001, 1299.396,
1246.655, 1196.305, 1147.885, 1101.367, 1056.305, 1012.26, 969.2122,
927.1532, 886.0674, 845.6292, 805.5286 ;
saturation_specific_humidity =
0.02165116, 0.002489244, 0.0007417696, 0.0003102094, 0.0002402886,
9.107734e-05, 6.185361e-05, 5.524064e-05, 3.477191e-05, 3.170728e-05,
3.383402e-05, 4.175642e-05, 5.129251e-05, 6.253931e-05, 7.364197e-05,
9.148115e-05, 0.00013366, 0.0001946587, 0.000282107, 0.0004140031,
0.0005914937, 0.00085131, 0.001193862, 0.001670138, 0.002264834,
0.002929631, 0.003511235, 0.004195979, 0.005068264, 0.006107367,
0.007207532, 0.008380654, 0.009688043, 0.01117592, 0.01277464,
0.01442475, 0.01609402, 0.01768549, 0.01907377, 0.02006679, 0.02087597,
0.02172871, 0.02242968, 0.02306024, 0.02365805, 0.02426763, 0.02497263,
0.02566025, 0.02633366, 0.02699843, 0.0276711, 0.02832667, 0.02898693,
0.02968713, 0.03035376, 0.03104919, 0.03179094, 0.03254208, 0.03331833,
0.03422884 ;
surface_pressure =
92131.25 ;
surface_geometric_height =
785.5793 ;
}
and the obs_pressure = 93670
so the ob is located below the ground.
I noticed two things in ObsErrorFactorPressureCheck.cc
:
model_pressure_sfc[iloc]=92131.25
which happens to be the JEDI-geoval surface pressure value. So I hardcoded it as model_pressure_sfc[iloc]=91893.05;
to be the GSI-geoval.prsl[0][iloc]=91927
and prsl[1][iloc]=91517.5
which are also the JEDI-geoval air_pressure values. So I hardcoded them as prsl[0][iloc]=91689.8;
and prsl[1][iloc]=91282.16;
to be the GSI-geovals (remember reverse order here). The other levels don't matter here since the ob is actually located below the model bottom.Now the oberrors match exactly (within rounding error at least) for this case. I've attached the final ob error comparison figure below. I'm not sure how how have feed obserror function read the GSI-geovals directly in order to do a full ob test. Perhaps it is good enough to do no more than a few single ob cases where we know the ob error differences are large and hard code the GSI-geovals.
TLDR; differences in how GSI and JEDI create geovals are the root cause of the differences found in both hofx and ob error validation. When the GSI-geovals are fed into JEDI the results match exactly. JEDI's method is NOT necessarily wrong it is just different. The JEDI results need not necessarily replicate GSI, but if they do not we must understand the differences as shown here.
Doing the same thing for q and uv; hard coding the GSI-geovals in ObsFunction/ObsErrorFactorPressureCheck
for a single ob test.
Here are the results when using JEDI-geovals:
Here are the results when using GSI-geovals:
Working on hofx and oberror for the last variable type: stationPressure. This one uses Identity operator instead of vertInterp like the other variables. I'm looking at the single ob result from the case with initial large mismatch in ob errors for other variable types. Here is the initial result when looking at omf, hofx, oberror, and analysis increments. The large hofx and thus large omf results in the non-zero increments. The GSI-hofx is from GsiHofXBc.
We can plot the GsiHofX (i.e., non-bias corrected) and we have the following result. We can ignore that the increment is still zero because that isn't the hofx used in the GSI analysis, however, the hofx shown in the bottom left more closely aligns with the result from JEDI. The differences at this point I'm going to attribute likely due to differences in geovals (see #54). The operator used here is the Identity so the hofx are essentially just geovals. Now I need to figure out how to include the bias correction that GSI is using.
Actually, mesonet (188) stationPressure shouldn't use Identity operator. It should use:
obs operator:
name: SfcPCorrected
da_psfc_scheme: GSI
geovar_sfc_geomz: surface_geopotential_height
geovar_geomz: geopotential_height
linear obs operator:
name: Identity
Here is the result when changing the obs operator. There are still differences, but again, I think those are simply due to differences in how GSI and JEDI compute geovals. I'm uncertain why there is zero increment in GSI though.
Assimilation of all mesonet observations in MPAS-JEDI case (mpas_2024052700). This is the result when NOT using obs error inflation with ObsErrorFactorPressureCheck. When this obsFunction is turned on I get an error like:
26: --> vars_nlevels: saturation_specific_humidity not available in MPAS domain or self % templated_fields
Phase 2 Validation (no reliance on any GSI-derived IODA or GeoVaLs) initial results.
Methodology:
Run bufr2ioda:
Run GSI:
Run FV3-JEDI:
Summary: For the most part, the increments and observation counts look reasonable! I do need to figure out why the uv_288 plot has so many obs and zero increment. Initial investigations have shown that the ObsError is maxed out. More to look into there. Overall, not bad for initial testing and almost no modifications from what is currently in RDASApp!
msonet_uv_288 update I did fix a small error in the yaml causing the timeOffset filter check to fail. Furthermore, I removed a defer to post statment on both ObsError inflation which seemed to fix the problem of setting ob errors very large. Aside from these two changes, the main difference now is that a mesonetuselist and mesonet_stnuselist have been implemented based on the use list in GSI. The ob counts and analysis increments look much more similar now.
msonet_uv_288 update I did fix a small error in the yaml causing the timeOffset filter check to fail. Furthermore, I removed a defer to post statment on both ObsError inflation which seemed to fix the problem of setting ob errors very large. Aside from these two changes, the main difference now is that a mesonetuselist and mesonet_stnuselist have been implemented based on the use list in GSI. The ob counts and analysis increments look much more similar now.
Great! Will you commit those changes soon? Thanks!
@guoqing-noaa, I don't think it will be too much longer. I wanted to do one PR for all updates for mesonet from this Phase 2 testing, if possible. I'm pretty happy with airTemperature and specificHumidity (basically no change). I wanted to see if I could get the ob count closer for uv (like how close the counts are in T or q). I also wanted to dig into the ob count discrepencies for stationPressure which seems larger than it should be (GSI has 8000 more obs).
If yamls are needed sooner for a cycling test, then I can provide intermediate updates. Just let me know.
Another Phase 2 update: I added several new QC filters to each of the mesonet yamls. airTemperature looks even closer than it did before. I added all the same QC checks to specificHumidity and while some locations look better others do not. I could look into this more, but it looks really good already. StationPressure and uv both look really good now with a few extra QC filters trying to mimic GSI.
Another phase 2 update: Now with each individual tests complete for mesonet, I ran the same test but assimilated all mesonet observations in GSI and FV3-JEDI to compare. This also checks that catting yamls doesn't change anything. Here is a printout of the ob counts from GSI and JEDI and broken down into the each obtype. These numbers match the numbers in my previous comment
gsi_nobs: 84305
ps: 24234
uv: 8844
t: 26459
q: 24768
jedi_nobs: 83006
ps: 23789
uv: 7610
t: 26572
q: 25035
Overall, msonet DA in FV3-JEDI is able to match GSI results within an acceptable range. In these "Phase 2" tests, there is no dependence on GSI (except as the baseline comparison) with differences attributable to underlying system differences; therefore, I would consider msonet DA to have passed Phase 2 testing and ready to be updated in RDASApp. There is still Phase 3 testing for validating MPAS-JEDI against FV3-JEDI (which we don't have a case with matching dates) and to check that all QC filters work in MPAS-JEDI as they do for FV3-JEDI.
Another Phase 2 update:
I previously had marked the phase 2 work as being completed, however, I found some issues related to ObsErrorFactorPressureCheck (https://github.com/JCSDA-internal/ufo/issues/3496) for mesonet winds (see that discussion in the issue). I didn't have this problem before because I was incorrectly using defer to post
in my yaml for the initial ob error assignment which effectively negated all the ob error inflation (i.e., ObsErrorFactorPressureCheck). It turned out that the un-inflated oberror worked pretty well, but is unfortuanetly incorrect. This also led to some further investigate of the use of defer to post
in my other yamls as well. I've removed all instance of this as I don't believe it is actually needed anywhere. I think I've got all of these to match GSI fairly close since we know we can't exactly replicate GSI. Perhaps close enough?
Here are the results from my latest phase 2 tests:
The UFO in JEDI is the component that not only computes model-simulated observations, but also houses filters and methods for observation QC, ob errors, and bias correction. The GSI observer is the equivalent. Many forward operators for various observations have been developed for the UFO. These operators can be utilized in RDAS. However these operators still must be tested for RDAS. The steps for transition by observation platform are as follows:
Mesonet data is bufr obtype=188/288 (and also 195/295 but not sure if that data is/will be used).
Edit: A link to my GSI fork/branch based on Andrew and Emily's ufo_geovals branch. I added some extra geovals for output. https://github.com/delippi/GSI/tree/feature/ufo_geovals_rdas