Evaluate GeoVaL differences between GSI and JEDI

CoryMartin-NOAA commented 2 years ago

Before we can do sufficient H(x) or QC evaluations, we need to understand the differences in model variables provided to the forward operators.

This is a twofold problem:

1. GSI uses a Gaussian grid and FV3-JEDI uses a cube sphere grid. What uncertainties come from this difference?
2. What uncertainties come from interpolation methods?

We can mitigate 1 by running GSI in its regional configuration and using RRFS backgrounds to evaluate 2, but we will need to keep the differences caused by 1 in mind for all future evaluation work.

CoryMartin-NOAA commented 2 years ago

For some background, see Fan Han's ISDA presentation here: https://docs.google.com/presentation/d/11UZ63r3ZzXB_a7qHIynuibZ-R4b3MZFsyBPPfw2HyQo/edit?usp=sharing

CoryMartin-NOAA commented 2 years ago

GeoVaLs were computed for 9118 locations of AMSU-A N19 observations for one cycle. These plots show air temperature differences between interpolated points from GSI and JEDI. @fmahebert has created a bugfix branch for barycentric interpolation https://github.com/JCSDA-internal/oops/tree/feature/test_interp_barycentric_weight_fix

Total columns: Screenshot from 2022-07-22 17-50-21

                           develop          fix
Largest negative values:   -8.459259        -7.945221
Largest positive values:   2.6976929        2.6607666
Mean:                      -0.0013460862    -0.0008531928
Std. Dev.:                 0.07872462       0.07337786

Lowest 27 layers only: Screenshot from 2022-07-22 18-09-42

                           develop          fix
Largest negative values:   -8.459259        -7.945221
Largest positive values:   2.3589783        2.00354
Mean:                      -0.005181747     -0.0037387458
Std. Dev.:                 0.11720554       0.10920139
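For anyone wanting to reproduce summary statistics of this kind, here is a minimal sketch. It assumes the GSI and JEDI GeoVaLs have already been read into two flat lists at matching locations. The function name `diff_stats`, the sample values, and the sign convention (JEDI minus GSI) are illustrative assumptions and are not taken from the scripts used for the numbers above.

```python
import statistics

def diff_stats(gsi, jedi):
    """Return min, max, mean and population std. dev. of (JEDI - GSI).

    The JEDI-minus-GSI sign convention is an assumption for this sketch.
    """
    if len(gsi) != len(jedi):
        raise ValueError("inputs must have the same length")
    diffs = [j - g for g, j in zip(gsi, jedi)]
    return {
        "min": min(diffs),                  # largest negative difference
        "max": max(diffs),                  # largest positive difference
        "mean": statistics.fmean(diffs),
        "std": statistics.pstdev(diffs),
    }

# Made-up air temperatures (K) at four matching locations.
gsi_t = [250.0, 260.0, 270.0, 280.0]
jedi_t = [250.1, 259.9, 270.0, 280.2]
stats = diff_stats(gsi_t, jedi_t)
print(stats)
```

Using the population standard deviation (`pstdev`) rather than the sample one is a choice made here for simplicity; with thousands of locations the two are nearly identical.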

It is unreasonable to expect zero differences because of Gaussian vs cube sphere grids. Are these differences acceptable? Is this enough evidence to make a decision or do we need further investigation?

Tagging @ADCollard @emilyhcliu @RussTreadon-NOAA and @dtkleist for awareness/discussion

CoryMartin-NOAA commented 2 years ago

Doing the same for a regional GSI / JEDI comparison (using 3km CONUS RRFS backgrounds) yields similar results. The bugfix is closer to GSI and standard deviations are in the ballpark of ~0.1K

Screenshot from 2022-07-22 19-07-10

fmahebert commented 2 years ago

For completeness' sake, I'll note here that the bugfix branch that Cory is testing in the plots above fixes one of two bugs in the oops interpolator. The spread would hopefully tighten a bit more once the second bug is fixed (estimated timescale: a couple months). However, the fact that the large-error tails are hardly affected by the first bugfix surprises me, and may indicate there's more than interpolation error at play. To be continued...

emilyhcliu commented 2 years ago

> Doing the same for a regional GSI / JEDI comparison (using 3km CONUS RRFS backgrounds) yields similar results. The bugfix is closer to GSI and standard deviations are in the ballpark of ~0.1K

@CoryMartin-NOAA Do you have the statistics (largest +/- values, mean and std) like those you have for the global JEDI-GSI comparison?

CoryMartin-NOAA commented 2 years ago

@emilyhcliu sure, here are the numbers for the regional case:

                           develop          fix
Largest negative values:   -3.0720673       -3.3456573
Largest positive values:   2.3089905        2.3059387
Mean:                      -0.0005921195    -0.00051495014
Std. Dev.:                 0.10015012       0.09827528

One thing that is not shown here but that I want to document: for the regional case, several points at or near the edge of the regional model domain are 0 in JEDI but 'reasonable' in GSI. I'm not sure why that is, but those points are removed from these stats.
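One way such points could be screened out before computing statistics is sketched below. How the points were actually removed is not stated in this thread, so the helper `drop_zero_points`, its `tol` parameter, and the sample values are all hypothetical. The approach only makes sense for fields such as air temperature in K, where a value of zero is unphysical.

```python
def drop_zero_points(gsi, jedi, tol=0.0):
    """Keep only location pairs whose JEDI value is not (near) zero.

    Assumes zero marks an unfilled point near the domain edge; this is
    an illustrative guess, not the method used for the stats above.
    """
    kept = [(g, j) for g, j in zip(gsi, jedi) if abs(j) > tol]
    gsi_kept = [g for g, _ in kept]
    jedi_kept = [j for _, j in kept]
    return gsi_kept, jedi_kept

# Made-up temperatures (K); two JEDI points are zero.
gsi_t = [271.3, 268.9, 274.0, 270.5]
jedi_t = [271.2, 0.0, 274.1, 0.0]
gsi_ok, jedi_ok = drop_zero_points(gsi_t, jedi_t)
print(len(gsi_t) - len(gsi_ok), "points removed")
```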

ADCollard commented 2 years ago

Tagging @BrettHoover-NOAA and @HaixiaLiu-NOAA for awareness

CoryMartin-NOAA commented 2 years ago

Running the regional case with inverse distance (the other option already in OOPS) and bilinear (the option added by Fan in https://github.com/JCSDA-internal/oops/pull/1790) gives this histogram:

Screenshot from 2022-08-01 17-49-37

Clearly, the green (bilinear) histogram has many more values near zero.

            bilinear         inverse distance
Min:        -3.1745758       -3.0716705
Max:        1.3336792        2.0160217
Mean:       -0.0003165202    -0.0006362763
Stddev:     0.08704376       0.09464488

Of all 4 methods, the bilinear has the max closest to 0, the mean closest to 0, and the smallest standard deviation, but not the smallest negative difference. This isn't entirely surprising since bilinear is the interpolation method used in GSI. It is surprising, however, that differences this large remain when both systems are using bilinear interpolation.

CoryMartin-NOAA commented 2 years ago

And here is the global counterpart with bilinear and inverse distance:

Screenshot from 2022-08-01 18-12-26

Again, the green (bilinear) histogram is closer to zero.

            bilinear         inverse distance
Min:        -7.581482        -7.945221
Max:        2.7592773        2.6607666
Mean:       -0.0007869504    -0.0008531928
Stddev:     0.066700056      0.07337786

Similarly to the regional case, bilinear has the mean closest to 0 of the 4 tests, and the lowest standard deviation. However, bilinear seems to have the largest positive difference of the four tests, and the smallest negative difference. For both the regional and global cases, the means/standard deviations aren't worrisome, but the outliers, particularly with the regional case, may warrant further investigation.

CoryMartin-NOAA commented 2 years ago

I plotted the difference of GeoVaLs for the bilinear case, with a colorbar of +-0.3 and looped through each model layer with an alpha of 0.01 so that cumulative differences would plot on top of one another. One plot is surface to top, the other is top to surface. Screenshot from 2022-08-01 20-52-49

CoryMartin-NOAA commented 2 years ago

Tagging @frolovsa and @danholdaway for awareness

CoryMartin-NOAA commented 2 years ago

Air temperature is fairly homogeneous, especially in the tropics and over water. Here is a plot of humidity mixing ratio (units of g/kg). The differences here for the lowest model layer seem 'random' and non-trivial.

The differences can be as large as +-3 g/kg, the mean is -0.0001 g/kg and the standard deviation is 0.04 g/kg.

CoryMartin-NOAA commented 2 years ago

To test the errors in interpolation to a location, I modified FV3 backgrounds by replacing ua and va with geolon and geolat, respectively. This would mean that eastward (northward) wind would be the value of longitude (latitude) at the interpolated point. I then tried to interpolate for radiosondes (an attempt with satwinds failed, possibly due to too many points) using different methods and subtracted the observation location from the interpolated u,v (lon,lat) values.

Lon diff:

                                   Min             Mean              Max
Barycentric in develop:            -0.22601318     -0.0002460415     0.22294235
Barycentric with @fmahebert fix:   -0.014343262    1.10022975e-05    0.014279366
Inverse distance:                  -0.014343262    1.10022975e-05    0.014279366
Fan Han's bilinear approach:       -0.003364563    2.8866655e-06     0.001876831

This suggests that @fmahebert's fix gives barycentric similar performance to the inverse distance method, but both have min/max errors of about 0.01 degrees longitude, whereas the bilinear approach is roughly an order of magnitude closer to the model grid.
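The idea behind this coordinate-as-field test can be illustrated with a toy example. The sketch below is not the OOPS or GSI code: it assumes a regular planar grid with unit spacing (the real FV3 grid is a cube sphere), stores each node's longitude as the field value, and interpolates to one made-up observation location with hand-written bilinear and inverse-distance weights over the four surrounding nodes. All names (`node_lon`, `cell_corners`, and so on) are hypothetical.

```python
import math

# Toy regular grid (an assumption): nodes at (X0 + i*DX, Y0 + j*DY).
X0, Y0, DX, DY = 0.0, 0.0, 1.0, 1.0

def node_lon(i, j):
    """Field value at node (i, j): the node's own longitude."""
    return X0 + i * DX

def cell_corners(x, y):
    """Indices of the four grid nodes surrounding the point (x, y)."""
    i = math.floor((x - X0) / DX)
    j = math.floor((y - Y0) / DY)
    return [(i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)]

def bilinear(x, y, field):
    """Bilinear interpolation from the four surrounding nodes."""
    i, j = cell_corners(x, y)[0]
    tx = (x - (X0 + i * DX)) / DX  # fractional position within the cell
    ty = (y - (Y0 + j * DY)) / DY
    return ((1 - tx) * (1 - ty) * field(i, j)
            + tx * (1 - ty) * field(i + 1, j)
            + (1 - tx) * ty * field(i, j + 1)
            + tx * ty * field(i + 1, j + 1))

def inverse_distance(x, y, field):
    """Inverse-distance weighting (power 1) over the same four nodes."""
    num = den = 0.0
    for i, j in cell_corners(x, y):
        d = math.hypot(x - (X0 + i * DX), y - (Y0 + j * DY))
        if d == 0.0:
            return field(i, j)  # point coincides with a node
        num += field(i, j) / d
        den += 1.0 / d
    return num / den

# Made-up observation location; the error is interpolated lon minus obs lon.
obs_lon, obs_lat = 10.3, 20.6
lon_err_bilinear = bilinear(obs_lon, obs_lat, node_lon) - obs_lon
lon_err_idw = inverse_distance(obs_lon, obs_lat, node_lon) - obs_lon
print(lon_err_bilinear, lon_err_idw)
```

On this toy grid, bilinear weights reproduce a linear field such as longitude to within rounding, while inverse-distance weights generally do not. The magnitudes here depend entirely on the unit grid spacing chosen for the sketch and are not comparable to the numbers reported above.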

danholdaway commented 2 years ago

Inverse dist and baryfix look like they might be identical. Is that expected? Is bilinear an option in the current develop code?

CoryMartin-NOAA commented 2 years ago

@danholdaway I'm double-checking that. I noticed it too and want to make sure I didn't copy the wrong thing. Bilinear is not an option in develop, but it is in https://github.com/JCSDA-internal/oops/pull/1790

CoryMartin-NOAA commented 2 years ago

For some reason, @fmahebert's branch didn't seem to accept inverse distance as an option. Here it is in develop (min, mean, max): -0.22546387 -0.0002640795 0.22266006. The numbers are actually very similar to the develop barycentric. Do they use the same weights somehow? (They shouldn't...)

danholdaway commented 2 years ago

Could there be a bug? I wouldn't expect them to be identical, which they seem to be.

fmahebert commented 2 years ago

I would suggest not spending too much time evaluating the oops inverse distance and barycentric options, as they are both known to be wrong. Reminder also that my branch is just a partial fix. I'm comforted to see that Fan's branch seems to be performing about an order of magnitude better; this is comparable to the improvement that I have been seeing with my own complete (but not ready for external testing) bugfix branch.

CoryMartin-NOAA commented 2 years ago

OK, thanks @fmahebert. I will hold off on any additional testing until this is resolved.

NOAA-EMC / JEDI-T2O

Evaluate GeoVaL differences between GSI and JEDI #35