Closed pochedls closed 1 year ago
For a subset of models and domains, the CDAT-xCDAT differences were traced to CDAT dropping the model's latitude bounds (and then replacing the bounds with its own calculated bounds). This resulted in different weights and small differences in the area averages (see table at bottom).
First – an example of incorrect bounds:
Excerpt of lat_bnds
from ncdump -v lat_bnds /p/css03/cmip5_css02/data/cmip5/output1/BNU/BNU-ESM/historical/mon/atmos/Amon/r1i1p1/tas/1/tas_Amon_BNU-ESM_historical_r1i1p1_185001-200512.nc
:
lat_bnds = -87.8638000488281, -86.4801635742188, -86.4801635742188, -83.704719543457, -83.704719543457, -80.9192581176758, -80.9192581176758, -78.1312561035156, -78.1312561035156, -75.3422088623047, -75.3422088623047, -72.5526351928711,
These appear to match the bounds in cdscan. But they are different from those loaded using cdms2:
fn = '/p/user_pub/xclim/CMIP5/CMIP/historical/atmos/mon/tas/CMIP5.CMIP.historical.BNU.BNU-ESM.r1i1p1.mon.tas.atmos.glb-z1-gu.1.0000000.0.xml'
f = cdms2.open(fn)
tas = f('tas')
print(tas.getLatitude().getBounds())
[[-89.24743652 -86.48016358]
[-86.48016358 -83.70471955] [-83.70471955 -80.91925812] [-80.91925812 -78.13125229] [-78.13125229 -75.34220887] [-75.34220887 -72.55263519]
If I drop the lat_bnds
in xcdat and re-calculate them using .add_missing_bounds()
the spatial average differences are order 10-10. This resolves 15 / 28 CDAT-xCDAT differences. I think CDAT probably drops the latitude bounds systematically, but it only matters in these cases. xCDAT does not drop the model latitude bounds (and I don't think it should – we should trust the model data and the user should decide to drop this data).
model | domain | max absolute difference |
---|---|---|
BNU-ESM | global | 0.0337137471188953 |
BNU-ESM | NAM | 0.03921949346317888 |
BNU-ESM | SAM | 0.057204666153268136 |
FIO-ESM | global | 0.033644891283586276 |
FIO-ESM | NAM | 0.03955337948826809 |
FIO-ESM | SAM | 0.0555397851591124 |
bcc-csm1-1 | global | 0.03277778145468346 |
bcc-csm1-1 | NAM | 0.04099915332000137 |
bcc-csm1-1 | SAM | 0.056736843962255534 |
bcc-csm1-1-m | global | 0.005671941710033934 |
bcc-csm1-1-m | NAM | 0.006484684673694119 |
bcc-csm1-1-m | SAM | 0.009858142884070276 |
fio-esm | global | 0.033644891283586276 |
fio-esm | NAM | 0.03955337948826809 |
fio-esm | SAM | 0.0555397851591124 |
In another small set of cases (see table below), small differences seem to be introduced by the precision of the axis information.
For example: ncdump -h /p/css03/esgf_publish/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r10i1p1f1/Amon/tas/gr/v20180803/tas_Amon_IPSL-CM6A-LR_historical_r10i1p1f1_gr_185001-201412.nc | grep lat
The lat
axis is stored as a float
(i.e., float32
not double
). CDAT (type(tas.getLatitude().getBounds()[0, 0])
) reads this as numpy.float64
whereas xcdat reads (type(ds.lat_bnds.values[0, 0])
) this (properly) as numpy.float32
. These datasets do not have bounds (so they are calculated by both CDAT and xCDAT). The difference in precision introduces numerical roundoff (10-11 to 10-8) differences in the bounds, then the weights (10-11 0 10-8), then the spatial averages.
If I cast the lat
and lon
axes to float64
in the xcdat
dataset, recompute the bounds, and the spatial average, then xcdat and CDAT match. This resolves 6 of the 28 differences. This means 21 of 28 of the differences are resolved.
In the past (and repeated here), this issue had been resolved with my own averaging calculations. We could not figure out why my calculations differed from xarray
, though we did realize that modifying the precision mattered. I think the custom spatial averager likely worked because it was probably casting the arrays to the same type which led to np.allclose
resulting in True. I haven't tried to figure out exactly where the difference happens (it seems to stem from CDAT reading in the bounds with the wrong precision, which affects downstream calculations).
model | domain | max absolute difference |
---|---|---|
IPSL-CM6A-LR | SAM | 0.003096939779936747 |
IPSL-CM6A-LR | AllM | 0.007789411802434643 |
IPSL-CM6A-LR-INCA | global | 0.0037057772779576226 |
IPSL-CM6A-LR-INCA | SAM | 0.003101609951784212 |
IPSL-CM6A-LR-INCA | AllM | 0.007794648910362412 |
IPSL-CM6A-LR | global | 0.0037208329281384067 |
One set of cases that I don't fully understand is with CESM2-WACCM (see differences). The bounds and weights appear to be identical in CDAT and xCDAT. The differences are "all close" if I either a) manually calculate the spatial average (10-5 differences), b) reverse the axis order when averaging in xcdat (10-3 differences), or drop and recompute the latitude bounds (and weights) (10-7 differences).
Unlike with IPSL data (above) the axes seem to be stored (and opened) with double
precision. The bounds are stored as float
in the netCDF file. Both xcdat and CDAT bounds/weights seem to open as float32
. But if I drop the lat_bnds
and re-create them in xcdat, they are float64
and I get very close to the CDAT spatial averages (10-7 differences). So I think this is another precision/roundoff type difference.
If we accept this as an explanation, this resolves 4/28 differences (3/28 left to figure out).
model | domain | max absolute difference |
---|---|---|
CESM2-WACCM | tropics | 0.004078462585937359 |
CESM2-WACCM | global | 0.006396168779986056 |
CESM2-WACCM | SAM | 0.0029629751547304295 |
CESM2-WACCM | AllM | 0.008556697618530507 |
The final set of differences (see table below) appear to be related to incorrect longitude weights assigned by CDAT in rare instances.
For example, for the Sahel domain (/p/css03/esgf_publish/CMIP6/CMIP/NUIST/NESM3/historical/r1i1p1f1/Amon/tas/gn/v20190630/
) which has the bounds 13 - 18oN and -10 to 10oE, CDAT gives the following relative weights:
lon_bnd | relative weight |
---|---|
[-9.375 -7.5 ] | 1.875 |
[-7.5 -5.625] | 1.875 |
[-5.625 -3.75 ] | 1.875 |
[-3.75 -1.875] | 1.875 |
[-1.875 0. ] | 1.875 |
[0. 1.875] | 1.875 |
[1.875 3.75 ] | 1.875 |
[3.75 5.625] | 1.875 |
[5.625 7.5 ] | 1.875 |
[7.5 9.375] | 1.875 |
[ 9.375 10. ] | 0.625 |
It correctly adjusts the bound of the lat grid cell (from 9.375 : 11.25 to 9.375 : 10.), but it drops weight from the grid cell spanning (-11.25 : -9.375) [it isn't listed]. xCDAT does weight this grid cell:
lon_bnd | relative weight |
---|---|
[-11.25 -9.375] | 0.625 |
[-9.375 -7.5 ] | 1.875 |
[-7.5 -5.625] | 1.875 |
[-5.625 -3.75 ] | 1.875 |
[-3.75 -1.875] | 1.875 |
[-1.875 0. ] | 1.875 |
[0. 1.875] | 1.875 |
[1.875 3.75 ] | 1.875 |
[3.75 5.625] | 1.875 |
[5.625 7.5 ] | 1.875 |
[7.5 9.375] | 1.875 |
[ 9.375 11.25 ] | 0.625 |
Note the partial weights in the first and last grid cells. I don't know why CDAT doesn't provide the correct weights in these rare instances. This resolves the final 3 CDAT / xCDAT discrepancies.
model | domain | max absolute difference |
---|---|---|
NESM3 | Sahel | 0.15666759125781482 |
NESM3 | GoG | 0.036151854691411245 |
NESM3 | SAmo | 0.10544204167973703 |
The issue that prompted the PR has to do with an E3SM dataset with unusual bounds.
The first set of bounds span 359.6484375 to 0.3515625. This appears to be inconsistent with CF-conventions (see, e.g., here) which states that bounds should be increasing. We don't have examples of bounds like this in the CMIP archive. If bounds wrap around the prime meridian they are stored in increasing order, e.g., [ -0.9375, 0.9375]
for /p/css03/esgf_publish/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/
.
The weird E3SM bounds are not handled correctly by the spatial averager in v0.5.0
. This issue is caught and fixed in the current PR.
Thanks for documenting the 28 cases where you found differences in the spatial averaging results between CDAT and xCDAT. I was able to follow your findings for these cases and am satisfied with the explanations.
One set of cases that I don't fully understand is with CESM2-WACCM (see differences). The bounds and weights appear to be identical in CDAT and xCDAT. The differences are "all close" if I either a) manually calculate the spatial average (10-5 differences), b) reverse the axis order when averaging in xcdat (10-3 differences), or drop and recompute the latitude bounds (and weights) (10-7 differences).
Unlike with IPSL data (above) the axes seem to be stored (and opened) with double precision. The bounds are stored as float in the netCDF file. Both xcdat and CDAT bounds/weights seem to open as float32. But if I drop the lat_bnds and re-create them in xcdat, they are float64 and I get very close to the CDAT spatial averages (10-7 differences). So I think this is another precision/roundoff type difference.
If we accept this as an explanation, this resolves 4/28 differences (3/28 left to figure out).
As mentioned in this comment, we found that CDAT incorrectly type casts weights to np.float64
even though bounds in the input dataset are np.float32
(source notebook). There might be some subtle type casting going on resulting in precision differences. We probably shouldn't be worried here.
I am also happy with where things are. I think this can be closed with #495.
A recent issue prompted me to look through (and update) the spatial averaging routine.
I validated this PR by looking at monthly CMIP5/6 historical tas files (1 per model for 117 climate models) and 21 averaging domains (I skipped NAFM due to this discussion).
In validating the proposed changes, I have (again) come across CDAT-xCDAT differences. In 28 cases of 2415 cases (1.1%) the spatial averages are different (as judged by
np.allclose()
). The monthly differences are mostly below 0.01 K, but exceed 0.1 (in rare cases). These results apply to both the current release v0.5.0 and the proposed PR.I plan to summarize the reasons for the differences in this issue.