ESMValGroup / ESMValTool

ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
https://www.esmvaltool.org
Apache License 2.0

Results of Su et al. (2014) emergent constraints are wrong #3680

Closed (schlunma closed this issue 6 days ago)

schlunma commented 1 week ago

Describe the bug

See https://github.com/ESMValGroup/ESMValTool/issues/3616#issuecomment-2186780972: the results from the Su et al. (2014) emergent constraint used in recipe_ecs_constraints.yml and recipe_schlund20esd.yml are wrong.

schlunma commented 1 week ago

This is very likely caused by problems in the observational data:

Pre-v2.11.0:

[image]

v2.11.0:

[image]

~This might be caused by https://github.com/ESMValGroup/ESMValTool/pull/3327.~ Will investigate further once Levante is up again.

schlunma commented 1 week ago

All right, I found the problem: this is caused by the valid_range attribute of the AIRS-2-0 observational data (/work/bd0854/DATA/ESMValTool2/OBS/Tier1/AIRS-2-0/hur_AIRS-2-0_L3_v2_200209-201105.nc on Levante):

netcdf hur_AIRS-2-0_L3_v2_200209-201105 {
...
float hur(time, plev, lat, lon) ;
    hur:units = "1" ;
    hur:standard_name = "relative_humidity" ;
    hur:long_name = "Relative Humidity" ;
    hur:cell_methods = "time: mean" ;
    hur:valid_range = 0.f, 1.5f ;
    hur:_FillValue = 1.e+20f ;
    hur:missing_value = 1.e+20f ;
...

Since the CMOR units of hur are %, these data are converted to % in our preprocessor. After reloading the data in the diagnostic, basically all values are masked now (most values are outside the valid_range of 0 to 1.5). This leads to the mask that can be seen here.
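The effect can be reproduced without the file itself. The following is a minimal sketch, assuming that the reader applies valid_range as an inclusive mask to the already rescaled values; it uses only numpy, and the sample values are made up for illustration rather than taken from the AIRS-2-0 file.

```python
import numpy as np

# Hypothetical relative-humidity values stored as fractions (units "1").
hur_fraction = np.array([0.05, 0.40, 0.85, 1.20])
valid_range = (0.0, 1.5)  # attribute as written in the file

# Preprocessor step: convert to the CMOR units (%), leaving valid_range untouched.
hur_percent = hur_fraction * 100.0

# On reload, every value outside the stale valid_range is masked.
reloaded = np.ma.masked_outside(hur_percent, *valid_range)

n_masked = int(np.ma.count_masked(reloaded))
print(f"{n_masked} of {reloaded.size} values masked")
```

The same values pass the check untouched while they are still fractions, which is consistent with the problem only appearing after the unit conversion.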

I have no idea why this only surfaces now. It looks like valid_range is not handled by iris, so I suspect it's a change in netCDF4 that's causing this (though I didn't find anything in their docs regarding that). Removing this attribute fixes this problem and gives the same results as in the pre-v2.11.0 testing.

@ESMValGroup/technical-lead-development-team how should we handle this? I think it would be best to address this in iris (there is already an open discussion about that, but it has been open for a year now with no changes). It looks like it's not possible to fix this in the diagnostic, since the data are already masked after loading (I also tried a callback, but the field passed to the callback is already masked as well). Thus, we can probably only do it in ESMValCore. The safest option would probably be a fix (or in this case, changing the CMORizing script), but it might also be useful to do that in the save function?

valeriupredoi commented 1 week ago

valid_range is a netCDF attribute that is set at the point where the data are created and written to file; it's part of the data specs (https://docs.unidata.ucar.edu/netcdf-c/current/attribute_conventions.html). When the data are loaded, the generic netCDF4.Dataset() object is created with a mask on the values outside the valid range. So yes, as you say, Manu, it's the C layer and then python-netcdf4 that handle that mask; iris simply accepts what it has been dealt by the underlying netCDF4 library.

valeriupredoi commented 1 week ago

What I'd do: I'd keep the valid_range but change it during CMORization so that it reflects the scaled values. Removing it won't hurt at all, though, as long as we are sure we apply the correct scale factor during CMORization, i.e. we don't get e.g. sea surface temperatures above 200 °C :grin:
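A rough sketch of the "keep but rescale" option, assuming the attributes are available as a plain dictionary: the helper name `rescale_with_valid_range` is hypothetical and not part of ESMValCore, iris, or netCDF4, and the sample values are invented.

```python
import numpy as np


def rescale_with_valid_range(data, attributes, factor):
    """Scale the data and, if present, its valid_range by the same factor.

    Hypothetical helper for illustration only.
    """
    scaled_attributes = dict(attributes)  # do not modify the caller's dict
    if "valid_range" in scaled_attributes:
        low, high = scaled_attributes["valid_range"]
        scaled_attributes["valid_range"] = (low * factor, high * factor)
    return np.asarray(data) * factor, scaled_attributes


# Convert fractions to percent and carry valid_range along.
data, attrs = rescale_with_valid_range(
    [0.05, 0.40, 1.20],
    {"units": "1", "valid_range": (0.0, 1.5)},
    100.0,
)
attrs["units"] = "%"

# Applying the rescaled range no longer masks physically sensible values.
masked = np.ma.masked_outside(data, *attrs["valid_range"])
```

Dropping the attribute instead amounts to deleting the `valid_range` key, at the cost of losing the sanity check on the scale factor.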

schlunma commented 6 days ago

Thanks! I will update the CMORizer script and probably remove the attribute after adjusting the mask 👍

schlunma commented 6 days ago

Never mind, it's an obs4mips dataset, so there is no CMORizer script. I will need to add a fix for this in ESMValCore. Sorry @ehogan, this will be another commit that needs to be cherry-picked 😢

ehogan commented 6 days ago

> Never mind, it's an obs4mips dataset, so there is no CMORizer script. I will need to add a fix for this in ESMValCore. Sorry @ehogan, this will be another commit that needs to be cherry-picked 😢

No problem! 👍