PCMDI / cmor

Climate Model Output Rewriter
BSD 3-Clause "New" or "Revised" License

avoid attributes on bounds of auxiliary coordinates (`vertices_latitude` / `vertices_longitude`) #729

Open larsbuntemeyer opened 8 months ago

larsbuntemeyer commented 8 months ago

For example, in MPI ocean model grids there are auxiliary 2D coordinates longitude / latitude whose bounds, vertices_longitude / vertices_latitude, carry the attributes units, missing_value and _FillValue:

variables:
    double time(time) ;
        time:bounds = "time_bnds" ;
        time:units = "days since 1850-1-1 00:00:00" ;
        time:calendar = "proleptic_gregorian" ;
        time:axis = "T" ;
        time:long_name = "time" ;
        time:standard_name = "time" ;
    double time_bnds(time, bnds) ;
    int j(j) ;
        j:units = "1" ;
        j:long_name = "cell index along second dimension" ;
    int i(i) ;
        i:units = "1" ;
        i:long_name = "cell index along first dimension" ;
    double latitude(j, i) ;
        latitude:standard_name = "latitude" ;
        latitude:long_name = "latitude" ;
        latitude:units = "degrees_north" ;
        latitude:missing_value = 1.e+20 ;
        latitude:_FillValue = 1.e+20 ;
        latitude:bounds = "vertices_latitude" ;
    double longitude(j, i) ;
        longitude:standard_name = "longitude" ;
        longitude:long_name = "longitude" ;
        longitude:units = "degrees_east" ;
        longitude:missing_value = 1.e+20 ;
        longitude:_FillValue = 1.e+20 ;
        longitude:bounds = "vertices_longitude" ;
    double vertices_latitude(j, i, vertices) ;
        vertices_latitude:units = "degrees_north" ;
        vertices_latitude:missing_value = 1.e+20 ;
        vertices_latitude:_FillValue = 1.e+20 ;
    double vertices_longitude(j, i, vertices) ;
        vertices_longitude:units = "degrees_east" ;
        vertices_longitude:missing_value = 1.e+20 ;
        vertices_longitude:_FillValue = 1.e+20 ;
    float tos(time, j, i) ;
        tos:standard_name = "sea_surface_temperature" ;
        tos:long_name = "Sea Surface Temperature" ;
        tos:comment = "Temperature of upper boundary of the liquid ocean, including temperatures below sea-ice and floating ice shelves." ;
        tos:units = "degC" ;
        tos:original_name = "tos" ;
        tos:cell_methods = "area: mean where sea time: mean" ;
        tos:cell_measures = "area: areacello" ;
        tos:history = "2019-09-11T14:21:40Z altered by CMOR: replaced missing value flag (-9e+33) and corresponding data with standard missing value (1e+20)." ;
        tos:missing_value = 1.e+20f ;
        tos:_FillValue = 1.e+20f ;
        tos:coordinates = "latitude longitude" ;

It seems that those bounds shouldn't have any attributes, at least according to the CF conventions. For the upcoming CORDEX CMOR tables, we will make heavy use of auxiliary coordinates. My question is: does the CMOR API, e.g. cmor_grid, allow us to avoid writing missing_value and _FillValue? Even if I remove the units attribute, it writes an original_units attribute instead.

matthew-mizielinski commented 8 months ago

@larsbuntemeyer my reading of the CF conventions section 7.1 on cell boundaries includes

A boundary variable inherits the values of some attributes from its parent coordinate variable. If a coordinate variable has any of the attributes marked "BI" (for "inherit") in the "Use" column of Appendix A, Attributes, they are assumed to apply to its bounds variable as well. It is recommended that BI attributes not be included on a boundary variable. If a BI attribute is included, it must also be present in the parent variable, and it must exactly match the parent attribute’s data type and value. A bounds variable may have any of the attributes marked "BO" for ("own") in the "Use" column of Appendix A, Attributes. These attributes take precedence over any corresponding attributes of the parent variable. In these cases, the parent variable’s attribute does not apply to the bounds variable, regardless of whether the latter has its own attribute.

The _FillValue and missing_value attributes are marked BO in the table in Appendix A, so they may be included. units is marked BI, so provided this attribute is identical in the vertices_ variables and their "parent" coordinate, it is acceptable. As such I think that the CMOR output is compliant, but @taylor13 might be able to be more definitive.
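The rule quoted above can be sketched as a small check over attribute dictionaries. This is only an illustration, not part of CMOR or any checker: `check_bounds_attrs` is a hypothetical helper, and the BI set is an assumed subset (only `units` as BI and `_FillValue` / `missing_value` as BO are confirmed in this thread; the rest should be verified against Appendix A).

```python
# Illustrative subset of the CF Appendix A "Use" codes for bounds variables.
# "units" (BI) and "_FillValue"/"missing_value" (BO) come from the comment above;
# the other BI names are an assumption and should be checked against Appendix A.
BI_ATTRS = {"units", "standard_name", "axis", "positive", "calendar"}
BO_ATTRS = {"_FillValue", "missing_value"}


def check_bounds_attrs(parent, bounds):
    """Return (level, message) findings for one bounds variable (hypothetical helper)."""
    findings = []
    for name, value in bounds.items():
        if name in BO_ATTRS:
            continue  # "own" attributes are allowed on the bounds variable
        if name in BI_ATTRS:
            if name not in parent:
                findings.append(("error", f"{name}: missing on parent"))
            elif parent[name] != value or type(parent[name]) is not type(value):
                findings.append(("error", f"{name}: differs from parent"))
            else:
                # Allowed, but CF recommends leaving inheritable attributes off.
                findings.append(("info", f"{name}: redundant but consistent"))
    return findings


# Attributes copied from the ncdump header in the first comment.
latitude = {"standard_name": "latitude", "units": "degrees_north",
            "missing_value": 1e20, "_FillValue": 1e20}
vertices_latitude = {"units": "degrees_north",
                     "missing_value": 1e20, "_FillValue": 1e20}

print(check_bounds_attrs(latitude, vertices_latitude))
```

For the file above, the only finding is that `units` is redundant but consistent with the parent, which matches the reading that the output is compliant.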

I note that the examples in the CF conventions documentation are fairly minimal in the attributes included for the bounds variables. It might be useful if there were some more expansive examples, but I can see that it would also be valuable to keep that document concise.

larsbuntemeyer commented 8 months ago

Thanks @matthew-mizielinski for the link. I think you are right, and the output should be fine and compliant. I was unsure because cf-checker and compliance-checker both give me the same warnings for my examples (and also for the MPI file), saying that those attributes should not be there. However, I guess those details might not be implemented yet. I'll go and search their issues...

larsbuntemeyer commented 8 months ago

OK, it seems that before CF-1.11 the recommendation was that

Boundary variables should not have the _FillValue, missing_value, units, standard_name, axis, positive, calendar, leap_month, leap_year or month_lengths attributes.

which has changed in the latest recommendation of CF-1.11 (December 2023) to

Boundary variables should not include inheritable attributes, i.e. any of those marked "BI" in the "Use" column of Appendix A.

That simply doesn't seem to be implemented in the compliance-checker yet...
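To make the difference between the two recommendations concrete, here is a minimal sketch that applies both to the attribute names found on the vertices_ variables. `flagged` is a hypothetical helper, and the CF-1.11 set is an assumption: it takes the old list and drops the two fill attributes that were identified as "BO" earlier in this thread; the authoritative list is Appendix A.

```python
# Attribute names listed in the pre-CF-1.11 recommendation quoted above.
OLD_DISCOURAGED = {"_FillValue", "missing_value", "units", "standard_name", "axis",
                   "positive", "calendar", "leap_month", "leap_year", "month_lengths"}

# Assumption: under CF-1.11 the fill attributes are "BO" and the remaining
# names of the old list are "BI". Verify against Appendix A before relying on it.
NEW_DISCOURAGED = OLD_DISCOURAGED - {"_FillValue", "missing_value"}


def flagged(bounds_attr_names, discouraged):
    """Return the sorted attribute names that a given recommendation discourages."""
    return sorted(set(bounds_attr_names) & discouraged)


# Attributes present on vertices_latitude / vertices_longitude in the file above.
vertices_attrs = ["units", "missing_value", "_FillValue"]

print(flagged(vertices_attrs, OLD_DISCOURAGED))  # ['_FillValue', 'missing_value', 'units']
print(flagged(vertices_attrs, NEW_DISCOURAGED))  # ['units']
```

A checker that still implements the older wording would therefore warn about all three attributes, while the newer wording would only discourage `units`.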

taylor13 commented 8 months ago

@larsbuntemeyer @matthew-mizielinski I agree the above file is CF-compliant. What do the "checker" warnings say? Are they misleading?

You said "Even if i remove the units attribute, it will write some original_units attribute." How did you remove the units attribute? Did you edit the cmip6_grids.json file? I would have thought we could easily remove the "units" attribute for vertices by simply setting "units": "" for both vertices_longitude and vertices_latitude. Should I suggest that to Chris who is updating CMOR?

larsbuntemeyer commented 8 months ago

What do the "checker" warnings say? Are they misleading?

If I check against CF-1.7, it correctly warns:

compliance-checker -t cf:1.7 tos_Omon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_185001-186912.nc

gives


--------------------------------------------------------------------------------
IOOS Compliance Checker Report                         
Version 5.1.0                                  
Report generated 2024-02-21T08:26:06Z                      
cf:1.7                                     
http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html
--------------------------------------------------------------------------------
Corrective Actions                               
tos_Omon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_185001-186912.nc has 4 potential issues
                                Warnings                                    

§2.5 Variables

§4.1 Latitude Coordinate

§4.2 Longitude Coordinate

§7.1 Cell Boundaries

How did you remove the units attribute? Did you edit the cmip6_grids.json file? I would have thought we could easily remove the "units" attribute for vertices by simply setting "units": "" for both vertices_longitude and vertices_latitude.

Yes, that's what I tried: I set "units": "" for both vertices_longitude and vertices_latitude in the grids file, which results in an attribute original_units = "degrees_north", apparently derived from the parent latitude / longitude coordinate. Note that this all refers to 2D auxiliary coordinates (for 1D native coordinates and bounds, there are no attributes in most CMIP6 files I have looked at)...
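For reference, the table edit described here could be scripted roughly as follows. This is a sketch under assumptions: `blank_vertices_units` is a hypothetical helper, the `variable_entry` section name is assumed rather than confirmed in this thread, and the example table is a heavily trimmed stand-in, not the real grids file. It only reproduces the attempted edit; as reported above, CMOR then writes original_units instead.

```python
import copy
import json


def blank_vertices_units(table, section="variable_entry"):
    """Return a copy of a grids table with the vertices_* units set to "".

    The section name is an assumption about the table layout.
    """
    patched = copy.deepcopy(table)
    for name in ("vertices_latitude", "vertices_longitude"):
        entry = patched.get(section, {}).get(name)
        if entry is not None:
            entry["units"] = ""
    return patched


# Hypothetical, heavily trimmed stand-in for the grids table.
example_table = {
    "variable_entry": {
        "latitude": {"units": "degrees_north"},
        "vertices_latitude": {"units": "degrees_north"},
        "vertices_longitude": {"units": "degrees_east"},
    }
}

patched = blank_vertices_units(example_table)
print(json.dumps(patched["variable_entry"]["vertices_latitude"]))
```

Working on a copy keeps the original table untouched, so the patched version can be written to a separate file and compared against the unmodified one.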