ioos / compliance-checker

Python tool to check your datasets against compliance standards
http://ioos.github.io/compliance-checker/
Apache License 2.0

Confirm error reporting levels for missing units and datatype checks for uint8 working correctly #973

Closed mwengren closed 1 year ago

mwengren commented 1 year ago

This is a repost of a comment on cf-convention/discuss: https://github.com/cf-convention/discuss/issues/191

Is the datatype error reported an issue we need to resolve? We'd have to reach out to the issue author for test files if so.

This is the comment regarding the IOOS Compliance Checker from the original issue:

Thanks for the suggestion. Providing long_name helps!

ubyte palette(rgb, eightbitcolor) ;
    palette:long_name = "Recommended color table: viridian" ;
    palette:units = "1" ;
    palette:valid_min = 0UB ;
    palette:valid_max = 255UB ;

Checking with the IOOS Compliance Checker reported missing units as an error. It also reports as an error: "§2.2 Data Types: The variable palette failed because the datatype is uint8". It seems to want char or string?

Adding valid_min and valid_max makes no difference, but seems to be a good idea.

cf-checker does not complain about ubyte, and reports missing units as an info item only.

The latest available standard for both compliance checkers is CF-1.8. Neither seems to check variables inside a group. Oh, how I wish the cf-conventions team would provide a checker, updated for each version!

benjwadams commented 1 year ago

"§2.2 Data Types: The variable palette failed because the datatype is uint8". It seems to want char or string?

CF 1.8 is the latest version the checker supports; CF 1.9 adds unsigned integer support.
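To illustrate why the same ubyte variable fails under one CF version and passes under another, here is a minimal sketch of a version-dependent type check. It is not the compliance-checker's actual implementation: the type lists are an assumed reading of §2.2 of the CF conventions and should be verified against the spec, and `is_allowed_type` is a hypothetical helper name.

```python
# CDL type names permitted by CF section 2.2.
# Assumption: these lists reflect the CF 1.8 and CF 1.9 documents; verify before relying on them.
CF_1_8_TYPES = frozenset({"string", "char", "byte", "short", "int", "float", "double"})
CF_1_9_TYPES = CF_1_8_TYPES | {"ubyte", "ushort", "uint", "int64", "uint64"}

ALLOWED_TYPES = {"1.8": CF_1_8_TYPES, "1.9": CF_1_9_TYPES}


def is_allowed_type(cdl_type: str, cf_version: str) -> bool:
    """Return True if the CDL type name is permitted under the given CF version."""
    return cdl_type in ALLOWED_TYPES[cf_version]


# The palette variable from the report is declared as ubyte (uint8).
palette_under_1_8 = is_allowed_type("ubyte", "1.8")
palette_under_1_9 = is_allowed_type("ubyte", "1.9")
```

Under these assumed lists, the ubyte declaration is rejected when checked against CF 1.8 and accepted against CF 1.9, which matches the behavior described in the report.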

We don't have a ton of files with groups, I'll have a look at the example file.

gfireman commented 1 year ago

Here's a real-life example of a CDL with groups, ubyte datasets, and datasets without units: SNPP_VIIRS.20221111T065400.GEO.cdl.txt

benjwadams commented 1 year ago

Neither seems to check variables inside a group.

Can confirm. netCDF4.Dataset.variables only returns variables within the root group. We'll have to rework the calls to dataset.variables to take variables in groups into account.
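One possible way to reach variables in nested groups is a recursive walk over the `.variables` and `.groups` mappings that netCDF4 `Dataset` and `Group` objects both expose. The sketch below is only an illustration, not the approach the checker adopted: `iter_all_variables` is a hypothetical helper, and the stand-in objects (including the group and variable names) are made up so the example runs without a data file.

```python
from types import SimpleNamespace


def iter_all_variables(group, path=""):
    """Yield (full_path, variable) for every variable in a group and its subgroups."""
    for name, var in group.variables.items():
        yield f"{path}/{name}", var
    for child_name, child in group.groups.items():
        yield from iter_all_variables(child, f"{path}/{child_name}")


# Stand-ins that mimic the .variables/.groups mappings of netCDF4 objects.
# With a real file, pass netCDF4.Dataset("file.nc") instead of `root`.
geo = SimpleNamespace(variables={"latitude": "lat-var"}, groups={})
root = SimpleNamespace(
    variables={"palette": "palette-var"},
    groups={"geolocation_data": geo},
)

root_only = list(root.variables)  # what a root-only check sees
all_paths = [p for p, _ in iter_all_variables(root)]  # includes nested groups
```

Here `root_only` contains just `palette`, while `all_paths` also includes `/geolocation_data/latitude`, which is the kind of variable a root-only check would miss.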

benjwadams commented 1 year ago

Both checkers (compliance-checker and cf-checker) look at the root group only for the most part, unless I've misread the source code. Our CF 1.8 checker has some routines that check whether certain names appear outside of the root group, among other things. Checking some of these variables in groups and nested groups is more complicated than it looks, but I'll think over the best way to do this.

benjwadams commented 1 year ago

Closing because the new CF 1.9 checker adds support for unsigned variable data types. Cross-group checking is more involved and will require further implementation; feel free to open a separate issue.