aportagain opened 5 years ago
@gregchalmers and I had a chat about this, and we think that for the CF-JSON representation, storing the multiple distinct meanings of missing values in a separate variable is the way to go. In that case (and excluding the round-trip issue for now), I think the CF conventions around ancillary data and flags might already / still be sufficient...
The data variable (e.g., `hmo`) can use the `ancillary_variables` attribute (http://cfconventions.org/cf-conventions/cf-conventions.html#ancillary-data) to reference the status flag variable (e.g., `hmo_status_flag`), which will have the `standard_name` attribute `status_flag`. This is in line with the CF conventions wording to use the `ancillary_variables` attribute "when one data variable provides metadata about the individual values of another data variable".
The status flag variable (`hmo_status_flag`), for the case of mutually exclusive values, can then use the `flag_values` and `flag_meanings` attributes (http://cfconventions.org/cf-conventions/cf-conventions.html#flags), where the `flag_meanings` attribute "is a string whose value is a blank separated list of descriptive words or phrases, one for each flag value".
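As a rough illustration of that pairing, here is a minimal Python sketch that turns the two attributes into a lookup table. The helper name `flag_lookup` is hypothetical (it is not part of CF-JSON or any library), and the flag values and meanings are the ones from the `hmo_status_flag` example in this comment.

```python
def flag_lookup(flag_values, flag_meanings):
    """Pair each flag value with its meaning.

    flag_meanings is a blank-separated string with one word per flag value,
    as described in the CF conventions section on flags.
    """
    meanings = flag_meanings.split()
    if len(meanings) != len(flag_values):
        raise ValueError("flag_values and flag_meanings differ in length")
    return dict(zip(flag_values, meanings))


# Values taken from the hmo_status_flag example (mutually exclusive flags).
lookup = flag_lookup([0, 1, 2], "all_good missing_data masked_land")
print(lookup[2])  # masked_land
```

Because the meanings are split on whitespace, each meaning has to be a single word, which is why the example uses underscores.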
So the way I understand it at the moment, this would look something like this:
{
  "dimensions": { "lat": 4 },
  "attributes": {},
  "variables": {
    "lat": {
      "attributes": {"standard_name": "latitude"},
      "shape": [ "lat" ],
      "data": [ 1, 2, 3, 4 ]
    },
    "hmo": {
      "attributes": {
        "standard_name": "sea_surface_wave_significant_height",
        "ancillary_variables": "hmo_status_flag"
      },
      "shape": [ "lat" ],
      "data": [ 1.1, null, null, 4.4 ]
    },
    "hmo_status_flag": {
      "attributes": {
        "standard_name": "status_flag",
        "flag_values": "[0, 1, 2]",
        "flag_meanings": "all_good missing_data masked_land"
      },
      "shape": [ "lat" ],
      "data": [ 0, 1, 2, 0 ]
    }
  }
}
@gregchalmers, could you have a look at the example above and let me know if you think that would serve the purpose? I'd still need to double-check a few things, but if in principle this approach looks good, I'm fairly optimistic we wouldn't have to add anything to the actual CF-JSON spec, just comment and point people towards the relevant existing bits in the CF conventions.
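To make the consumer side concrete, here is a hedged Python sketch of how a client might use the flag variable to explain each `null` in `hmo`. It is only an illustration under a few assumptions: the function name `explain_values` is made up, the data is one-dimensional as in the example, and `flag_values` may arrive either as a JSON array or as the string form used above (the sketch accepts both).

```python
import json

# Trimmed copy of the example document (the "lat" variable is omitted).
doc = json.loads("""
{
  "dimensions": {"lat": 4},
  "variables": {
    "hmo": {
      "attributes": {"ancillary_variables": "hmo_status_flag"},
      "shape": ["lat"],
      "data": [1.1, null, null, 4.4]
    },
    "hmo_status_flag": {
      "attributes": {
        "standard_name": "status_flag",
        "flag_values": "[0, 1, 2]",
        "flag_meanings": "all_good missing_data masked_land"
      },
      "shape": ["lat"],
      "data": [0, 1, 2, 0]
    }
  }
}
""")


def explain_values(doc, name):
    """Return (value, status meaning) pairs for a 1-D data variable."""
    variables = doc["variables"]
    var = variables[name]
    # ancillary_variables is a blank-separated list of variable names.
    for anc_name in var["attributes"].get("ancillary_variables", "").split():
        flag_var = variables[anc_name]
        if flag_var["attributes"].get("standard_name") == "status_flag":
            break
    else:
        raise KeyError(f"no status_flag ancillary variable for {name!r}")
    values = flag_var["attributes"]["flag_values"]
    if isinstance(values, str):  # assumption: tolerate the string form
        values = json.loads(values)
    meanings = dict(zip(values, flag_var["attributes"]["flag_meanings"].split()))
    return [(v, meanings[f]) for v, f in zip(var["data"], flag_var["data"])]


result = explain_values(doc, "hmo")
for value, status in result:
    print(value, status)
```

With the example data, the two `null` entries come back labelled `missing_data` and `masked_land` respectively, which is the kind of explanation a client could show to users.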
Converting from a binary format, @gregchalmers would like to "tell the users why a value error was returned like _FillValue or a masked value" (https://github.com/metocean/interp-cxx/issues/13). How can we best do this in CF-JSON (possibly including the round-trip issue, if that's something we can or want to ensure for this case)?