Open emiliom opened 8 years ago
@RussellSenior has been working on this and has a preliminary draft of a CDL (at least in terms of fields).
@emiliom my plan is to use the same structure as this: http://data.nodc.noaa.gov/testdata/netCDFTemplateExamples/timeSeries/BodegaMarineLabBuoy.cdl (this is the Orthogonal Timeseries example from https://www.nodc.noaa.gov/data/formats/netcdf/v1.1/). I think the station_name variable would be replaced by the deploymentid in our case, and we'd only have a single instrument (at least initially ... we might need more instruments if we pull in "external" data, for example for tide height in order to have a measure of depth when attached to a fixed structure). Does that sound reasonable?
Jan. 14, From @RussellSenior:
Here's a link to my work-in-progress for a randomly chosen CT sensor deployment (deploymentid = 44): http://www.stccmop.org/data/NCEI/wip/44.nc
I switched to using traditional coordinate variables, which made the compliance checker somewhat more happy. Again, many attribute values are still empty, so this is definitely still a work in progress. When invoked as follows:
python cchecker.py --test cf --criteria strict --verbose
the compliance checker has the following complaints summarized at the bottom:
--------------------------------------------------------------------------------
Reasoning for the failed tests given below:
Name Priority: Score:Reasoning
--------------------------------------------------------------------------------
3.1 Variables contain valid CF Units    :3:  0/ 1 : unknown units type (psu) for salinity
3.1 Variables contain valid units for t :3:  5/ 7 : units are C, standard_name units should be K, units are psu, standard_name units should be 1
5.2 Latitude and longitude coordinates  :3:  0/ 3 :
    conductivity                        :3:  0/ 1 :
        coordinates_reference_itself    :3:  0/ 1 : Variable conductivity's coordinate references itself
    salinity                            :3:  0/ 1 :
        coordinates_reference_itself    :3:  0/ 1 : Variable salinity's coordinate references itself
    temperature                         :3:  0/ 1 :
        coordinates_reference_itself    :3:  0/ 1 : Variable temperature's coordinate references itself
I understand the units complaints, but I'm not so sure about the coordinates_reference_itself ones.
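For what it's worth, the check name suggests that the checker flags any variable whose `coordinates` attribute lists the variable's own name. Here is a minimal Python sketch of that idea, under that assumption. The dictionary layout and the `self_referencing` helper are hypothetical and are not the checker's actual code:

```python
def self_referencing(variables):
    """Return names of variables whose `coordinates` attribute lists the variable itself.

    `variables` maps a variable name to a dict of its attributes (hypothetical layout).
    """
    flagged = []
    for name, attrs in variables.items():
        # CF's `coordinates` attribute is a blank-separated list of variable names.
        listed = attrs.get("coordinates", "").split()
        if name in listed:
            flagged.append(name)
    return sorted(flagged)


# Illustrative attributes only; not taken from 44.nc.
example = {
    "temperature": {"coordinates": "time lat lon temperature"},
    "salinity": {"coordinates": "time lat lon"},
    "time": {},
}
print(self_referencing(example))
```

If that reading is right, dropping each variable's own name from its `coordinates` attribute should clear the complaint.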
I managed to figure out why my nc files were so huge, it was a combination of netcdf4 and an unlimited dimension and a degenerate default chunksize. Fixed now, file much smaller (~6.5MB instead of 340MB):
Today's version is still missing important attributes, but many more of the low-hanging fruit have been plucked. It currently passes a strict cf compliance check (as above) with only the units complaints.
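One plausible reading of the units complaints: UDUNITS parses `C` as coulomb rather than degrees Celsius, and `psu` is not a UDUNITS unit, whereas CF treats practical salinity as dimensionless (`1`). Here is a sketch of a possible remapping, assuming these replacements are acceptable for the variables in question. `UNIT_FIXES` and `fix_units` are hypothetical names, not part of any library:

```python
# Hypothetical mapping from the flagged unit strings to CF-friendly ones.
UNIT_FIXES = {
    "C": "degree_Celsius",  # convertible to K, the canonical unit for temperature
    "psu": "1",             # practical salinity is dimensionless in CF
}


def fix_units(units):
    """Return the replacement for a flagged unit string, or the input unchanged."""
    return UNIT_FIXES.get(units, units)


for original in ("C", "psu", "S m-1"):
    print(original, "->", fix_units(original))
```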
This version fixes the units for at least the limited test case, and currently gets a perfect 244/244 score on the strict cf compliance check (that doesn't mean it's actually perfect!):
Thanks for these updates, @RussellSenior, and great to see the progress! Apologies for not being on top of it myself last week. I'll try to dedicate time to focus on this tomorrow (Monday).
I still need to look more closely at this: "I switched to using traditional coordinate variables, which made the compliance checker somewhat more happy."
And regarding this, well, WOW:
I managed to figure out why my nc files were so huge, it was a combination of netcdf4 and an unlimited dimension and a degenerate default chunksize. Fixed now, file much smaller (~6.5MB instead of 340MB)
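To illustrate why a degenerate chunk size hurts, here is a back-of-the-envelope Python sketch. It assumes the degenerate default was a chunk length of 1 along the unlimited dimension (the thread does not state the actual value), so that every record becomes its own chunk that the HDF5 layer has to index separately. The record count and the alternative chunk length are made up for illustration:

```python
import math


def chunk_count(n_records, chunk_len):
    """Number of chunks needed to hold n_records along one dimension."""
    if chunk_len < 1:
        raise ValueError("chunk_len must be at least 1")
    return math.ceil(n_records / chunk_len)


n_records = 1_000_000  # hypothetical series length
for chunk_len in (1, 4096):
    print(chunk_len, chunk_count(n_records, chunk_len))
```

With netCDF4-python, the chunk shape can be set explicitly through the `chunksizes` argument of `Dataset.createVariable`.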
@cseaton, I'm opening this issue to help us track and discuss this to-do item from our meeting. Let us know if you and Russell have made progress since then. BTW, please ping Russell on a reply to this issue, so I have his GitHub profile and so he's automatically included in follow-ups.
What I had in mind by "drafting" the CDL is that it may be easiest to manually craft a bare-bones CDL for discussion, before doing any coding or actual writing of netCDF files.
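As a starting point for that discussion, here is a small Python sketch that prints a bare-bones CDL skeleton along the lines of the orthogonal time series layout discussed in this thread. All names, units, and attribute values in it (`deployment_id`, the variable list, `CF-1.6`, the time units) are placeholders assumed for illustration, not the agreed structure:

```python
def draft_cdl(name, data_vars):
    """Build a bare-bones CDL string for a single-deployment time series.

    `data_vars` maps a variable name to (standard_name, units); all values are placeholders.
    """
    lines = [
        f"netcdf {name} {{",
        "dimensions:",
        "\ttime = UNLIMITED ;",
        "variables:",
        "\tdouble time(time) ;",
        '\t\ttime:standard_name = "time" ;',
        '\t\ttime:units = "seconds since 1970-01-01 00:00:00 UTC" ;',
        "\tint deployment_id ;",
        '\t\tdeployment_id:cf_role = "timeseries_id" ;',
        "\tfloat lat ;",
        '\t\tlat:units = "degrees_north" ;',
        "\tfloat lon ;",
        '\t\tlon:units = "degrees_east" ;',
    ]
    for var, (standard_name, units) in data_vars.items():
        lines.append(f"\tfloat {var}(time) ;")
        lines.append(f'\t\t{var}:standard_name = "{standard_name}" ;')
        lines.append(f'\t\t{var}:units = "{units}" ;')
    lines += [
        "",
        "// global attributes:",
        '\t\t:Conventions = "CF-1.6" ;',
        '\t\t:featureType = "timeSeries" ;',
        "}",
    ]
    return "\n".join(lines)


print(draft_cdl("deployment_44", {
    "temperature": ("sea_water_temperature", "degree_Celsius"),
    "salinity": ("sea_water_practical_salinity", "1"),
}))
```

The printed text can be edited by hand and passed around for comment before any file-writing code exists.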