OceanGlidersCommunity / OG-format-user-manual

OceanGliders format and vocabularies

Reconsider the use of NetCDF-4 groups for instrument metadata #119

Open kerfoot opened 2 years ago

kerfoot commented 2 years ago

Proponent(s): @kerfoot

Moderator: @OceanGlidersCommunity/format-maintainers

Describe the error

Filing this issue to reconsider the use of NetCDF-4 groups to store instrument metadata for the following reasons:

  1. It is unclear how, or whether, ERDDAP will handle multiple groups when aggregating two or more data sets. I've submitted this question to the ERDDAP Google Group and strongly recommend reading this thread for background and the current thinking and advice.
  2. The current strategy is not a CF-approved use of groups. See CF guidance on use of groups for more information.
  3. To my knowledge, the currently proposed NetCDF-4 groups strategy has not been tested, either by serving a test data set or by aggregating the CDL example, to ensure that it addresses the original question of how to handle instrument metadata. Please comment with examples if this has been done and I have missed it.
  4. Members of the community have expressed reservations about using groups in this way, largely for the reasons mentioned above, and there are likely others.

    Example

    I took the CDL example, created a NetCDF file and served it up via ERDDAP:

    http://slocum-test.marine.rutgers.edu/erddap/tabledap/sp041_20191205T1757_n_measurements.html

    As you can see, the data set does not include any of the group metadata. I have tried multiple EDDTableFromFiles ERDDAP data types, but cannot get the groups included in the ERDDAP data set. I filed an additional set of questions on this topic, and Bob Simons responded on September 23, 2022 with comments and suggestions. I would also suggest giving this thread a thorough reading before our next DMTT meeting.

    Potential solution

    Use empty scalar variables to attach instrument metadata. Here is an example in which I modified the CDL template from above to remove the groups and replace them with scalar variables:

    http://slocum-test.marine.rutgers.edu/erddap/tabledap/sp041_20191205T1757_n_measurements_instrument_scalars.html
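To make the scalar-variable idea concrete, here is a minimal sketch in Python that renders group-free CDL text, with each instrument represented by an empty scalar variable whose attributes carry the metadata. This is not the CDL served at the link above: the variable name `instrument_ctd`, the attribute names, the `byte` type, and the helper functions are all hypothetical and chosen only for illustration.

```python
def scalar_instrument_cdl(name, attrs, var_type="byte"):
    """Render one empty scalar variable as CDL, with metadata as attributes."""
    lines = [f"    {var_type} {name} ;"]
    for key, value in attrs.items():
        # Escape backslashes and double quotes so the CDL string stays valid.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        {name}:{key} = "{escaped}" ;')
    return "\n".join(lines)


def render_cdl(dataset_name, instruments):
    """Build a minimal, group-free CDL document from {variable_name: attributes}."""
    body = "\n".join(
        scalar_instrument_cdl(name, attrs) for name, attrs in instruments.items()
    )
    return f"netcdf {dataset_name} {{\nvariables:\n{body}\n}}"


# Hypothetical instrument: the names and values below are made up.
instruments = {
    "instrument_ctd": {
        "long_name": "Example CTD",
        "serial_number": "0000",
    },
}

cdl = render_cdl("example_deployment", instruments)
print(cdl)
```

The resulting text could be passed to `ncgen` to produce a NetCDF file for testing. Because the metadata lives in ordinary variable attributes at the root level, there are no groups for a downstream tool to skip.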

    There is an additional option, though I'm less enthusiastic about implementing it in this spec as I don't believe it provides as elegant a solution. However, we should discuss both options before proceeding.

    Platforms affected

    None

    Additional context

    As @vturpin mentioned in an email sent out September 23, 2022, it would be good to get Kevin O'Brien on the next DMTT call for some additional thoughts on this subject.

castelao commented 2 years ago

Thanks for opening this issue @kerfoot .

Great idea! It would be great to have @kevin-obrien 's expertise to help us.

About question 2, what is the exact issue with CF Conventions?

Could we list all the reservations here? It is difficult to weigh in without knowing what the issues are.

JuangaSocib commented 2 years ago

Hi all, very interesting issue here, thanks @kerfoot. I read carefully all the links and these are my conclusions so far:

justinbuck commented 2 years ago

@kerfoot useful thread on the google ERDDAP group is here: https://groups.google.com/g/erddap/c/7g6ecOZZNNU Includes feedback on the sensor metadata.

The discussion is ERDDAP focused, but the key thing is that we are declaring the file to be CF compliant, so we should be reaching a level of compliance that enables the files to work in common tools that depend on CF, e.g. ERDDAP, Panoply, etc.