Open ethanrd opened 3 years ago
Maybe "reserved for the netCDF format itself and not intended for use by end-users"?
Possibly relevant is that the streaming formats (DAP2 and DAP4) insert underscore attributes that have special meaning to those protocols. So data being streamed should also avoid underscore attributes, to prevent name conflicts.
As @DennisHeimbigner points out, we should keep in mind that while some of these underscore attributes are encoded in the file directly, others are added after the fact (occasionally encoded directly, but often added in-memory by a specific implementation). An example of the latter from the netCDF-Java library would be the _Coordinates attribute, which IOSPs can add to help in the coordinate system layer of the netCDF-Java library (even to existing netCDF files); an example from the netCDF-C side would be the _IsNetcdf4 or _SuperblockVersion attributes. The attributes added by dap2 and dap4 are another example.
I think we need to make a clear distinction between the two cases, with a heavy emphasis on those encoded into the on-disk format. For example, we should at a minimum answer the questions "what underscore attributes must be encoded into a file in order for it to be considered a netCDF file?", and "what format should their values take?". Since netCDF-4 files are not versioned, we can only say "must" about things that are true of files created by netCDF-C v4.0.0 onward. We can then add recommendations (strong ones, even) about new attributes that have appeared on the scene (e.g. _NCProperties), including when they first started showing up and the motivation for their addition, but we cannot say "must" at this point.
Once that's settled, we could add a more generic "Any other underscore attributes, whether encoded directly in the file or added in-memory, are reserved for use by individual netCDF libraries." The _Coordinates attribute falls under this category. I would go even further and say that we should state that, in general, encoding new underscore attributes into a file is strongly discouraged in favor of using existing metadata conventions.
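To make the proposed rule concrete, here is a rough sketch of how a tool might sort attribute names under it. The function name and category labels are hypothetical, and the set of names is just the ones mentioned in this thread, not a complete or authoritative list.

```python
# Underscore attributes named in this thread (illustrative only, NOT exhaustive).
MENTIONED_IN_THREAD = {
    "_Coordinates",        # netCDF-Java example
    "_IsNetcdf4",          # netCDF-C example
    "_SuperblockVersion",  # netCDF-C example
    "_NCProperties",       # newer attribute discussed above
}


def classify_attribute(name):
    """Return a coarse, hypothetical category for an attribute name.

    'user'      -> no leading underscore, free for end users
    'known'     -> leading underscore and named in this thread
    'reserved'  -> any other leading-underscore name
    """
    if not name.startswith("_"):
        return "user"
    if name in MENTIONED_IN_THREAD:
        return "known"
    return "reserved"


def reserved_names(attr_names):
    """List the names that the proposed wording would treat as reserved."""
    return [n for n in attr_names if classify_attribute(n) != "user"]
```

Something like `reserved_names(["units", "_Coordinates", "_myFlag"])` would flag the last two, which is the kind of check a writer-side linter could run before encoding attributes into a file.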
Namespaces for attribute names? (Not that this helps the current issue.)
The LinkedData for netCDF folks are using bald__ (Binary Array Linked Data) as a pseudo namespace prefix. CF had an attribute namespace discussion (Trac 27) years ago that was leaning to use a colon (':') but then the discussion stalled.
Would it be appropriate to have a namespace "standard" for attribute names defined in the NUG? Or are CF and other standards a better place for that? (Having it in the NUG would allow us to reserve "nc__" or whatever for future use.)
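For illustration only, a double-underscore pseudo namespace like the ones above could be split off an attribute name roughly as follows. The helper name, the default separator, and the example attribute names are assumptions made for this sketch; nothing here is defined by the NUG, CF, or any netCDF library.

```python
def split_namespace(name, sep="__"):
    """Split 'prefix__rest' into (prefix, rest).

    Returns (None, name) when there is no separator, or when either
    side of the first separator would be empty.
    """
    prefix, found, rest = name.partition(sep)
    if not found or not prefix or not rest:
        return None, name
    return prefix, rest


# Hypothetical attribute names, used only to exercise the helper.
examples = ["bald__example", "nc__future", "units", "_NCProperties"]
parsed = {n: split_namespace(n) for n in examples}
```

Under this sketch a single leading underscore (as in _NCProperties) is not treated as a namespace, so the reserved-underscore rule and a prefix-based namespace could coexist.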
The NUG "Attribute Conventions" appendix states that
This should be changed to "reserved for use by netCDF libraries" (or "netCDF implementations").
The phrase
is used in the following sections:
Note: Or, to be super clear, maybe "software that directly implements reading and writing of netCDF datasets". Except that doesn't deal with libraries that wrap the HDF library. Drop "directly" or switch from reads and writes to "(un)encodes". Yuk! Maybe not so explicit is better.