opengeospatial / netcdf-ld

Encoding standard to enable RDF graphs to be encoded in and interpreted from netCDF files
http://www.github.com/opengeospatial/netCDF-Classic-LD

Using the convention ":" as a namespace separator #50

Open martinjuckes opened 3 years ago

martinjuckes commented 3 years ago

I'm sorry if this has been covered elsewhere, but I was wondering why you use "__" rather than ":" or "@" to indicate that a variable has a namespace?

The colon has been supported since NetCDF 3.6.3 (according to the NUG), and "@" was supported before that. The "@" gives clearer CDL, e.g. float cmip@tas rather than float cmip\:tas, where the escape character is intrusive.
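To make the comparison concrete, here is a minimal Python sketch of how a prefixed variable name could be split for each of the three candidate separators, and how the name would look in CDL. The helper names split_prefixed_name and cdl_name are hypothetical and not part of any netCDF library. The escaping handles only ":", which is the single case shown in the CDL example above; it does not attempt to cover the full CDL escaping rules.

```python
# Candidate namespace separators discussed in this issue.
SEPARATORS = ("__", ":", "@")


def split_prefixed_name(name, separator):
    """Split 'prefix<sep>local' into (prefix, local).

    Returns (None, name) when the separator is absent or when either
    side of it would be empty. Hypothetical helper for illustration.
    """
    prefix, found, local = name.partition(separator)
    if not found or not prefix or not local:
        return None, name
    return prefix, local


def cdl_name(name):
    """Render a variable name as it would appear in CDL.

    Assumption: only ':' is escaped (with a backslash), mirroring the
    'cmip\\:tas' example in the comment above.
    """
    return name.replace(":", "\\:")


if __name__ == "__main__":
    for sep in SEPARATORS:
        name = "cmip" + sep + "tas"
        print(split_prefixed_name(name, sep), "float " + cdl_name(name))
```

Only the ":" form picks up an escape character in the CDL rendering; the "__" and "@" forms pass through unchanged.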

marqh commented 3 years ago

Hello @martinjuckes

My reading of the NUG is that : is not an allowed character, because it is used as a separator in the netCDF syntax. It is also not clear to me that backslash-escaped : characters are supported.

The choice of __ was driven (in my mind) by the lack of access to the : character.

The use of @ does seem valid, and represents a plausible option.

The choice of __ was a very early suggestion, predating the work on this document, and inherited from usage patterns that had already begun to use __ to designate a namespace.

I'm open to discussions on the use of @, but I don't think that : is a viable option. If my reading of the NUG is erroneous, then I would welcome further feedback on this. If a backslash escaped character were allowed and we could be confident that versions prior to 3.6.3 could be flagged as never supported then it may be plausible, but I agree it seems intrusive.

If close consideration is given to adopting the @ instead of the __, then we would have to be mindful of the required rework, and also of the existing data that already uses __ as a namespace separator. Some data creators have already started using this separator, in the hope that this standard will be adopted.

A continuing discussion on this topic would be useful and helpful.

many thanks mark

@jyucsiro @adamml

jyucsiro commented 3 years ago

At the time (2015?), __ seemed to be the choice most compatible with all versions of netCDF. The @ symbol is an interesting option. Are there existing examples of the use of @ for namespaces in other initiatives/conventions?

ethanrd commented 3 years ago

The nice thing about the double underscore (__) is that the underscore does not need to be escaped anywhere (at least as far as I know). Whereas the at-symbol (@) should be encoded in URLs. The @ must also be encoded when found in OPeNDAP DAP2 variable names (see section 5.1), though apparently not in DAP4 (section 10.2.2).
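The URL point can be checked with a short Python sketch using the standard library's urllib.parse.quote. The variable names cmip__tas, cmip@tas and cmip:tas are illustrative only, and passing safe="" is an assumption that models the strictest case, where every reserved character in a path segment is percent-encoded. It says nothing about the DAP2 or DAP4 encoding rules themselves.

```python
from urllib.parse import quote


def url_encoded(name):
    """Percent-encode a variable name for use in a URL path segment.

    safe="" means no reserved character is left unescaped; unreserved
    characters such as '_' are never escaped by quote().
    """
    return quote(name, safe="")


if __name__ == "__main__":
    for name in ("cmip__tas", "cmip@tas", "cmip:tas"):
        print(name, "->", url_encoded(name))
    # cmip__tas -> cmip__tas
    # cmip@tas -> cmip%40tas
    # cmip:tas -> cmip%3Atas
```

Under this assumption the double underscore is the only one of the three separators that survives encoding unchanged.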

martinjuckes commented 3 years ago

@marqh : I don't see how the NUG statement "Beginning with versions 3.6.3 and 4.0, names may also include UTF-8 encoded Unicode characters as well as other special characters, except for the character '/', which may not appear in a name." can be interpreted as excluding ":". It looks pretty clear that it is included.

@jyucsiro : I can believe that things looked different in 2015. I was quite surprised when I looked at NUG and discovered how much freedom there now is in the range of characters that can go into variable names. If you want consistency with other conventions, then : is the obvious choice. I guess @ is more often associated with a kind of inverse name-spacing, as in martin.juckes@stfc.ac.uk in which stfc.ac.uk is the de facto namespace.

@ethanrd : OK, so you want to be able to use the combined namespace abbreviation plus variable name in a URL. It's a reasonable thing to want, but should it be driving this standard? If the aim here is to specify how to encode namespaces within a NetCDF file, then there is no reason to use the same encoding when referring to the same concept in a different application space.

Regarding OPeNDAP: I would take the more recent standard as the basis. Just a personal preference, perhaps, but I think it will take some time to get the ideas being presented here established, and the 2007 DAP2 protocol will not be relevant for that much longer.