Open amilan17 opened 9 months ago
@tomkralidis
Perhaps we can further qualify with:
cc @josusky
This is too restrictive. The regular expression that you have provided is correct only for the "Namespace Identifier" (NID) part, but in our case the NID is fixed to `wmo`. The rest of the URN is the "Namespace Specific String" (NSS), and its validation is more permissive. The original description is in https://www.rfc-editor.org/rfc/rfc2141.html (section 2.2), and it is slightly modified (extended) by the newer RFC (https://www.rfc-editor.org/rfc/rfc8141). An example of a valid URN is:
urn:example:a123,z456?+abc
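To illustrate how permissive the NSS is, here is a minimal sketch of an NSS check based on my reading of the RFC 8141 grammar (`NSS = pchar *(pchar / "/")`, with `pchar` taken from RFC 3986). The name `is_valid_nss` is hypothetical and not part of any WIS2 tooling. It checks only the NSS itself, so optional components such as the `?+abc` in the example above have to be stripped first.

```python
import re

# pchar (RFC 3986): unreserved / pct-encoded / sub-delims / ":" / "@"
_PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"

# NSS = pchar *(pchar / "/"): a slash is allowed, but not as the first character
_NSS_RE = re.compile(_PCHAR + "(?:" + _PCHAR + "|/)*")


def is_valid_nss(nss: str) -> bool:
    """Return True if nss matches the RFC 8141 NSS grammar (sketch only)."""
    return _NSS_RE.fullmatch(nss) is not None


print(is_valid_nss("a123,z456"))  # NSS of the example URN above
print(is_valid_nss("foo.bar"))    # dot is an unreserved character
print(is_valid_nss("foo bar"))    # space is not allowed
```

Under this grammar a dot, a slash (after the first character) and even `~` pass, while `]`, spaces and accented characters do not.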
I am not dead set against a rule that is stricter than the actual URN specification. I looked up the specification because I spotted the innocent dot (`.`) in Tom's list, and that "lifted me off the chair" :-)
I can hardly imagine anyone putting `~` or `]` into a metadata ID, but a dot (`.`) or a slash (`/`) seems quite OK to me.
Having a slash (`/`) in the ID introduces URLs like the following in the GDC:
https://example.org/collections/foo/items/foo%2Fbar
While we can relax the regex set mentioned previously, URLs like the above would be error-prone.
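A small sketch of where the `%2F` comes from: the identifier has to be percent-encoded before it can be used as a single path segment. The helper `item_url` and its parameters are hypothetical; only the URL layout is taken from the example above.

```python
from urllib.parse import quote, unquote


def item_url(base: str, collection: str, item_id: str) -> str:
    """Build an items URL, encoding the id as ONE path segment."""
    # safe="" makes quote() encode "/" as well; the default would leave it as-is
    return f"{base}/collections/{quote(collection, safe='')}/items/{quote(item_id, safe='')}"


url = item_url("https://example.org", "foo", "foo/bar")
print(url)

# The id survives only if every component in the chain keeps %2F encoded;
# anything that decodes it early sees two path segments instead of one.
print(unquote(url).split("/items/")[1].split("/"))
```

The risk is that some clients, proxies or servers decode or reject `%2F` in paths, which is why a slash in the ID is fragile even when it is formally valid.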
The definition as approved in PR #183: "The id property SHALL include a local identifier as defined by the data publisher. The local identifier SHALL NOT have spaces or accented characters."
TT-WISMD 2024-10-22:
Specifying a character set that does not have accented characters and other things that can complicate the usage of this identifier is a good idea. IRA T.50 is an appropriate choice. Apart from that (and the space), did you discuss some more restrictions during TT-WISMD 2024-10-22?
|D |The +id+ property shall include a local identifier as defined by the data publisher. The local identifier shall not have spaces or special or accented characters.

The question is: what are "special" characters?
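To make the open question concrete, here is a rough sketch of a check for the part that seems agreed: no spaces, and only IRA T.50 graphic characters, which I take to be the 94 code points from `!` (0x21) to `~` (0x7E). The function name and the `forbidden` parameter are hypothetical. The set of "special" characters is deliberately left as an input because it has not been defined.

```python
def check_local_id(local_id: str, forbidden: str = "") -> list[str]:
    """Return a list of problems found in local_id (empty list means OK)."""
    problems = []
    for ch in local_id:
        if ch == " ":
            problems.append("space")
        elif not ("!" <= ch <= "~"):
            # covers accented characters, control characters, other non-T.50 input
            problems.append(f"not an IRA T.50 graphic character: {ch!r}")
        elif ch in forbidden:
            # 'forbidden' stands in for the still undefined "special" characters
            problems.append(f"forbidden character: {ch!r}")
    return problems


print(check_local_id("surface-weather.obs"))
print(check_local_id("my id"))
print(check_local_id("foo/bar", forbidden="/"))
```

Whatever ends up in `forbidden` (for example `/`) is the actual decision to be made. The rest follows from the character set.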