dcmi / dcap

DC Tabular Application Profile - supporting materials

Are the prefixed URIs we use the same as "CURIEs"? #76

Closed tombaker closed 2 years ago

tombaker commented 3 years ago

CURIE Syntax 1.0 - A syntax for expressing Compact URIs is a W3C Working Group Note from 2010. The examples provided in the spec include things like dc:creator, which is exactly the sort of syntax we have been using in our examples. The concept of CURIE is broader than QName which, for example, does not allow prefixed names where the name is a number, such as isbn:12345678.
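To illustrate the difference, here is a minimal sketch (not part of the CURIE spec or of DCTAP) that expands a compact URI against a prefix map and checks whether the same string would also pass as a QName. The prefix map, the function names, and the ASCII-only approximation of an XML NCName are assumptions made for the example.

```python
import re

# Hypothetical prefix map, chosen only for illustration.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "isbn": "urn:isbn:",
}

# Simplified, ASCII-only approximation of an XML NCName:
# it must start with a letter or underscore, never a digit.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def expand(curie, prefixes=PREFIXES):
    """Expand 'prefix:reference' into a full URI by concatenation."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"no declared prefix in {curie!r}")
    return prefixes[prefix] + reference


def looks_like_qname(name):
    """Return True if both parts pass the simplified NCName check."""
    prefix, sep, local = name.partition(":")
    return (
        bool(sep)
        and NCNAME.fullmatch(prefix) is not None
        and NCNAME.fullmatch(local) is not None
    )
```

Under these assumptions, both `dc:creator` and `isbn:12345678` expand without trouble, but only `dc:creator` passes the QName check, because the local part of `isbn:12345678` starts with a digit.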

I have been referring to these things as "prefixed URIs", but I see no reason why they could not be called "CURIEs".

On the other hand, I do not recall the last time I heard this term used, so I do not know whether the term has actually caught on, or even been superseded by more recent terminology. The Wikipedia page for CURIEs says nothing about its reception and current use.

It would be convenient for us if we could just call them "CURIEs".

kcoyle commented 3 years ago

I don't see CURIE used much. The document is a working group note, not a recommendation, which may have hindered its adoption. The term also suffers somewhat from ambiguity (although not in our context) because "curie" is a well-known unit of radioactivity named, of course, for Mme. Curie. I like "prefix" because that is what it is called in code so there is a direct link; "curie" doesn't appear anywhere in code, AFAIK.

tombaker commented 3 years ago

@kcoyle Are you suggesting we continue calling them "prefixed URIs"? "Prefix" is just the part before the colon, and "prefixed URIs" - which is what I have been using, for lack of a better term - doesn't quite work for me because it implies that the URI itself is somehow prefixed.

Could we perhaps call them "compact URIs"? We could define the term in the glossary, and even make reference to the CURIE spec, but without using "CURIE" itself.

ericprud commented 3 years ago

iirc, CURIEs and prefixed names (or "PNAMES", in the SPARQL lineage of grammars) both use prefixes to shorten IRIs. As long as you stick to letters, they'll look the same. However, they're both tuned to fit in their respective lexical environment. For instance, CURIEs are bounded by double quotes, so they can have trailing "."s, but every double quote inside them must be escaped. PNAMES, otoh, are bounded by whitespace or any SPARQL terminal, and so can't end with a " " or "." or "<".
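The trailing-dot restriction on PNAMES can be sketched with a regular expression. This is a simplified, ASCII-only approximation of the local-part rule in the SPARQL and Turtle grammars (it ignores non-ASCII characters and escape sequences), and the function name is hypothetical.

```python
import re

# Simplified ASCII-only approximation of the local part of a prefixed name:
# the first character is a letter, digit, underscore or colon; dots may
# appear in the middle, but the final character must not be a dot.
PNAME_LOCAL = re.compile(r"[A-Za-z0-9_:](?:[A-Za-z0-9_.:\-]*[A-Za-z0-9_:\-])?")


def pname_local_ok(local):
    """Return True if `local` passes the simplified local-part check."""
    return PNAME_LOCAL.fullmatch(local) is not None
```

With this check, `item` and `v1.2` are accepted while `item.` is rejected, which reflects why a Turtle or SPARQL parser reads the final dot as a statement terminator rather than as part of the name. A CURIE held inside a quoted attribute value has no such terminator to collide with.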

tombaker commented 3 years ago

I discussed this further with Eric. PNAMES are used in SPARQL and Turtle, and CURIEs are very similar but are based on a slightly more permissive grammar (for example, a CURIE can end with a dot). He agreed with my point that our spec need not go into such details but argued for being specific enough to help more technically minded implementers. We agreed on a position which I would like to propose as a resolution:

PROPOSED

That we refer to the syntax of "prefixed" URIs, in the DCTAP spec, as "compact IRIs", with a reference to the CURIE spec.

kcoyle commented 3 years ago

@tombaker I don't actually see anywhere that we use "prefixed URIs" in our documents. The wording in the primer is instead:

"Property IRIs are usually shortened using defined prefixes"

"the full IRI of http://www.w3.org/2001/XMLSchema or preceded by a prefix (often "xsd:")" (for value types)

"these may be shortened using a stated prefix" (appendix on namespaces)

Do you think that these statements are not clear?

tombaker commented 3 years ago

@kcoyle The statements are clear as far as they go, but they do not say much about how that shortening is done and they do not name the result of that shortening. We can say that the value for a given element must be an "IRI or IRI that is shortened using defined prefixes", but it would be nicer and more precise to say something like "IRI or compact IRI", then point to specifications that discuss what this entails.

I have learned, for example, that the QNames used in XML Schema and the PNames used in SPARQL and Turtle are both subtly more restrictive than CURIEs. I think we agree that we do not necessarily want to bring more acronyms into the mix with "CURIE", but we could finesse this by referring to "compact IRIs", then citing the CURIE spec in a glossary entry.
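As a rough illustration of what "IRI or compact IRI" could mean for an implementer, the sketch below accepts either form and returns a full IRI. The prefix declarations, the function name, and the rule for telling the two forms apart are assumptions for the example, not something the DCTAP documents specify.

```python
# Hypothetical prefix declarations, for illustration only.
PREFIXES = {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "dcterms": "http://purl.org/dc/terms/",
}


def to_iri(value, prefixes=PREFIXES):
    """Return a full IRI for a value that is either an IRI or a compact IRI.

    Assumed heuristic: a value is treated as compact only if the part
    before the first colon is a declared prefix and the remainder does
    not start with '//'. Anything else is passed through unchanged.
    """
    prefix, sep, reference = value.partition(":")
    if sep and prefix in prefixes and not reference.startswith("//"):
        return prefixes[prefix] + reference
    return value
```

Here `to_iri("xsd:string")` yields `http://www.w3.org/2001/XMLSchema#string`, while a value that is already a full IRI, such as `http://purl.org/dc/terms/title`, comes back untouched because `http` is not a declared prefix.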

tombaker commented 3 years ago

> @tombaker I don't actually see anywhere that we use "prefixed URIs" in our documents.

@kcoyle You are right of course. Perhaps it was just me who was using it. I didn't know what to call the shortened URIs so I referred to them as "prefixed URIs" and made a mental note to look for a better alternative.

corined commented 3 years ago

The RDA Registry uses both "compact URIs" and "CURIE": http://www.rdaregistry.info/rgData/rdaCuries.html

kcoyle commented 3 years ago

I like "compact URI" - unless anyone objects, I'll see if that fits into the documentation.

ericprud commented 3 years ago

The "compact URI" name collision with RDA can be resolved if your compact URIs are the same or if you reference a section or document which spells out what they are.

kcoyle commented 3 years ago

@ericprud The full URIs and their prefixes will be provided, albeit in a document that parallels the tabular profile. We haven't spec'd that out yet, but we are looking at the solution used in CSV on the Web. See #66 and Phil's comment.

tombaker commented 3 years ago

@kcoyle I'm fine with "compact URI" but take @ericprud's point that we should say what we mean, which we could do simply by saying, in a sentence, what a "compact URI" is (perhaps in a Glossary) and by citing the CURIE specification.

This came up on the DCTAP implementation call and @hsolbrig gave his thumbs-up to "CURIE", which I gather is used routinely in his world (answering the question I asked in the introduction to this thread, above).

kcoyle commented 3 years ago

I'll look through the document, and where we do refer to them I'll do something like "compact URIs using prefixes, known as CURIEs".