kg-construct / rml-test-cases

RML conformance test suite
http://rml.io/test-cases/
Creative Commons Attribution 4.0 International

RFC: RMLTC0015b-CSV & RMLTC0015b-JSON & RMLTC0015b-XML #15

Open pmaria opened 4 years ago

pmaria commented 4 years ago

RFC for:

The R2RML spec states:

A term map with a term type of rr:Literal may have a specified language tag. It is represented by the rr:language property on a term map. If present, its value must be a valid language tag.

The question is what "valid" means here. In RFC 3066 (referenced by the R2RML spec) there is no explicit definition of validity.

Validity is defined in its successor, BCP 47, which requires that, in addition to the language tag being "well-formed", its "subtags appear in the IANA Language Subtag Registry as of the particular registry date".

Looking at https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal, RDF 1.1 requires a language tag to be well-formed, but not valid per se.

From all this it isn't completely clear at the moment which requirements an RML engine should follow.

Besides all that, looking at these test cases, the language tags that are supposedly invalid are "english" and "spanish". Yet, taking both RFC 3066 and BCP 47 into account, both are well-formed language tags. They are, however, invalid according to the BCP 47 definition, because neither appears as a subtag in the registry. But I doubt that this level of validation should be required of engines, since it would require an engine to keep up with the IANA Language Subtag Registry.
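To make the distinction concrete, here is a minimal sketch of the two levels of checking. It assumes the RFC 3066 grammar (a primary subtag of 1 to 8 letters, followed by subtags of 1 to 8 letters or digits) for well-formedness; the BCP 47 grammar is stricter in places and is not reproduced here. The function names and `SAMPLE_REGISTERED_LANGUAGES` are hypothetical, and the set is a hand-picked stand-in for the IANA registry, not the registry itself. The validity check only looks at the primary language subtag, so it is not a full BCP 47 validity check.

```python
import re

# RFC 3066 grammar: Primary-subtag = 1*8ALPHA, Subtag = 1*8(ALPHA / DIGIT)
_RFC3066_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

# Hypothetical stand-in for the IANA Language Subtag Registry.
# A real engine would have to load and keep updating the full registry.
SAMPLE_REGISTERED_LANGUAGES = {"en", "es", "nl"}


def is_well_formed(tag: str) -> bool:
    """Syntactic check only: does the tag match the RFC 3066 grammar?"""
    return _RFC3066_TAG.fullmatch(tag) is not None


def has_registered_primary_language(tag, registered=SAMPLE_REGISTERED_LANGUAGES):
    """Partial validity check: well-formed AND primary subtag is registered."""
    if not is_well_formed(tag):
        return False
    primary = tag.split("-", 1)[0].lower()
    return primary in registered


for tag in ("en", "en-US", "english", "spanish"):
    print(tag, is_well_formed(tag), has_registered_primary_language(tag))
```

Under these assumptions, "english" and "spanish" pass the syntactic check and only fail once the registry lookup is added, which is the part that would tie an engine to registry updates.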

Proposed:

chrdebru commented 3 years ago

I agree, especially given that RDF 1.1 only requires well-formed language tags. We could have two test cases: