pchampin / sophia_rs

Sophia: a Rust toolkit for RDF and Linked Data

Check for invalid literals as output of parsers #140

Open pchampin opened 8 months ago

pchampin commented 8 months ago

The following Turtle file is valid according to the grammar:

[] <tag:> "foo"@abcde .

However, abcde is not a valid BCP47 language tag, so the produced graph is not valid according to the RDF abstract syntax.

Currently, Sophia blindly trusts the output of the parsers, creating the corresponding language tag with LanguageTag::new_unchecked, which results in a value violating the contract of its type.

The output of parsers should therefore be checked, at least for the language tags.

pchampin commented 8 months ago

A similar issue, which I raised here, occurs when a literal is explicitly typed as rdf:langString but has no language tag. This also should be checked -- unless Turtle (and other concrete syntaxes) are changed to reject it.
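To make the rdf:langString case concrete, here is a minimal sketch of the kind of post-parse check I have in mind. ParsedLiteral, LiteralError and check_literal are hypothetical names made up for illustration; they are not Sophia types or functions, and the real check would have to live wherever the parser output is converted into terms.

```rust
/// IRI of the rdf:langString datatype.
const RDF_LANG_STRING: &str = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString";

/// Hypothetical stand-in for a literal as emitted by a parser (NOT a Sophia type).
struct ParsedLiteral<'a> {
    lexical: &'a str,
    datatype_iri: Option<&'a str>,
    language: Option<&'a str>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum LiteralError {
    /// The literal is explicitly typed rdf:langString but carries no language tag.
    LangStringWithoutTag,
}

/// Reject literals such as "foo"^^rdf:langString, which the Turtle grammar
/// accepts but which have no language tag in the abstract syntax.
fn check_literal(lit: &ParsedLiteral) -> Result<(), LiteralError> {
    match (lit.datatype_iri, lit.language) {
        (Some(RDF_LANG_STRING), None) => Err(LiteralError::LangStringWithoutTag),
        _ => Ok(()),
    }
}
```

With this sketch, "foo"^^rdf:langString would be reported as an error, while "foo"@en and plain typed literals would pass through unchanged.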

pchampin commented 8 months ago

False alarm... abcde is in fact a valid BCP47 tag, my mistake. By contrast, abcdefghi is an invalid tag, and is correctly rejected by all Sophia parsers.
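For the record, the difference between the two tags can be sketched as follows. This is only a rough approximation, not a full BCP47 validator and not the code Sophia uses: it compares the Turtle LANGTAG production with the BCP47 subtag length limits (primary subtag of 2 to 8 letters, other subtags of 1 to 8 alphanumerics), and it deliberately ignores private-use and grandfathered tags as well as subtag ordering. Both function names are made up for this example.

```rust
/// Turtle's LANGTAG production, without the leading '@':
/// [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
fn matches_turtle_langtag(tag: &str) -> bool {
    let mut subtags = tag.split('-');
    // The first subtag must be one or more ASCII letters.
    let first_ok = subtags
        .next()
        .map_or(false, |s| !s.is_empty() && s.chars().all(|c| c.is_ascii_alphabetic()));
    // Every later subtag must be one or more ASCII letters or digits.
    first_ok && subtags.all(|s| !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric()))
}

/// Rough BCP47 length check (assumption: private-use tags, grandfathered
/// tags and subtag ordering rules are all ignored in this sketch).
fn passes_rough_bcp47_lengths(tag: &str) -> bool {
    let mut subtags = tag.split('-');
    // Primary language subtag: 2 to 8 ASCII letters.
    let primary_ok = subtags.next().map_or(false, |s| {
        (2..=8).contains(&s.len()) && s.chars().all(|c| c.is_ascii_alphabetic())
    });
    // Remaining subtags: 1 to 8 ASCII letters or digits each.
    primary_ok
        && subtags.all(|s| {
            (1..=8).contains(&s.len()) && s.chars().all(|c| c.is_ascii_alphanumeric())
        })
}
```

Both abcde and abcdefghi match the Turtle production, but only abcde passes the length check: a 5-letter primary subtag is within the limit, while a 9-letter one exceeds it.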

"foo"^^rdf:langString may still be an issue, though, pending the decisions of the RDF 1.2 WG.