w3c / rdf-n-triples

https://w3c.github.io/rdf-n-triples/

Support for base direction #32

Closed gkellogg closed 11 months ago

gkellogg commented 1 year ago

Relates to w3c/rdf-concepts#9; discussed in the Text Direction Proposal.

This will support directional language-tagged strings using the following syntax:

LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ('--' [a-zA-Z0-9]+)?
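As a rough illustration of what this production accepts, here is a minimal Python sketch that transcribes it into a regular expression. The names `LANGTAG` and `split_langtag` are made up for this example and do not come from any RDF library; the regex is only a reading of the grammar line above, not the spec's normative definition.

```python
import re

# Transcription of the proposed production (illustrative only):
# LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ('--' [a-zA-Z0-9]+)?
LANGTAG = re.compile(
    r"@(?P<lang>[a-zA-Z]+(?:-[a-zA-Z0-9]+)*)(?:--(?P<dir>[a-zA-Z0-9]+))?"
)

def split_langtag(token):
    """Return (language, direction) for a LANGTAG token, or None if it does not match.

    direction is None when the token has no '--' suffix.
    """
    m = LANGTAG.fullmatch(token)
    if m is None:
        return None
    return m.group("lang"), m.group("dir")
```

Note that with this form the grammar itself accepts any alphanumeric direction (for example `@en--xyz`), so restricting it to a known set of directions would have to happen in a later check.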

This adds some text to 2.4 RDF Literals to allow base direction and an additional triple to EXAMPLE 4. It also adds rules to 6.1 RDF Term Constructors.

gkellogg commented 1 year ago

Alternatively, we could recognize only the directions ltr or rtl.

LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ('--' ('ltr'|'rtl'))?

This would be matched in a case-sensitive manner, which reduces runtime checks at the expense of parse errors that could convey less meaningful error messages.
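To make the trade-off concrete, here is a small Python sketch of the restricted production. `STRICT_LANGTAG` and `is_strict_langtag` are hypothetical names for this example. With this form, an unknown or differently cased direction simply fails to match, so all a parser can report is a generic syntax error.

```python
import re

# Restricted production, matched case-sensitively (illustrative only):
# LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ('--' ('ltr'|'rtl'))?
STRICT_LANGTAG = re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*(?:--(?:ltr|rtl))?")

def is_strict_langtag(token):
    """True if the whole token matches the restricted LANGTAG production."""
    return STRICT_LANGTAG.fullmatch(token) is not None
```

Under this sketch `@ar--rtl` is accepted, while `@ar--RTL` and `@ar--up` are rejected at the grammar level with no indication of which part was wrong.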

pfps commented 1 year ago

See https://github.com/w3c/rdf-n-triples/issues/33

afs commented 1 year ago

SPARQL is a bit more sensitive to the choice because of use in triple patterns and also in expressions.

There is a case of the sequence --1 (no spaces) being legal syntax in an expression, and it is not nonsensical: --1 is subtraction of a negative number, - (-1). As the left-hand side is a directional language-tagged string and the right-hand side is a number, the subtraction is an evaluation error.

It is easy to avoid by having no numbers in the direction part, cf. the first subtag, "language".

LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ('--' [a-zA-Z]+)?
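The difference between the two productions on the `--1` case can be sketched in Python as follows. This assumes a tokenizer that takes the longest LANGTAG match at a given position; the pattern names and the `take_langtag` helper are hypothetical and only model the token rule, not a real SPARQL parser.

```python
import re

# Direction part allows digits (first proposal) vs. letters only (this proposal).
WITH_DIGITS = re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*(?:--[a-zA-Z0-9]+)?")
LETTERS_ONLY = re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*(?:--[a-zA-Z]+)?")

def take_langtag(pattern, text):
    """Match a LANGTAG at the first '@' in text; return (token, remaining input)."""
    start = text.index("@")
    m = pattern.match(text, start)
    if m is None:
        return None
    return m.group(0), text[m.end():]

expr = '"abc"@en--1'
print(take_langtag(WITH_DIGITS, expr))   # the '--1' is consumed as a direction
print(take_langtag(LETTERS_ONLY, expr))  # the '--1' is left for the expression grammar
```

With digits allowed, the token swallows `--1`; with letters only, the token stops at `@en` and `--1` remains available to be read as subtraction of a negative number.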

My preference is to have the ltr/rtl check happen the same way that the requirement to be a legal language tag happens.
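One way to read this preference is as a check applied after the token has been recognized, which leaves room for a specific error message. The following Python sketch assumes the direction has already been split off the tag. `check_direction` is a hypothetical helper, and the case-sensitive comparison is an assumption carried over from the earlier comment, not something this thread settles.

```python
ALLOWED_DIRECTIONS = ("ltr", "rtl")

def check_direction(direction):
    """Validate a base direction after parsing.

    Returns the direction unchanged (None means no direction was given)
    and raises ValueError with a descriptive message otherwise.
    Assumption: the comparison is case-sensitive.
    """
    if direction is None:
        return None
    if direction not in ALLOWED_DIRECTIONS:
        raise ValueError(
            f"invalid base direction {direction!r}: expected 'ltr' or 'rtl'"
        )
    return direction
```

Compared with encoding `ltr` and `rtl` in the grammar, this keeps the token rule permissive and moves the rejection to a point where the offending value can be named in the error.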