shexSpec / shex

ShEx language issues, including new features for e.g. ShEx2.1

ShExJ LanguageStem should be STRING instead of ObjectLiteral #54

Closed hsolbrig closed 7 years ago

hsolbrig commented 7 years ago

Currently, LanguageStem is represented as an ObjectLiteral, meaning that it has a value, an optional language and an optional type. This is an issue because:

  1. ShExJ models this as LANGTAG, which does not represent language or type
  2. I have no idea what should be done with something in the form: "stem":{"value" : "en", "language": "fr"}

Similarly, the exclusions in LanguageStemRange include the possibility of an IRI. This is the same issue: there is no equivalent in ShExC, where an exclusion is LANGTAG '~' and IRIs simply don't apply.
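To make the mismatch concrete, here is a minimal Python sketch. The dict mirrors the ObjectLiteral form quoted in item 2; the helper name `expressible_in_shexc` is hypothetical and not part of any ShEx implementation. It assumes that a stem can be written in ShExC only when it carries nothing but a value.

```python
# Hypothetical helper, not part of any ShEx library: it flags ObjectLiteral-style
# stems that carry fields a ShExC LANGTAG stem has no way to express.

def expressible_in_shexc(stem):
    """Return True for a plain string or an object whose only key is "value"."""
    if isinstance(stem, str):
        return True
    extra_keys = set(stem) - {"value"}
    return not extra_keys


ambiguous = {"value": "en", "language": "fr"}  # the form questioned in item 2
value_only = {"value": "en"}                   # ObjectLiteral with a value only
plain_string = "en"                            # stem as a plain STRING

print(expressible_in_shexc(ambiguous))     # False
print(expressible_in_shexc(value_only))    # True
print(expressible_in_shexc(plain_string))  # True
```

A plain string cannot carry the extra fields at all, which is why that form sidesteps the question.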

Recommendation (STRING replaces ObjectLiteral and objectValue):

LanguageStem { stem:STRING }
LanguageStemRange { stem:(STRING|Wildcard) exclusions:[STRING|LanguageStem +]? }

Results of this recommendation can be found: https://github.com/shexSpec/shexTest/compare/master...LanguageStem-string
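As a rough illustration of what string-valued stems and exclusions look like in use, the following Python sketch matches a language tag against a LanguageStemRange written as a plain dict. The function names are hypothetical, and the matching rules (prefix match for stems, exact match for string exclusions, case-insensitive comparison) are assumptions of this sketch rather than quotations from the ShEx specification.

```python
# Sketch only: stems and exclusions are plain strings, as recommended above.
# Prefix matching and case-insensitive comparison are assumptions of this sketch.

WILDCARD = {"type": "Wildcard"}


def stem_matches(stem, tag):
    """A string stem matches any tag that starts with it (assumed semantics)."""
    return tag.lower().startswith(stem.lower())


def exclusion_matches(exclusion, tag):
    """An exclusion is either an exact tag (str) or a LanguageStem object."""
    if isinstance(exclusion, str):
        return tag.lower() == exclusion.lower()
    return stem_matches(exclusion["stem"], tag)


def in_language_stem_range(rng, tag):
    """True if tag matches the range's stem and none of its exclusions."""
    stem = rng["stem"]
    if stem != WILDCARD and not stem_matches(stem, tag):
        return False
    return not any(exclusion_matches(e, tag) for e in rng.get("exclusions", []))


fr_range = {
    "type": "LanguageStemRange",
    "stem": "fr",
    "exclusions": ["fr-be", {"type": "LanguageStem", "stem": "fr-ca"}],
}

print(in_language_stem_range(fr_range, "fr-FR"))  # True
print(in_language_stem_range(fr_range, "fr-BE"))  # False (exact exclusion)
print(in_language_stem_range(fr_range, "fr-CA"))  # False (excluded stem)
print(in_language_stem_range(fr_range, "en"))     # False (stem does not match)
```

With string stems there is no `language` or `type` field to interpret, so the matcher only ever compares strings.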

Alternative: Same change, but tighten up the parser rules on LANGTAG (LANGTAG replaces STRING for language, and replaces ObjectLiteral and objectValue in the stem definitions):

ObjectLiteral { value:STRING language:LANGTAG? type:STRING? }

LanguageStem { stem:LANGTAG }
LanguageStemRange { stem:(LANGTAG|Wildcard) exclusions:[LANGTAG|LanguageStem +]? }
LANGTAG : ([a-zA-Z])+ ("-" ([a-zA-Z0-9])+)*
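The LANGTAG production can be transcribed directly into a regular expression. The Python sketch below does that; the name `is_langtag` is hypothetical and only illustrates which strings the tightened rule would accept.

```python
import re

# LANGTAG : ([a-zA-Z])+ ("-" ([a-zA-Z0-9])+)*
# Direct transcription of the production above; is_langtag is a hypothetical name.
LANGTAG = re.compile(r"[a-zA-Z]+(?:-[a-zA-Z0-9]+)*")


def is_langtag(text):
    """Return True if the whole string matches the LANGTAG production."""
    return LANGTAG.fullmatch(text) is not None


for candidate in ["en", "en-US", "zh-Hant-TW", "fr-", "en_US", "123", ""]:
    print(repr(candidate), is_langtag(candidate))
```

Under this rule the first subtag must be alphabetic and a trailing hyphen is rejected, so a value such as "fr-" would not be a valid LANGTAG.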

gkellogg commented 7 years ago

As ShExJ is JSON-LD, all values are interpreted either as IRI, BNode, or Literal. In some cases, we can use a string value for a property, but only if that property is not described in the context to contain an IRI. Currently, the context defines “stem” as “xsd:anyUri”, and “exclusions” as “@type”: “@id”, as that is the sense for their use in IRIs. JSON-LD 1.0 doesn’t have a way to override this without using the expanded version (thus: {“value”: “fr-”}, for example).

We have discussed going to JSON-LD 1.1 (still in progress as a community project), which does allow us to override term definitions based on, for example, “@type”. So, we could describe one sense for “stem” and “exclusions” when used with “IriStem”, and another when used with “LanguageStem”.

Another reason to use JSON-LD 1.1 is its support for id maps, which would allow the schema to use an object to describe shapes, much as the pre-JSON-LD version did.

Until we do this, changing LanguageStem and LanguageStemRange (along with LiteralStem and LiteralStemRange) will not work with the JSON-LD/ShExR interpretation.

ericprud commented 7 years ago

What if we say that stems are all strings (no xsd:anyUri)?

gkellogg commented 7 years ago

We'd need to do that for both stems and exclusions, but it should work. That would change ShExC as well, I think.

ericprud commented 7 years ago

I made IriStem-string branches in shexTest and shex.js which treat all stems as strings and round-trip to ShExC.

ericprud commented 7 years ago

Closed in the 14 April 2017 meeting.