expath / xpath-ng

Wishlist for XPath Syntax Extensions
Creative Commons Attribution 4.0 International

Proposed syntax for anonymous union types #6

Open michaelhkay opened 5 years ago

michaelhkay commented 5 years ago

The syntax union(A, B, C) is added as a new kind of ItemType

rhdunn commented 5 years ago
  1. Should I add a separate specification for the (A|B|C) syntax in addition to the union(A, B, C) syntax, or do you want to add it to this specification?
  2. The QName production should be EQName to support using NCName (via the default element/type namespace) and URIQualifiedName constructs.
  3. Shouldn't this allow more generic ItemType objects, allowing for things like union(map(*), array(*))? Or should this be just for XMLSchema-like unions, and the (A|B|C) syntax be for generic ItemTypes?

michaelhkay commented 5 years ago

Yes, it should be EQName.

This is syntax for defining union types (in the XSD sense - unions of atomic types), which already have well-defined semantics because you can already use a named union type in a function signature or variable declaration. Unions of arbitrary item types (e.g. element() or attribute(), map or array) require a lot more work because the semantics need defining, e.g. the rules for item matching and type subsumption, the impact on the function conversion rules, etc etc. - so that's beyond the scope of this proposal. I don't foresee any major technical obstacles with doing it, but it's a significant piece of work to find all the implications. The impact on XSLT 3.0 streamability rules, for example, is a potential nightmare.

rhdunn commented 5 years ago

That makes sense. I also like that the union() syntax applies to CastExpr and CastableExpr that take a SingleType.

Note also that a union whose members are themselves union types should be possible, not just a union of atomic types (e.g. union(xs:numeric, xs:string)), as XML Schema defines transitive membership of types to support this.
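As an illustration of the transitive membership mentioned above, here is a minimal Python sketch that flattens a union's members until only atomic types remain. The registry `UNION_MEMBERS` and the function `transitive_members` are hypothetical names made up for this example, and the member order shown for xs:numeric is an assumption; a real processor would read memberships from its schema component model.

```python
# Hypothetical registry: a union type maps to its member type names,
# an atomic type maps to None. The order of xs:numeric's members is assumed.
UNION_MEMBERS = {
    "xs:numeric": ["xs:double", "xs:float", "xs:decimal"],
    "xs:double": None,
    "xs:float": None,
    "xs:decimal": None,
    "xs:string": None,
}


def transitive_members(names, registry=UNION_MEMBERS):
    """Return the atomic types reachable from `names`, in order, without duplicates."""
    result = []
    for name in names:
        if name not in registry:
            raise KeyError(f"unknown type: {name}")
        members = registry[name]
        if members is None:
            # Atomic type: it is its own only member.
            candidates = [name]
        else:
            # Union type: expand its members recursively.
            candidates = transitive_members(members, registry)
        for candidate in candidates:
            if candidate not in result:
                result.append(candidate)
    return result


if __name__ == "__main__":
    # Members of the proposed union(xs:numeric, xs:string)
    print(transitive_members(["xs:numeric", "xs:string"]))
```

The sketch assumes the registry contains no circular unions, so the recursion terminates.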

I will write a complementary proposal for using the (A|B|C) style syntax for defining ItemType-based unions and we can discuss the details there.

adamretter commented 5 years ago

Seems like a good proposal to me.

Just wondering whether the EBNF, which is currently (assuming @rhdunn's suggestion of EQName):

LocalUnionType ::= "union" "(" EQName ("," EQName)* ")"

should/could instead be:

LocalUnionType ::= "union" "(" EQName+ ")"

michaelhkay commented 5 years ago

I think that lists of names in XQuery/XPath are generally comma-separated rather than space-separated. There are a few cases of space separation, e.g. the list of keyword=value pairs in a decimal format declaration, but they are rather rare. (XSD of course uses space separation, and XSLT does so also in many attributes e.g. xsl:strip-space/@elements, xsl:template/@mode, etc, but for XPath and XQuery the comma feels right.)
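To make the comma-separated production concrete, the following Python sketch recognises strings of the form `"union" "(" EQName ("," EQName)* ")"`. It is a rough illustration only: the NCName pattern is restricted to ASCII and is much narrower than the real XML rules, and `tokenize` and `parse_local_union_type` are hypothetical helper names, not part of any processor.

```python
import re

# Simplified, ASCII-only approximation of NCName (an assumption of this sketch).
_NCNAME = r"[A-Za-z_][A-Za-z0-9_.\-]*"
# EQName: either a URIQualifiedName Q{uri}local or a (possibly prefixed) QName.
_EQNAME = r"(?:Q\{[^{}]*\}" + _NCNAME + r"|" + _NCNAME + r"(?::" + _NCNAME + r")?)"
_TOKEN = re.compile(r"\s*(" + _EQNAME + r"|[(),])")


def tokenize(text):
    """Split the input into names and the punctuation tokens ( ) ,"""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_local_union_type(text):
    """Return the member names of union(A, B, ...) or raise ValueError."""
    tokens = tokenize(text)
    if len(tokens) < 4 or tokens[:2] != ["union", "("] or tokens[-1] != ")":
        raise ValueError("expected union(EQName, ...)")
    body = tokens[2:-1]
    names, commas = body[0::2], body[1::2]
    punctuation = {"(", ")", ","}
    # The body must alternate name, comma, name, ... and end with a name.
    if len(body) % 2 == 0 or any(c != "," for c in commas) or any(n in punctuation for n in names):
        raise ValueError("member types must be comma-separated EQNames")
    return names


if __name__ == "__main__":
    print(parse_local_union_type("union(xs:date, xs:dateTime)"))
```

With this grammar a space-separated list such as `union(xs:date xs:dateTime)` is rejected, which matches the comma convention described above.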

adamretter commented 5 years ago

@michaelhkay Good point :-)

adamretter commented 5 years ago

I think this one is almost ready to Merge.

@michaelhkay Do you want to address any of the comments from @rhdunn?

@ChristianGruen Do you have any comments and/or objections?

ChristianGruen commented 5 years ago

Looks good!

rhdunn commented 5 years ago

Of my comments, the following remain relevant:

  1. Update the syntax to refer to EQName instead of QName for the type, and add to the semantics that NCName types should be resolved with the default element/type namespace from the static context. This has been agreed, it's just the proposal text that needs updating.
  2. Add to the semantics that only union or atomic types are allowed in the union construct.
  3. Add to the semantics that the members of the union follow the XSD transitive membership rules for union types. Should these be explicitly documented in the XPath/XQuery spec?
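Points 1 and 2 of the list above can be sketched in Python as follows. This is an illustration under stated assumptions, not proposed specification text: `resolve_eqname`, `check_union_members`, the prefix map, and the `known_types` table are all hypothetical stand-ins for whatever a processor's static context and schema model provide.

```python
XS = "http://www.w3.org/2001/XMLSchema"


def resolve_eqname(name, prefixes, default_type_ns):
    """Expand an EQName to a (namespace URI, local name) pair."""
    if name.startswith("Q{"):
        # URIQualifiedName: the namespace is spelled out inline.
        end = name.index("}")
        return (name[2:end], name[end + 1:])
    if ":" in name:
        # Prefixed QName: look the prefix up in the statically known namespaces.
        prefix, local = name.split(":", 1)
        if prefix not in prefixes:
            raise ValueError(f"undeclared prefix: {prefix}")
        return (prefixes[prefix], local)
    # NCName: use the default element/type namespace from the static context.
    return (default_type_ns, name)


def check_union_members(names, prefixes, default_type_ns, known_types):
    """Resolve each member and insist that it is an atomic or union type."""
    resolved = []
    for name in names:
        expanded = resolve_eqname(name, prefixes, default_type_ns)
        kind = known_types.get(expanded)
        if kind not in ("atomic", "union"):
            raise ValueError(f"{name} is not an atomic or union type")
        resolved.append(expanded)
    return resolved


if __name__ == "__main__":
    # Hypothetical schema knowledge: expanded name -> variety of the type.
    known = {(XS, "integer"): "atomic", (XS, "numeric"): "union", (XS, "NMTOKENS"): "list"}
    print(check_union_members(["xs:integer", "numeric"], {"xs": XS}, XS, known))
```

Here the unprefixed name `numeric` resolves only because the default element/type namespace is assumed to be the XML Schema namespace; a list type such as xs:NMTOKENS would be rejected by the check.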