Extend support for self-reference in record types

michaelhkay commented 1 year ago

We currently allow a field in a record to have type "..", that is, the same type as the containing record definition.

This isn't good enough for fn:random-number-generator, where we need something like:

random-number-generator-record:
record(
   number as xs:double,
   next as function() as #random-number-generator-record,
   permute as function(item()*) as item()*,
   *,
)

There are two ways we could tackle this. We could extend the syntax to allow ".." here, so that it becomes `next as function() as ..`. Or we could allow named item types to refer to themselves:

<xsl:item-type name="random-number-generator-record"
   as="record(
   number as xs:double,
   next as function() as type(random-number-generator-record),
   permute as function(item()*) as item()*,
   *,
)"/>

We haven't really reviewed the proposal for named item types. It's easy enough to declare them in XQuery and XSLT (and not really very difficult to define the rules under which self-referential definitions are allowed). Free-standing XPath is a bit more of a problem.
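To make the shape of the problem concrete, here is a minimal sketch in Python (not XPath) of a record whose `next` field is a function returning the enclosing record type. The names `RandomNumberGeneratorRecord` and `make_generator` are hypothetical, the extensible `*` marker is not modelled, and the seeding scheme is only an assumption for illustration; it is not how fn:random-number-generator is specified.

```python
from __future__ import annotations

import random
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass(frozen=True)
class RandomNumberGeneratorRecord:
    """Hypothetical analogue of random-number-generator-record."""

    number: float
    # The result type names the enclosing record type. This is the
    # self-reference that a bare ".." field type cannot express, because
    # it occurs inside a function signature rather than as the field type.
    next: Callable[[], RandomNumberGeneratorRecord]
    permute: Callable[[Sequence[Any]], list[Any]]


def make_generator(seed: int) -> RandomNumberGeneratorRecord:
    """Build a generator record deterministically from a seed (illustrative only)."""
    rng = random.Random(seed)
    number = rng.random()
    next_seed = rng.getrandbits(64)
    permute_seed = rng.getrandbits(64)

    def permute(items: Sequence[Any]) -> list[Any]:
        # Return a shuffled copy; the same record always gives the same order.
        shuffled = list(items)
        random.Random(permute_seed).shuffle(shuffled)
        return shuffled

    return RandomNumberGeneratorRecord(
        number=number,
        next=lambda: make_generator(next_seed),
        permute=permute,
    )
```

Calling `make_generator(42).next().next()` yields another record of the same type each time, which is exactly the property the type declaration needs to state.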

dnovatchev commented 1 year ago

> Free-standing XPath is a bit more of a problem.

I imagine we could have something like this in XPath:

let type $tRandomGeneratorRecordType := record(
   number as xs:double,
   next as function() as $tRandomGeneratorRecordType,
   permute as function(item()*) as item()*,
   *,
)

I prefer this extended syntax:

- It is more readable than `next as function() as ..`.
- It is more powerful, because we can reference other declared types, not only the containing record type.

If we had a type object in the XDM, then we could also have a function `fn:type-of($something as item()) as type`, as in:

let $myType := type-of($someThing) ...

michaelhkay commented 1 year ago

Note: the original idea of type aliases is due to John Snelson, who also did some work on making them recursive: see https://john.snelson.org.uk/post/48547567561/62081233

michaelhkay commented 1 year ago

When we last reviewed this, @dnovatchev's comment about adding named type declarations to XPath proved something of a stumbling block.

Whenever the question comes up whether some XQuery feature should be added to XPath, we seem to hit the same conflicting visions of the role of XPath as a language. If we add named type declarations to XPath, surely we should also add a lot of other similar things from the XQuery prolog? Like the ability to import a schema? At the moment, we can't even declare namespace prefixes in XPath.

My view of the architectural role of XPath is that it expects to refer by name to objects (such as types, functions, decimal formats etc) whose declarations are found in a static context initialized by some host language. If you add the ability to include these declarations within XPath itself, then you run into a variety of difficulties:

(a) you end up blurring any useful distinction between XPath and XQuery

(b) you end up duplicating functionality between XSLT and XPath

(c) XPath expressions become bigger, and therefore less suitable for writing as XML attributes in XSLT

(d) you'll create pressure for such declarations to be shared by multiple XPath expressions, which will lead to a demand for some kind of module or inclusion mechanism at the XPath level.

I'm therefore inclined to resist any attempt to grow XPath into the role of a free-standing XML processing language, and instead to develop it within the constraints of its existing intended purpose as an expression language to be hosted in some other language responsible for establishing its static context.

dnovatchev commented 1 year ago

> If you add the ability to include these declarations within XPath itself, then you run into a variety of difficulties:
>
> (a) you end up blurring any useful distinction between XPath and XQuery

There is a definite distinction. XQuery has features that XPath lacks: XQuery databases, the Update Facility, sliding windows, annotations, an XML-based query syntax, and probably many others of which I am unaware, not being an XQuery user. These are just a few such powerful features.

> (b) you end up duplicating functionality between XSLT and XPath

There would be no duplication if XSLT did not duplicate functionality that is in XPath. In fact, XSLT 3 has grown so overwhelmingly complicated that slowing the rate at which new "features" are added would benefit both the XSLT language and its users. We have good evidence for this: even Dr. Michael Kay (@michaelhkay) himself has not so far been able to produce a new book on XSLT 3.0, something that many users are asking for.

> (c) XPath expressions become bigger, and therefore less suitable for writing as XML attributes in XSLT

Nobody is forcing anyone to write long XPath expressions in XSLT. Users typically can just call functions that are defined in XPath function libraries.

The main way to write readable XPath expressions is the same as doing this in any programming language: write the code on multiple lines with good indentation.

> (d) you'll create pressure for such declarations to be shared by multiple XPath expressions, which will lead to a demand for some kind of module or inclusion mechanism at the XPath level.

This is already possible in XPath 3.1, thus no new, additional feature is needed.

> I'm therefore inclined to resist any attempt to grow XPath into the role of a free-standing XML processing language, and instead to develop it within the constraints of its existing intended purpose as an expression language to be hosted in some other language responsible for establishing its static context.

Hosting XPath in another language and using it mainly by itself are not mutually exclusive. Major functionality developed in pure XPath is immediately accessible from all hosting languages. This is a big plus compared with writing redundant XSLT and XQuery code that does the same thing, with all the problems that redundancy brings: copies in different hosting languages drift out of sync, extra time and effort are needed just to confirm that the "XSLT copy" and the "XQuery copy" are equivalent, and essentially the same code must be tested N times for N hosting languages, to mention just a few.

Thus, it is better to keep at least the most fundamental functionality shared by different hosting languages in their common XPath subset. And the concept of type is exactly one such basic, fundamental property of an item.

I personally know people, including developers of some of the best XSLT programming environments, who declined to use XSLT 3.0 and turned entirely to other languages. Maybe they wouldn't have turned away had there been a version of XPath, common to XSLT, XQuery and other hosting languages, that was both more elegant and still sufficiently powerful.

Note that XPath would not be the first, or the best known, example of different hosting languages sharing an extremely powerful "sublanguage":

LINQ is available not only in .NET-based languages such as C#, F# and VB.NET, but has also been ported to PHP (PHPLinq), JavaScript (linq.js) and ActionScript (ActionLinq).

michaelhkay commented 1 year ago

But one of the big benefits of the LINQ approach is that it doesn't try to do everything in the XML navigation sublanguage: it reuses things like functions, variables, types, and modules from the host language. Many of the declarations are likely to be at the level of a module, or imported by a module from other modules. In the context of this specific issue, I think this is definitely likely to be true; type declarations won't be associated with individual XPath expressions, but with an entire application.

Perhaps what this suggests is a requirement for an XPath-like (or XQuery-prolog-like) syntax for declaring static context which can be invoked by a host language at a different level from the actual XPath expressions. Along these lines:

var xpathContext = "declare type geo:position = record(latitude, longitude); " +
                   "declare namespace geo = '...'; " +
                   "declare function geo:distance($from as geo:position, $to as geo:position) {...};";

var distance = xpath("geo:distance($x, $y)", xpathContext);

The trouble is, this drags us into designing APIs for host languages, which is currently way outside our scope. But we could probably abstract away from that; we could define a syntax for defining an XPath context, and leave the API designer to work out how to incorporate this into an API design.

But then: another lesson from LINQ is to use as much of the host language syntax as you can. Why would you want to use an XPath sublanguage to define the context, rather than defining an XPathContext as a host language object that can be populated using host language methods?

In fact, I'm not sure I want to try and compete with LINQ. If I need to do XML navigation within the context of a powerful procedural language like Java or C#, I'm going to use a LINQ-like approach in preference to an XPath API most of the time. I think our primary target for XPath should be the use case that originally motivated it: embedding within XSLT and other similar languages like XForms, XSD, and XProc. And in those cases, I think it's entirely appropriate for the host language to define the syntax for initialising the XPath context.

michaelhkay commented 1 year ago

I wrote:

(d) you'll create pressure for such declarations to be shared by multiple XPath expressions, which will lead to a demand for some kind of module or inclusion mechanism at the XPath level.

and @dnovatchev responded

This is already possible in XPath 3.1, thus no new, additional feature is needed.

But it's not possible. If a host language invokes two XPath expressions, xxx and yyy, then there is no way, using XPath alone, to declare things such as functions, variables, and types that are accessible within both xxx and yyy. Unless we write the whole thing as a single XPath expression, we're reliant on the host language to define the shared context for the two expressions. The only way it makes sense to define the shared context in XPath is if the whole application can be written as a single XPath expression. If we're targeting XPath primarily as a sublanguage for use within a host language such as XSLT, which I believe we are, then the context for xxx and yyy needs to be defined in one place, which can't be in either of those expressions. Contrariwise, if we do want to grow XPath to a language that is suitable for writing the whole application, then we are re-inventing XQuery.

michaelhkay commented 11 months ago

Note that an additional problem we need to handle with self-referential record types is the rules for subtyping. §3.7.2.9 currently ignores this problem (the rules as currently defined are probably non-terminating). It's discussed in John Snelson's blog at https://john.snelson.org.uk/post/48547567561/62081233
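To illustrate why termination is a concern, here is a minimal sketch in Python of a structural subtype check over named, self-referential record types. It is not the algorithm of §3.7.2.9 and makes several simplifying assumptions: the type representation (`Atomic`, `Named`, `Func`, `Record`) is hypothetical, function types take no arguments and are covariant in their result, and a record is a subtype of another if it has at least the same fields with subtype-compatible types (no optional fields, no extensible marker). A naive recursive check would unfold `next as function() as T` forever; the sketch instead records each pair of types under comparison and treats a repeated pair as holding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atomic:
    name: str  # e.g. "xs:double"


@dataclass(frozen=True)
class Named:
    name: str  # reference to a named item type, resolved through `defs`


@dataclass(frozen=True)
class Func:
    result: object  # zero-argument function type: function() as result


@dataclass(frozen=True)
class Record:
    fields: tuple  # tuple of (field name, field type) pairs


def is_subtype(a, b, defs, assumed=None):
    """Return True if `a` is a subtype of `b` under the simplified rules above."""
    assumed = set() if assumed is None else assumed
    if (a, b) in assumed:
        # This pair is already being checked further up the call stack.
        # Treating it as holding is what makes recursive definitions terminate.
        return True
    assumed.add((a, b))
    if isinstance(a, Named):
        return is_subtype(defs[a.name], b, defs, assumed)
    if isinstance(b, Named):
        return is_subtype(a, defs[b.name], defs, assumed)
    if isinstance(a, Atomic) and isinstance(b, Atomic):
        return a.name == b.name
    if isinstance(a, Func) and isinstance(b, Func):
        return is_subtype(a.result, b.result, defs, assumed)
    if isinstance(a, Record) and isinstance(b, Record):
        a_fields = dict(a.fields)
        # Every check is a conjunction, so one failure makes the whole
        # comparison fail and stale assumptions can never cause a wrong True.
        return all(
            name in a_fields and is_subtype(a_fields[name], ftype, defs, assumed)
            for name, ftype in b.fields
        )
    return False


double = Atomic("xs:double")
defs = {
    # rng: record(number as xs:double, next as function() as rng)
    "rng": Record((("number", double), ("next", Func(Named("rng"))))),
    # seeded: the same shape plus one extra field
    "seeded": Record((
        ("number", double),
        ("next", Func(Named("seeded"))),
        ("seed", Atomic("xs:integer")),
    )),
}

seeded_is_rng = is_subtype(Named("seeded"), Named("rng"), defs)
rng_is_seeded = is_subtype(Named("rng"), Named("seeded"), defs)
```

Because only finitely many pairs of type terms can be formed from the definitions, each call either adds a new pair or returns, so the check always terminates under these assumptions.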

dnovatchev commented 11 months ago

> I wrote:
>
> > (d) you'll create pressure for such declarations to be shared by multiple XPath expressions, which will lead to a demand for some kind of module or inclusion mechanism at the XPath level.
>
> and @dnovatchev responded
>
> > This is already possible in XPath 3.1, thus no new, additional feature is needed.
>
> But it's not possible. If a host language invokes two XPath expressions, xxx and yyy, then there is no way, using XPath alone, to declare things such as functions, variables, and types that are accessible within both xxx and yyy.

Actually, there is a way. I have several times shown the CG examples of sharing functions and variables (bindings) between different, independent XPath expressions. And I will demo this again next week.

This sharing is done via loading common XPath function libraries.

ndw commented 11 months ago

That there is technically a way to make it work in your environment doesn't resolve the fact that there isn't a standard way to do it generally.

dnovatchev commented 11 months ago

> That there is technically a way to make it work in your environment doesn't resolve the fact that there isn't a standard way to do it generally.

There is nothing non-standard in the XPath function libraries. Anyone can download FunXPath and start using it, and/or write their own XPath function libraries in addition to this.

We already have a standard function in XPath 3.1: fn:load-xquery-module. In case a particular implementation doesn't provide this standard function, one may use another standard XPath 3.1 function: fn:transform to load an XPath function library.

FunXPath contains both loaders so the user doesn't have to write a single thing in addition and can directly start loading and using the members (functions or bindings) of existing XPath function libraries, or write their own XPath function library.

They will enjoy the write-once, use-everywhere property of XPath: no redundancy, minimal maintenance costs, no vendor lock-in, and instant portability.

michaelhkay commented 11 months ago

But if you're doing

<xsl:when test="xxx" select="yyy">

there's no way you would want to initialise the context for evaluating xxx and yyy independently; you want to initialise the context once and reuse it.

dnovatchev commented 11 months ago

> But if you're doing
>
> <xsl:when test="xxx" select="yyy">
>
> there's no way you would want to initialise the context for evaluating xxx and yyy independently; you want to initialise the context once and reuse it.

Sorry, @michaelhkay, I find it difficult to understand the point here.

Isn't this equivalent to:

  if (xxx) then yyy else ()

What is the issue?

michaelhkay commented 11 months ago

The issue is that if someone is using XPath the way it is designed to be used, as lots of small XPath expressions bound together using the capabilities of a host language like XSLT, then the only opportunity to define a shared context for those XPath expressions (for example a function library) is using the capabilities of the host language.

You seem determined instead to use XPath in the role that XQuery is designed for, as a complete language in its own right rather than as a sublanguage; and since XQuery is designed for that job, I have trouble seeing why you prefer to add features to XPath to make it capable of fulfilling that role.

I don't want to add things to XPath that are not needed when it is used as a sublanguage.

ChristianGruen commented 11 months ago

> Anyone can download FunXPath and start using it […]

Not really; it’s hidden from the public.

> We already have a standard function in XPath 3.1: fn:load-xquery-module. In case a particular implementation doesn't provide this standard function, one may use another standard XPath 3.1 function: fn:transform to load an XPath function library.

Taking advantage of features from a different language and claiming you are the one providing these features is something I call “Etikettenschwindel” (bogus claim?). To be consistent, you should add fn:doc and fn:unparsed-text to your list, as these functions can be used to return whatever you want, provided there’s an engine under the given URI that delivers the desired result.

dnovatchev commented 11 months ago

> The issue is that if someone is using XPath the way it is designed to be used, as lots of small XPath expressions bound together using the capabilities of a host language like XSLT, then the only opportunity to define a shared context for those XPath expressions (for example a function library) is using the capabilities of the host language.
>
> You seem determined instead to use XPath in the role that XQuery is designed for, as a complete language in its own right rather than as a sublanguage; and since XQuery is designed for that job, I have trouble seeing why you prefer to add features to XPath to make it capable of fulfilling that role.
>
> I don't want to add things to XPath that are not needed when it is used as a sublanguage.

I don't want to take part in religious arguments - let the users decide for themselves.

Didn't Mary Holstege want to be able to write code that is host-language independent?

dnovatchev commented 11 months ago

> > Anyone can download FunXPath and start using it […]
>
> Not really; it’s hidden from the public.

I haven't granted access to everyone (public), as there is a cleanup pending.

At the same time, anyone who requests access will be granted it, as happened in your case.

> > We already have a standard function in XPath 3.1: fn:load-xquery-module. In case a particular implementation doesn't provide this standard function, one may use another standard XPath 3.1 function: fn:transform to load an XPath function library.
>
> Taking advantage of features from a different language and claiming you are the one providing these features is something I call “Etikettenschwindel” (bogus claim?). To be consistent, you should add fn:doc and fn:unparsed-text to your list, as these functions can be used to return whatever you want, provided there’s an engine under the given URI that delivers the desired result.

Yes, fn:unparsed-text is used internally in the loader, but all this is hidden under the covers, following the established good practice that users don't need to be aware of implementation details.

Speaking of "implementation details", BaseX implements neither of the F&O 3.1 standard functions fn:load-xquery-module and fn:transform.

Isn't it time to become standard-compliant?

ChristianGruen commented 6 months ago

The PR for this issue was merged. If this issue should be kept open, just remove my label, which proposes to close it.

ndw commented 6 months ago

The CG decided to close this issue without any further action at meeting 069.

qt4cg / qtspecs

Extend support for self-reference in record types #295