New function fn:query()

michaelhkay commented 4 months ago

I propose a new function fn:query() to perform dynamic XQuery/XPath evaluation: a similar role to fn:transform() and xsl:evaluate.

I propose a design based on the design of fn:invisible-xml() - fn:query should take a query string as its argument, and return a function item that can be called to evaluate the query.

The fn:query() function will need an options map to supply significant aspects of the static context, for example the base URI. But I don't think we need to support everything. Public functions in the calling module should probably be made available automatically, in which case we don't need to support "import module".

The dynamic evaluation function will need an options map to supply significant aspects of the dynamic context, notably the context value and values of external parameters. The query result should be returned in "raw" (i.e. unserialized) form.

Perhaps there should be an option language="xpath" to say that the "query" is actually an XPath expression; some implementations might find that easier to support, especially when the processor is itself an XPath processor.
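To make the two-phase shape concrete, here is a minimal sketch of how a call might look. Nothing here is settled: fn:query() is only a proposal, and the option names used below ('language', 'base-uri', 'context-value', 'parameters') are assumptions chosen for illustration, loosely modelled on fn:transform().

```xquery
(: Hypothetical sketch. fn:query() does not exist yet, and every option
   name used here is an assumption made purely for illustration. :)

(: Phase 1: supply the query text plus static-context options;
   the result is a function item. :)
let $compiled := fn:query(
  'declare variable $min external; item[@price >= $min]',
  { 'language': 'xquery', 'base-uri': 'http://example.com/base/' }
)

(: Phase 2: call the function item with dynamic-context options,
   i.e. the context value and the external parameter values. :)
let $doc := <items><item price="5"/><item price="20"/></items>
return $compiled({
  'context-value': $doc,
  'parameters': { QName('', 'min'): 10 }
})
```

Under these assumptions the result would be the matching item elements themselves, returned as nodes rather than as serialized text.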

(Motivation: Saxon has a pair of ancient extension functions saxon:compile-query and saxon:query and the design needs modernising, and bridging across to additional platforms. We might as well get it into the standard if we're doing that.)

ChristianGruen commented 4 months ago

I like the approach of making the XPath/XQuery choice an option. I would definitely appreciate a syntactically lightweight approach, though, and would let the implementation take charge of the organisation of compiled queries. This would still be my favorite solution for invisible-xml as well – users don’t benefit from taking charge of those issues.

With our xquery:eval function, users can currently do things as simple as:

(: with the empty string as key, the value will be bound to the context value :)
let $doc := <xml><a/><b/></xml>
for $name in ('a', 'b')
return xquery:eval($name, { '': $doc })

Some more aspects:

- We have added options to limit the runtime and (approximate) memory consumption of the query.
- If an xs:anyURI is specified as input, the referenced file will be retrieved and evaluated.
- If the query raises an error, and if a pass option is enabled, the base URI and line/column information of the evaluated query will be returned in the error.
- For updates, we have an (updating) xquery:eval-update function.

I think the challenge is to define a function that’s powerful enough to convince users not to stick with vendor-specific solutions.

michaelhkay commented 4 months ago

We use caching to avoid repeated compilation of xsl:evaluate, and the problem is that you can use a lot of memory caching unnecessarily. In addition, it's not always clear when caching is safe; for example when the query depends on the static context of the caller. The two-phase approach seems to me to offer users more control without imposing a lot of complexity.
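As a rough illustration of the control the two-phase approach gives the user, the sketch below compiles once and evaluates many times, so the lifetime of the compiled query is simply the scope of a variable. It assumes the proposed fn:query() and a hypothetical 'context-value' option name; neither is defined anywhere yet.

```xquery
(: Hypothetical sketch: fn:query() and the 'context-value' option name
   are assumptions, not an existing API. :)
let $docs := (<a><n>1</n><n>2</n></a>, <a><n>3</n></a>)

(: Compiled exactly once; no implementation-side cache is involved. :)
let $sum := fn:query('sum(n)')

(: Evaluated once per document, reusing the same function item. :)
for $d in $docs
return $sum({ 'context-value': $d })
```

Once $sum goes out of scope, the compiled query can be discarded, which is the memory behaviour that is hard to guarantee with an implicit cache.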

I would suggest leaving error handling and XQuery Update out of it initially.

ChristianGruen commented 4 months ago

> We use caching to avoid repeated compilation of xsl:evaluate, and the problem is that you can use a lot of memory caching unnecessarily.

I think that’s a very common challenge in caching – but also an old and classical one, for which there are numerous solutions. One similar example, recently discussed, is the creation of an ad hoc hash join to speed up repeated lookups; another example is the very basic fn:doc function. For the implementer, it is certainly safer and easier to let the user do the work. For the user, it’s just the other way around. I imagine that in the long term we can provide better solutions if we take over this task.

> In addition, it's not always clear when caching is safe; for example when the query depends on the static context of the caller.

I wonder if the user will always make the right decision if the implementation can’t?

> I would suggest leaving error handling and XQuery Update out of it initially.

Makes sense.

gimsieke commented 4 months ago

> Public functions in the calling module should probably be made available automatically, in which case we don't need to support "import module".

Being able to evaluate user-defined XQuery functions dynamically is important, therefore I don’t like import module to be excluded from the supported subset. Use case: BaseX jobs:eval() in our equation image renderer.

michaelhkay commented 1 month ago

See also #1329, which proposes an alternative approach, allowing fn:load-xquery-module() to accept input from a string. This allows an arbitrary XQuery expression to be created as a string, placed in a "declare function" wrapper, loaded as a query module, and then invoked as a dynamic function. While a bit long-winded, this can be done with very little new specification machinery.
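A sketch of what that wrapper approach might look like, assuming the function in question is the existing fn:load-xquery-module() and that it gains a way to accept the module text as a string. The 'content' option name, the namespace URI and the function name d:run are all placeholders invented for this example.

```xquery
(: Sketch only. The 'content' option is an assumed extension of the kind
   discussed in #1329; the namespace and function name are placeholders. :)
let $expr := '$x * 2'
let $ns   := 'http://example.com/dyn'

(: Wrap the expression in a "declare function" inside a library module,
   built with an XQuery 3.1 string constructor. :)
let $module := ``[module namespace d = "`{$ns}`";
declare function d:run($x) { `{$expr}` };]``

(: Load the module from the string instead of from a location. :)
let $loaded := fn:load-xquery-module($ns, { 'content': $module })

(: 'functions' maps a QName to a map from arity to function item. :)
let $run := $loaded?functions(QName($ns, 'run'))(1)
return $run(21)
```

The result map with its 'functions' entry is already part of fn:load-xquery-module() in XPath 3.1, so only the string input would be new. The wrapping and lookup steps are what make this more long-winded than a dedicated function.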

qt4cg / qtspecs

New function fn:query() #1194