BaseXdb / basex

BaseX Main Repository.
http://basex.org
BSD 3-Clause "New" or "Revised" License

Add support for fn:invisible-xml #2192

Closed: GuntherRademacher closed this 1 year ago

GuntherRademacher commented 1 year ago

Add support for fn:invisible-xml

This change set adds fn:invisible-xml as proposed by Mike Kay (see "Support Invisible XML" by @michaelhkay). The implementation is based on Norm Walsh's CoffeeFilter and CoffeeGrinder projects (thank you, @ndw, for providing them).

The Coffee*er jars are not planned to be delivered with BaseX; instead, users will have to put them on the runtime classpath themselves. The implementation checks whether the directly referenced classes are available and raises an error if they are not. The code is nevertheless compiled against those jars: in pom.xml they are declared as optional with scope runtime, and in basex-core/pom.xml the scope is provided, so that they are available at compile time. With this setup, the jars are not copied to lib folders and do not appear in release artifacts.
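For illustration, an availability check of this kind can be sketched in plain Java as below. This is a minimal sketch, not the code from this change set: the class OptionalDependency and its methods are hypothetical, and the fully qualified org.nineml class names in main are assumptions about where the referenced classes live.

```java
import java.util.Optional;

/** Sketch: detect optional jars at run time by looking up classes by name. */
final class OptionalDependency {
  private OptionalDependency() { }

  /**
   * Returns the first class name that cannot be loaded, if any.
   * Classes are resolved by name only and are not initialized.
   */
  static Optional<String> firstMissing(final String... classNames) {
    final ClassLoader loader = OptionalDependency.class.getClassLoader();
    for(final String name : classNames) {
      try {
        Class.forName(name, false, loader);
      } catch(final ClassNotFoundException | LinkageError ex) {
        return Optional.of(name);
      }
    }
    return Optional.empty();
  }

  /** Builds a message in the style of the error text quoted in this description. */
  static String describe(final String function, final String... classNames) {
    return firstMissing(classNames)
      .map(name -> "Function " + function + " requires missing class: " + name)
      .orElse("ok");
  }

  public static void main(final String[] args) {
    // Assumed class names; the result depends on what is on the classpath.
    System.out.println(describe("invisible-xml",
      "org.nineml.coffeefilter.InvisibleXml",
      "org.nineml.coffeefilter.InvisibleXmlParser"));
  }
}
```

Looking classes up by name keeps the check itself free of static references to the optional jars, so it can run and report a readable message even when they are absent.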

Ixml parser generation is done in FnInvisibleXml.java:67:

      final InvisibleXmlParser parser = new InvisibleXml().getParserFromIxml(grammar);

Here, a new InvisibleXml object is created on each invocation, though I think a single static instance of InvisibleXml might be sufficient. However, with a shared instance, all further parses write some log output once the first invalid grammar has been processed. I believe this is not the intended behavior, so I have asked @ndw to change it in nineml/coffeefilter#80.

The documentation should contain something like the following (though I am not sure where to put it, as I think XQuery 4 functions aren't documented yet, either):

fn:invisible-xml

Signature:

fn:invisible-xml(
  $grammar as xs:string
) as (function(xs:string) as document-node())

Summary:

The function takes as input a string defining an invisible XML grammar in ixml format, and returns as output a function that can be used to parse strings conforming to that grammar, converting them into XDM document nodes.

Errors:

Failed to parse ixml grammar: could not match % at line %, column %.

Failed to generate ixml parser: %

Failed to parse ixml input: could not match % at line %, column %.

Failed to process ixml parser result: %
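The % characters in these messages are placeholders that are filled in when the error is raised. A minimal sketch of such positional substitution follows; the MessageTemplate class and its fill method are hypothetical and only illustrate the idea, they are not the formatting code used by BaseX.

```java
/** Sketch: fill '%' placeholders in a message template, left to right. */
final class MessageTemplate {
  private MessageTemplate() { }

  /**
   * Replaces each '%' with the next argument.
   * A '%' without a remaining argument is kept unchanged.
   */
  static String fill(final String template, final Object... args) {
    final StringBuilder sb = new StringBuilder();
    int next = 0;
    for(int i = 0; i < template.length(); i++) {
      final char ch = template.charAt(i);
      if(ch == '%' && next < args.length) sb.append(args[next++]);
      else sb.append(ch);
    }
    return sb.toString();
  }

  public static void main(final String[] args) {
    // Example values are made up for illustration.
    System.out.println(fill(
      "Failed to parse ixml input: could not match % at line %, column %.",
      "'x'", 1, 5));
  }
}
```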

Prerequisites:

The implementation of fn:invisible-xml relies on the org.nineml:coffeefilter and org.nineml:coffeegrinder projects. Their jars are not delivered with BaseX, but must be put into the runtime classpath explicitly. If one of their referenced classes is unavailable, an error message will be generated:

Function invisible-xml requires missing class: org.nineml.%

ChristianGruen commented 1 year ago

The documentation should contain something like the following (though I am not sure where to put it, as I think XQuery 4 functions aren't documented yet, either):

Thanks, that’s helpful. Before BaseX 11 is released, I’ll add an article about the current state of XQuery 4 (and BaseX-specific details) in our documentation, similar to the one for XQuery 3.1.

The official documentation of the function will also be added to the XQFO 4.0 Specification.

ndw commented 1 year ago

If I can repackage the NineML jars in a way that's more convenient for you, please let me know.

ChristianGruen commented 1 year ago

@ndw That’s kind. It’s pretty convenient already to get the two jar files embedded. The main reason why we plan to make the library optional for the moment is that we want to keep our core library as small as possible.