Schematron / schematron-enhancement-proposals

This repository collects proposals to enhance Schematron beyond the ISO specification

Setting up the query language environment #4

Open dmj opened 3 years ago

dmj commented 3 years ago

The ISO specification allows the XSLT elements function and key to be used in a Schematron schema. This makes sense because both are required to set up the query language environment. The xsl:key element prepares data structures for the fn:key() function and the xsl:function element allows for the use of user-defined functions.
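As a minimal sketch of what is already permitted, the following schema uses both elements under the xslt2 query binding. The element names (item, ref), the key name item-by-id, the function ex:is-blank and its namespace are all made up for illustration and do not come from the specification.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            xmlns:xs="http://www.w3.org/2001/XMLSchema"
            xmlns:ex="http://example.com/ns/functions"
            queryBinding="xslt2">

  <!-- Hypothetical namespace for the user-defined function below -->
  <sch:ns prefix="ex" uri="http://example.com/ns/functions"/>

  <!-- Index every item element by its id attribute for fn:key() -->
  <xsl:key name="item-by-id" match="item" use="@id"/>

  <!-- User-defined function: true if the value is absent, empty or whitespace only -->
  <xsl:function name="ex:is-blank" as="xs:boolean">
    <xsl:param name="value" as="xs:string?"/>
    <xsl:sequence select="normalize-space($value) = ''"/>
  </xsl:function>

  <sch:pattern>
    <sch:rule context="ref">
      <sch:assert test="exists(key('item-by-id', @target))">
        A ref must point to an existing item.
      </sch:assert>
      <sch:assert test="not(ex:is-blank(@label))">
        A ref must have a non-blank label.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```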

In the same vein Schematron should allow other XSLT elements to be used. Namely:

xsl:include, xsl:import, xsl:use-package

These three instructions are used to load user defined function libraries.

xsl:import-schema

The instruction is used to load type information.

xsl:accumulator

Defines the data structures for the fn:accumulator-before() and fn:accumulator-after() functions.
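To illustrate the proposal, here is a sketch of a schema that loads a function library with xsl:import and declares an accumulator. This is the proposed usage, not something the 2020 text allows, and whether a given implementation accepts it is not guaranteed. The library path, the lib:is-valid-code() function, the figure element and the limit of 20 are assumptions made up for the example.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            xmlns:xs="http://www.w3.org/2001/XMLSchema"
            queryBinding="xslt3">

  <!-- Hypothetical namespace of the imported function library -->
  <sch:ns prefix="lib" uri="http://example.com/ns/lib"/>

  <!-- Hypothetical library, assumed to define lib:is-valid-code() -->
  <xsl:import href="lib/functions.xsl"/>

  <!-- Running count of figure elements in document order -->
  <xsl:accumulator name="figure-count" as="xs:integer" initial-value="0">
    <xsl:accumulator-rule match="figure" select="$value + 1"/>
  </xsl:accumulator>

  <sch:pattern>
    <sch:rule context="figure">
      <!-- At a figure, accumulator-before() already includes that figure -->
      <sch:assert test="accumulator-before('figure-count') le 20">
        A document may contain at most 20 figures.
      </sch:assert>
      <sch:assert test="lib:is-valid-code(@code)">
        The code of a figure must be valid.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Because XSLT requires xsl:import to precede all other declarations, an implementation that compiles the schema to a stylesheet would have to move the import to the top of its output.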

Changes to the text of the 2020 specification:

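Add the following sentence to the default query language specification in Annex A:

  • The XSLT1 elements import and include may be used, in the XSLT1 namespace, before the pattern element.

Add the following sentences to the XSLT 2.0 query language specification in Annex H:

  • The XSLT2 elements import and include may be used, in the XSLT2 namespace, before the pattern element.

  • The XSLT2 element import-schema may be used, in the XSLT2 namespace, before the pattern element.

Add the following sentences to the XSLT 3.0 query language specification in Annex J:

  • The XSLT3 elements import, include, and use-package may be used, in the XSLT3 namespace, before the pattern element.

  • The XSLT3 element import-schema may be used, in the XSLT3 namespace, before the pattern element.

  • The XSLT3 element accumulator may be used, in the XSLT3 namespace, before the pattern element.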

nigelwhitaker commented 3 years ago

Hello,

I'm in complete agreement with this issue!

Indeed, I was doing precisely this about three years ago without even realizing that it wasn't allowed - as far as I can remember I've never looked at Annex J!

Here's the schematron I wrote - see for example the last rule:

https://github.com/nigelwhitaker/cals-table-schematron/blob/e4e8d4280606f1663f2ac0d412926296ff2c241c/source/cals.sch#L126

My way of making it work was a fork and a few changes on this branch:

https://github.com/DeltaXML/schematron/tree/issue-20a-xslt3

rjelliffe commented 3 years ago

I think the 2020 standard now defines a standard binding for xpath3 and xslt3.

A reserved binding name does not mean you cannot use the technology it names. It is advice that there is likely to be an official QLB coming out, so if you use the reserved name now, you may have an incompatibility then.

It says: please avoid this name until it is standardised (unless you can go and change the name in your schemas yourself if needed.) XSLT3 is so big that it was not clear which features a future QLB would support. For example, would it include XML Schema typing or just be a "basic" processor?

Using a non-standard QLB (e.g. xslt3-basic might be a good name for typeless XSLT) does not make your schematron schema or implementation non-conforming, provided the QLB is documented.

To put it another way, the query binding name is not so much a general-purpose way to say "I need XSLT4 or whatever"; it is more a nickname for some Query Language Binding document that answers specific questions about what facilities that binding provides.

But you are right that if there is a reserved name in one version of the standard that the committee decides not to make a QLB for, that name should get unreserved.

Cheers Rick

On Tue, 2 Mar 2021, 12:58 am David Maus, @.***> wrote:

The ISO specification allows the XSLT elements function and key to be used in a Schematron schema. This makes sense because both are required to set up the query language environment. The xsl:key element prepares data structures for the fn:key() function and the xsl:function element allows for the use of user-defined functions.

In the same vein Schematron should allow other XSLT elements to be used. Namely:

xsl:include, xsl:import, xsl:use-package

These three instructions are used to load user defined function libraries.

xsl:import-schema

The instruction is used to load type information.

xsl:accumulator

Defines the data structures for the fn:accumulator-before() and fn:accumulator-after() functions.

Changes to the text of the 2020 specification:

Add the following sentence to the default query language specification in Annex A:

  • The XSLT1 elements import and include may be used, in the XSLT1 namespace, before the pattern element.

Add the following sentences to the XSLT 2.0 query language specification in Annex H:

  • The XSLT2 elements import and include may be used, in the XSLT2 namespace, before the pattern element.

  • The XSLT2 element import-schema may be used, in the XSLT2 namespace, before the pattern element.

Add the following sentences to the XSLT 3.0 query language specification in Annex J:

  • The XSLT3 elements import, include, and use-package may be used, in the XSLT3 namespace, before the pattern element.

  • The XSLT3 element import-schema may be used, in the XSLT3 namespace, before the pattern element.

  • The XSLT3 element accumulator may be used, in the XSLT3 namespace, before the pattern element.


rjelliffe commented 3 years ago

Apologies.

My previous response was for #6.

My suggestion for #4 is that adding type awareness is a big deal as far as the implementation goes, especially for one that is not based on Saxon.

So I would suggest that there be "xslt2-basic" and "xslt3-basic" for Schematron that is not type-aware (i.e. does not use schemas), if you want to allow XML schemas in the simple binding. Or, alternatively, "xslt2-typed" and "xslt3-typed" for the typed version.

Regards, Rick Jelliffe


rjelliffe commented 2 years ago

I think there are four language design principles:

First, Schematron should only allow foreign elements related to the QLB if Schematron does not provide the elements itself.

Second, where an intended query language has several dialects, then there needs to be at least a QLB for the minimal dialect. The QLB should be enough to distinguish whether a particular engine (e.g. XSLT) supports the features needed.

Third, Schematron should not support conforming schemas where there are platform dependencies even with the same QLB.

Fourth, if a QLB requires extra attributes (rather than extra elements) and does not provide namespaced versions of the attributes, then these are good candidates to extend Schematron with.

Therefore:

1) Schematron should not support xslt:import or xslt:include. We already have DSDL include, XML's include, and Schematron's include. A standard schema must be portable between implementations, which means it cannot resort to arbitrary code pulling things in from some host language. The more XSLT elements are allowed, the more likely it is that you have a conceptual problem: for example, if you are using XSLT to process elements in the middle of making an assertion, then perhaps you should have run it to decorate the incoming document with extra information, or perhaps you should have split your Schematron schema into two, so that the first extracts information and the second validates the document with access to the SVRL information of the first.

2) So for XSLT3, it is clear that the no-schema dialect, the with-schema dialect, and the streaming dialect are all different things. So they each need a distinct QLB, e.g., xslt3-basic, xslt3-typed, and xslt3-streaming. (It is not clear that there are XPath engine implementations that use XPath 3.0 rather than 3.1, so that does not seem to be a dialect issue.) If I am given a Schematron schema, and it says the QLB is xslt3-typed, I can check whether my implementation supports it before running it and finding that it infuriatingly fails.
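A sketch of what a schema under the suggested xslt3-typed binding might look like. The binding name is only the one floated in this thread and is not standardised, so current processors would likely reject it. The order namespace, the order.xsd location and the type ord:OrderType are hypothetical, and the type tests only work if a schema-aware processor has validated the input.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            queryBinding="xslt3-typed">

  <sch:ns prefix="xs" uri="http://www.w3.org/2001/XMLSchema"/>
  <!-- Hypothetical namespace of the documents being validated -->
  <sch:ns prefix="ord" uri="http://example.com/ns/order"/>

  <!-- Load type information; requires a schema-aware XSLT processor -->
  <xsl:import-schema namespace="http://example.com/ns/order"
                     schema-location="order.xsd"/>

  <sch:pattern>
    <!-- Fires only for elements annotated with the assumed type ord:OrderType -->
    <sch:rule context="element(*, ord:OrderType)">
      <sch:assert test="ord:total instance of element(*, xs:decimal)">
        The total of an order must be typed as xs:decimal.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

An engine that only implements the basic dialect could then refuse this schema based on the queryBinding value alone, before trying to compile it.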

3) I worked at an organization once, repairing Schematron schemas which were all of the kind

where they then implemented all their tests in Java. So this schema was utterly non-portable to run on some other platform.

4) RDF made the mistake of thinking that if an element had a qualified name, unqualified attributes could be transferred as if they were qualified with the same namespace. This was bogus. So for XSLT 3, to support xsl:variable/@as it is not appropriate to have sch:let/@xs:as. Instead Schematron should define sch:let/@as as part of standard Schematron, and alter the requirements for the QLB definition document to state that it should say whether the attribute is used, what it is used for, and what it contains.
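A sketch of the suggested sch:let/@as, assuming the attribute takes a sequence type as xsl:variable/@as does. The attribute is a proposal and not part of the 2020 standard; the order and line elements are invented for the example.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt3">

  <sch:ns prefix="xs" uri="http://www.w3.org/2001/XMLSchema"/>

  <sch:pattern>
    <sch:rule context="order">
      <!-- Proposed: unqualified as attribute declaring the variable's type -->
      <sch:let name="line-count" value="count(line)" as="xs:integer"/>
      <sch:assert test="$line-count ge 1">
        An order must contain at least one line.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```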

tgraham-antenna commented 2 years ago

...

  1. Schematron should not support xslt:import or xslt:include.

I think that ship has sailed, and it's now time to pave the cowpaths. (To really mix my metaphors!)

Actually, I'd be happy to ban xsl:include but allow xsl:import because an imported file explicitly has lower precedence and because (in XSLT) it's required to be at the top of the importing file.

People have used xsl:import because it's convenient and because it makes it possible to separate out the messy implementation details of setting up for Schematron to test something. For example, @nigelwhitaker's cals-table-functions.xsl hides the details of things like getting from a table cell to its colspec. Not the sort of thing I'd want to have to scroll past every time that I look at a Schematron file.

... for example, if you are using XSLT to process elements in the middle of making an assertion, then perhaps you should have run it to decorate the incoming document with extra information, or perhaps you should have split your Schematron schema into two so that the first extracts information and the second validates the document with access to the SVRL information of the first.

focheck uses a 1,500-line REx-generated parser in XSLT 2.0 ^1 for parsing XSL-FO property value expressions for checking with Schematron. There's also a 600-line XSLT file ^2 for reducing the parse tree into something manageable. Annotating XSL-FO markup with even the reduced parser output would make the XSL-FO even less readable than it already is for no real benefit compared to using a function in the actual Schematron. Rewriting the REx-generated parser into something that can legally be embedded in a Schematron file isn't my idea of a good time, and having to scroll past all of that to see the actual Schematron rules every time I open the file is even less my idea of a good time.

tgraham-antenna commented 2 years ago

Also, focheck automatically runs full-time when editing an XSL-FO file in Oxygen, which couldn't happen if it was a two-stage process.

rjelliffe commented 2 years ago

(Markdown crapulosity too much: deleting and re-adding below.)

rjelliffe commented 2 years ago

Also, focheck automatically runs full-time when editing a Schematron file in Oxygen, which couldn't happen if it was a two-stage process.

?? The two passes can be done in the same stage. We don't have the XSLT 1 limitation.

cals-table-functions.xsl hides the details of things like getting from a table cell to its colspec.

A serious proposal: let's just get rid of ALL xslt:* foreign elements in Schematron! (Remembering that XPath3 allows functions and variables inside XPaths, so the need for foreign xslt: elements is less for users of XPath3.)
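For instance, a sketch of an assertion that declares a variable and an inline function inside the XPath expression itself, with no foreign elements. It assumes an XPath 3.0 capable binding (xslt3 here); the table, row and cell names are invented.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt3">
  <sch:pattern>
    <sch:rule context="table">
      <!-- $width is an inline function; $expected is the width of the first row -->
      <sch:assert test="let $width := function($row) { count($row/cell) },
                            $expected := $width(row[1])
                        return every $r in row satisfies $width($r) eq $expected">
        Every row of a table must have the same number of cells.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Such a function is only visible inside the one expression that declares it, so it cannot be shared between assertions the way an xsl:function can.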

I am moving this to a new issue.

Rick

Arithmeticus commented 2 years ago

@rjelliffe Provide link to that new issue?

tgraham-antenna commented 2 years ago

@rjelliffe Provide link to that new issue?

#45, "Enhance sch:let to support functions, accumulators and keys"

AndrewSales commented 9 months ago

@dmj 's proposed changes incorporated into latest draft.