Schematron / schematron

Schematron "skeleton" - XSLT implementation
MIT License

can any XSLT element be used in Schematron? #42

Open rvdb opened 7 years ago

rvdb commented 7 years ago

Apologies if this is not the right place for questions, but I couldn't find a better place. I am writing an ISO Schematron file for validating XML files with the Oxygen XML editor, which is making use of the Skeleton implementation. In doing so, I've noticed how any XSLT elements I throw into the Schematron are just processed without any complaint or error. I'm very happy with this as well, but would like to know whether or not this is a stable feature I can rely on.

If I'm reading the spec correctly, Annex H: Query language binding for XSLT 2 only explicitly lists the following XSLT instructions:

Yet, I've created a test Schematron file in this gist that just retrieves the value of an <xsl:variable>, via <xsl:for-each>, <xsl:call-template>, and <xsl:function>. Even <xsl:include> can be used to import external XSLT stylesheets in a Schematron file.
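For illustration, here is a minimal sketch of the kind of schema described above. It is not the content of the gist. It places a foreign `xsl:function` at the top level of the schema and calls it from an assertion. The `my` prefix, its namespace URI, the function name, and the `title` element are all hypothetical.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            xmlns:my="http://example.org/my-functions"
            queryBinding="xslt2">

  <!-- Hypothetical namespace for the helper function. sch:ns makes the
       prefix available to the XPath expressions in the compiled schema;
       the xmlns:my declaration above keeps the function's QName
       resolvable in this document as well. -->
  <sch:ns prefix="my" uri="http://example.org/my-functions"/>

  <!-- Foreign element: plain XSLT 2.0, not part of Schematron itself -->
  <xsl:function name="my:is-blank">
    <xsl:param name="value"/>
    <xsl:sequence select="normalize-space(string($value)) = ''"/>
  </xsl:function>

  <sch:pattern>
    <sch:rule context="title">
      <!-- The assertion calls the XSLT function like any XPath function -->
      <sch:assert test="not(my:is-blank(.))">A title must contain text.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Whether a given implementation passes the `xsl:function` through to its generated stylesheet is exactly the question raised in this issue, so treat this as something to try rather than as guaranteed behaviour.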

Hence, my question: can I safely assume that the use of XSLT elements in Schematron is just supported by the Skeleton implementation? If so, is it documented anywhere?

tgraham-antenna commented 7 years ago

I don't know that it is documented. @rjelliffe would have to tell you about its future stability.

rjelliffe commented 7 years ago

Yes, the Query Language Bindings for XSLT and XSLT2 do allow specific XSLT elements, so every implementation of the QLB must support them.

However, the schema for Schematron is 'open' for foreign elements, meaning you can extend it however you wish in certain places. The price of using elements not specified in the QLB is that there is no guarantee that they will work in a different implementation of the same QLB.

Yes, you can assume that if you compile and run your schema on an XSLT2 system using the stylesheet, including foreign XSLT2 elements that are valid against the Schematron schema, the stylesheet should compile and should run with appropriate scoping, if it does so now. I have had no requests to reduce or enhance this behaviour.

However, I strongly urge you not to use xsl:template or xsl:for-each in the stylesheet. The more that you extend Schematron, the less value it has, and you will just end up with an eccentric, difficult-to-maintain XSLT script.

A better approach is layering. Is your code really to do with finding patterns, or reporting on them?

Here are three suggestions which may or may not be relevant:

  1. If the problem is that the patterns are highly complex, consider preprocessing the input document with XSLT to add extra attributes or PIs with hints or keys or links or counts or labels. The decorator pattern. Then your schema has all the information it needs when it needs it, and you have given names to underlying characteristics.

  2. If the problem is that you need to report in some more complex format, convert the SVRL into a stylesheet with templates matching the paths in each failed assertion and run that on the input. The visitor pattern.

  3. If you need to do this for multiple different schemas, use the skeleton API to make a custom application so that you don't need to complicate the schema.
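As a rough sketch of the first suggestion (the decorator pattern), a preprocessing stylesheet could copy the input unchanged and add a computed attribute that the schema then tests with a simple expression. The `section` and `para` elements and the `para-count` attribute are hypothetical names chosen for the example.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything as-is by default -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Decorate each section with a precomputed count of its paragraphs.
       The new attribute is written before any child nodes are copied. -->
  <xsl:template match="section">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:attribute name="para-count" select="count(.//para)"/>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The schema run on the decorated document can then use a test such as `@para-count > 0` instead of repeating the counting logic in every assertion.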

If you do need to use other XSLT, then hide them in a function as much as possible.

However, very often people use XSLT elements when they should first be asking "can I do this in XPath, with sch:let variables?" I once saw a Schematron schema with assertions that just called Java functions: that is what the developer knew.
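A minimal sketch of that XPath-first approach, using only Schematron elements: an intermediate value is bound with `sch:let` and reused in the assertion and its message. The `order` and `item` elements, the `price` attribute, and the limit of 1000 are hypothetical.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">
  <sch:pattern>
    <sch:rule context="order">
      <!-- Rule-scoped variable: evaluated once per matched order -->
      <sch:let name="total" value="sum(item/@price)"/>
      <sch:assert test="$total le 1000">
        Order total <sch:value-of select="$total"/> exceeds the limit of 1000.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Nothing here is foreign to Schematron, so it should not depend on how a particular implementation treats XSLT elements.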

rvdb commented 7 years ago

Many thanks for your thoughts and suggestions! It's reassuring to know that this can be considered a stable feature in the Skeleton implementation. My actual use case is rather pragmatic: inside my Schematron patterns, I'm embedding Schematron Quick Fixes in order to suggest automatic corrections to the source document for errors flagged in the corresponding Schematron pattern. One common operation is wrapping existing content in a new XML element, by copying it with the SQF <sqf:copy-of/> element. Yet, in doing so, all default attributes defined in the source document's schema are being explicitly added to the copied elements, which might even confuse the human editors of the document more than help them.

Therefore, I've experimented by creating a named XSLT template (could be a function as well) that applies a modified identity transformation with a couple of <xsl:template> instructions that suppress the unwanted default attributes from the copied result. This works fine, and by separating these pure XSLT operations in a separate XSLT stylesheet that I'm importing into the Schematron file via <xsl:import/>, this provides a mechanism to keep the Schematron file clean of XSLT 'clutter'.
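A sketch of what such an imported stylesheet might look like, under stated assumptions: the template name, the mode name, and the two suppressed attributes (`part="N"` and `instant="false"`) are hypothetical examples, not taken from the actual stylesheet. Because XPath cannot tell a defaulted attribute from one the author typed, the suppressing template matches on attribute name and default value.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Named entry point: copies the given nodes without default attributes -->
  <xsl:template name="copy-without-defaults">
    <xsl:param name="nodes" select="."/>
    <xsl:apply-templates select="$nodes" mode="strip-defaults"/>
  </xsl:template>

  <!-- Modified identity transformation, kept in its own mode so it does
       not interfere with any other templates -->
  <xsl:template match="@* | node()" mode="strip-defaults">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="strip-defaults"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template: attributes carrying their (assumed) default value
       are dropped from the copy -->
  <xsl:template match="@part[. = 'N'] | @instant[. = 'false']"
                mode="strip-defaults"/>
</xsl:stylesheet>
```

One consequence of matching by value is that an attribute the author explicitly set to its default value is dropped as well.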

rjelliffe commented 7 years ago

Yes, that is a shortcoming in the XML data model and the XPath data model. Contrast it with the old OmniMark SGML processor, which let you test if an attribute was implied or specified. I have mentioned this to Michael Kay, and he suggested I put it to the XSLT WG.

But I don't see there is any chance of success: XPath implementations rely on standard XML libraries and there is no way XML will change to add this, unfortunately. (It could go into the DOM as user data, I suppose.)

So the only way to do it is to post-process the data.

Rick


chansey97 commented 6 years ago

Hi @rjelliffe, "If you do need to use other XSLT, then hide them in a function as much as possible." If I have some utility functions in an XSLT file, can I safely import the stylesheet in Schematron by using xsl:include/xsl:import? (I found that xsl:import doesn't work, but xsl:include does.) Thanks.

rjelliffe commented 6 years ago

Whether it works or not depends on the engine and implementation not the standard. The standard allows foreign elements, but whether or how an implementation supports any particular elements is not part of the standard. That is the deal with an "open" schema: the language says "yes you may" but it does not imply "yes it will work": you have to try it to see.

As to whether the skeleton implementation supports /sch:schema/xsl:include, I expect it would. But you have to write your XSLT code knowing what the XSLT produced by the Schematron compilation expects. This is particularly true of namespaces: you should always declare them with sch:ns rather than rely on xmlns:XXX.

The difference between sch:include/sch:import and xsl:include/xsl:import, viewed as macros, is that the former are performed at Schematron compile time and the latter at XSLT compile time (which is usually at runtime).
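To make the two inclusion times concrete, here is a hedged sketch of a schema that uses both mechanisms. The file names, the `util` prefix and namespace URI, and the `util:is-valid-id` function are hypothetical. The sketch assumes that the implementation passes the foreign `xsl:include` through to the generated stylesheet.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            queryBinding="xslt2">

  <!-- Declares the prefix for the compiled schema, as advised above -->
  <sch:ns prefix="util" uri="http://example.org/util"/>

  <!-- XSLT compile time: copied into the generated stylesheet and only
       resolved when the XSLT processor compiles that stylesheet -->
  <xsl:include href="util-functions.xsl"/>

  <!-- Schematron compile time: replaced by the referenced content
       (here assumed to be a sch:pattern) before any XSLT is generated -->
  <sch:include href="shared-patterns.sch"/>

  <sch:pattern>
    <sch:rule context="id">
      <!-- util:is-valid-id is assumed to be defined in util-functions.xsl -->
      <sch:assert test="util:is-valid-id(.)">The id is not valid.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

One practical consequence of the later resolution: depending on how the pipeline is set up, a relative `href` on `xsl:include` may be resolved against the location of the generated stylesheet rather than that of the schema, so it is worth checking where the processor actually looks for `util-functions.xsl`.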

Rick
