Schematron / schematron

Schematron "skeleton" - XSLT implementation
MIT License

Enhancement: #all phases but stop when one fails #21

Open rjelliffe opened 7 years ago

rjelliffe commented 7 years ago

Nigel Whitaker, on the Schematron-love-in mail group, has requested a variant of #ALL in which, once one phase fails, the remaining phases are not validated. This does not require any extra syntax in the schema; it would be a command-line implementation switch.

Nigel gives code below. I think a more general implementation would be a nested invocation so that instead of generating:

It generates something like

... ditto for M13 and M14

------- Nigel's comments:

Over a year ago I wrote about an issue I was having with progressive validation. George and Rick provided useful replies which I pondered for far too long - sorry! But recently the issue was rekindled because we've had both enhancement requests from some of our customers and also external requests to get the code finished and released (with an Open Source license).

The code I'm talking about is now here: http://code.google.com/p/cals-table-schematron/ It checks, as far as I'm aware, all of the semantic rules for CALS tables.

George suggested a pattern for including or repeating patterns in the phases. In my code it looks like this:

This works, in that it does get things in the right order. So if I do or by passing the phase param on the command-line I get the desired order.

I looked at the generated code and found that the skeleton generates the same code as phase=#ALL, which runs the phases in document order. So in both cases I end up with this generated code for running the four phases (you can get additional code with some skeleton params):

The other problem was stopping after one phase failed (because the following phases were written assuming things like referential integrity, e.g. entry/@colname references pointing to a single colspec in CALS). As it currently stands users can end up seeing things like this:

"An empty sequence is not allowed as the first argument of cals:colnum()"

which is an internal function used for structural checking, as I've assumed that when doing structural checking all references can be resolved.

I thought about three possible ways of implementing the "stopping after a failed phase" behaviour:

I did consider XProc. It may be possible to define a step that runs each phase (perhaps from a 'phases' param/option that specifies the ordered list of phases) with p:xslt and looks at the SVRL to check for failures. I didn't go this route because (a) I lacked XProc skills and (b) I was concerned about the performance issues: for 4 phases I couldn't see how to avoid 4 XSLT transforms, and thus compiling the generated XSLT 4 times and also parsing the instance file being checked 4 times.

Another technique I considered, primarily for performance, was to write a multi-phase-runner in Java using Saxon s9api. This would compile the XSLT once, load the input XML using a DocumentBuilder into an XdmNode and run 4 transformations, checking the XdmNode result by running an XPath query on the SVRL from the 4 runs.

In the end the technique we're currently using (in some of our software) involves modifying the XSLT generated by the XSLT2 skeleton. So the example above would become:

...

And the code for each assertion then sets the 'variable' with the addition of the saxon:assign statement, in the example below:

The colnames of the colspecs in a tgroup () must be unique CALS-T10R4B

It's using saxon:assign and that's bypassing functional XSLT, which I'm not happy with or proud of! I think it's also a PE/EE feature of Saxon and won't work in HE. So while my XSLT hacking works for our purposes, I'm not sure it would be an acceptable contribution to the main skeleton code?

I did try Rick's suggestion of xsl:message/@terminate='yes', but that was equally messy as it terminated the XSLT process and you didn't necessarily get any SVRL output from the failed phase.
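
For illustration only, the "run the phases in order, stop after the first one that fails" control flow that the XProc and s9api ideas share can be sketched outside XSLT. This is a minimal Python sketch, not code from the skeleton or from Nigel's project. The names `phase_failed`, `run_phases_until_failure` and `run_phase`, and the phase names in the demo, are hypothetical. It also assumes that a phase counts as failed when its SVRL contains at least one `svrl:failed-assert` element; whether `svrl:successful-report` should also count is left open.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, List, Tuple

# Namespace of the Schematron Validation Report Language (SVRL).
SVRL_NS = "http://purl.oclc.org/dsdl/svrl"


def phase_failed(svrl_xml: str) -> bool:
    """Return True if the SVRL report contains at least one failed assertion.

    Assumption: only svrl:failed-assert marks a failure.
    """
    root = ET.fromstring(svrl_xml)
    return root.find(f".//{{{SVRL_NS}}}failed-assert") is not None


def run_phases_until_failure(
    phases: Iterable[str],
    run_phase: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Run phases in the given order and stop after the first failing one.

    `run_phase` is a hypothetical callable that validates the instance
    against one phase and returns the SVRL output as a string. The SVRL of
    the failing phase is kept, so its messages are still available.
    """
    results: List[Tuple[str, str]] = []
    for phase in phases:
        svrl = run_phase(phase)
        results.append((phase, svrl))
        if phase_failed(svrl):
            break  # later phases assume this one passed, so skip them
    return results


if __name__ == "__main__":
    # Stub runner standing in for a real transform; phase names are made up.
    ok = f'<svrl:schematron-output xmlns:svrl="{SVRL_NS}"/>'
    bad = (
        f'<svrl:schematron-output xmlns:svrl="{SVRL_NS}">'
        '<svrl:failed-assert test="false()"/>'
        "</svrl:schematron-output>"
    )
    reports = run_phases_until_failure(
        ["context", "reference", "structure"],
        lambda phase: bad if phase == "reference" else ok,
    )
    print([name for name, _ in reports])
```

Unlike the xsl:message/@terminate='yes' approach, this keeps the SVRL from the failing phase. It does nothing by itself about the performance concern: `run_phase` would need to reuse one compiled stylesheet and one parsed instance document, as in the s9api idea, to avoid repeating that work for every phase.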