Schematron / schematron-enhancement-proposals

This repository collects proposals to enhance Schematron beyond the ISO specification

Add postconditions for variables and contexts: @post-condition #73

Open rjelliffe opened 3 months ago

rjelliffe commented 3 months ago

(Added: In my Schematron users meeting presentation [Prague 2024] I identified this proposal as one of the most important IMHO.)

It can be hard, especially for newcomers or rare users, to have confidence that a complex XPath is working the way it should. Indeed, as a matter of good software engineering, the more important ("risky") some code is, the more that you want to have some independent (i.e., redundant) check of it. This is of course well-known since Bertrand Meyer, and a rationale for Schematron itself.

So would Schematron be better if it allowed internal assertions on its own XPaths? I think so, and I think it can be trivially implemented (over XSLT) without neutralizing optimized-lazy evaluation. It would complement e.g. sch:let/@as, which allows a level of typing.

In concrete terms the proposal is that sch:let and sch:rule allow another attribute @post-condition which takes an XPath expression that evaluates to boolean. The context for this XPath is the variable value or the rule context.

The evaluation of the @post-condition would not (necessarily) go into the SVRL: the document's validity result is unchanged whether or not these post-conditions are enabled. It is intended for developer information, confidence and debugging, not for the end-user of the schema. It would generate implementation-dependent information, e.g. on Standard Error output (e.g. xsl:message), to a log file, or for an IDE.

Here are two examples:

<sch:rule context="*[@id]"   post-condition="string-length(normalize-space(@id)) ne 0"  > ...

In this example, a rule selects all elements that have an @id attribute. However, the developer expects that these all contain non-empty values: the post-condition makes this explicit. We don't want to use sch:assert elements for this, because it is a programmer-world thing not a user-world thing.

<sch:let name="post-code-list" value="document('post-codes.xml')"  post-condition="/post-codes[@version='2024']"  />

In this, the document is read in. (And any exceptions are swallowed, or logged.) Then the condition is tested. If there was no document or the wrong one, the post-condition will fail and the failure logged. The implementation can warn the user there has been this problem (e.g. in this pattern) and not produce a result of "valid".

This is a partial fix for the problem that XPath functions can generate exceptions, but Schematron has no mechanism to cope. For example, if trying to parse a number and it is not a number, we put the code into a variable first. The parse fails and generates an exception which is swallowed or fails. Then we check the value using @post-condition so that we are not beholden to the way the engine implements exception handling.

<sch:let name="my-safe-number"  value="number(/*/@some-code)"  post-condition="number(.)" />

Another example: for helping with complex chains of variables:

<sch:let name="var1" value="//thing" />
<sch:let name="var2" value="$var1/child::*[1]" />
<sch:let name="var3" value="$var2/child::*[1]"  post-condition="count(.) = count($var1)"   />

which might be implemented as:

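<xsl:variable name="var1"                         select="//thing" />
<xsl:variable name="var2"                         select="$var1/child::*[1]" />
<xsl:variable name="var3-23423423420" select="$var2/child::*[1]"  />
<!-- as="node()*" with xsl:sequence returns the original nodes rather than copies in a temporary tree -->
<xsl:variable name="var3" as="node()*">
        <!-- emit the message only when the post-condition does NOT hold -->
        <xsl:if test="not(count($var3-23423423420) = count($var1))">
              <xsl:message>Post-condition failed: .... </xsl:message>
        </xsl:if>
        <xsl:sequence select="$var3-23423423420" />
</xsl:variable>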

(Not debugged. You get the idea. The double handling of var3 is to maintain lazy-evaluation.)

In this case, the developer believes it to be the case that every "thing" has a grandchild element, which simplifies the cases they need to make assertions for. But the developer wants to be able to check this during testing, and not make it something that invades the user's diagnostics. (They could do this using a dedicated phase too, if they wanted full diagnostics, but they might find that bad separation of concerns in their specific scenario.)
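
For comparison, the dedicated-phase alternative mentioned above can be sketched in standard Schematron roughly as follows. This is only an illustration under assumptions: the phase ids, pattern ids, the role value and the user-facing "id" rule are all hypothetical names, not part of the proposal.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">

  <!-- Hypothetical phase: only the user-facing rules run -->
  <sch:phase id="production">
    <sch:active pattern="user-rules"/>
  </sch:phase>

  <!-- Hypothetical phase: user-facing rules plus developer expectations -->
  <sch:phase id="developer">
    <sch:active pattern="user-rules"/>
    <sch:active pattern="developer-expectations"/>
  </sch:phase>

  <sch:pattern id="user-rules">
    <sch:rule context="thing">
      <!-- Illustrative user-world assertion (an assumption, not from the thread) -->
      <sch:assert test="@id">A thing must have an id.</sch:assert>
    </sch:rule>
  </sch:pattern>

  <sch:pattern id="developer-expectations">
    <sch:rule context="thing">
      <!-- The developer's belief: every thing has a first child with a first child -->
      <sch:assert test="*[1]/*[1]" role="developer-expectation">Developer expectation not met: this thing has no grandchild element.</sch:assert>
    </sch:rule>
  </sch:pattern>

</sch:schema>
```

The trade-off is the one noted above: in the "developer" phase a failed expectation is reported through the same SVRL channel as user diagnostics and counts against validity, which is exactly what @post-condition is meant to avoid.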

Regards Rick

rjelliffe commented 3 months ago

Alternate names to @post-condition might be @confirm or @expect or @assume.

AndrewSales commented 3 months ago

I see this proposal as relating to two slightly different, but related, things: unit testing and exception handling.

There are mature testing frameworks, such as XSpec, where this kind of thing can already be accommodated. It's good practice and probably better for the programmer to amass a set of test cases that cause exceptions to be raised.

If you are worried about the exception handling provided by the implementation you are using, you can write your own function and handle exceptions (differently - perhaps more gracefully) there. XSpec can also test if your functions are working correctly, of course.

rjelliffe commented 3 months ago

I think it relates to more than unit testing and exceptions.

  1. Unit testing is based on coarse tests where you run a canned representative example to get expected results: after the first run, their aim becomes not so much proving "does this work" as detecting "have I broken something that worked before?". Unit tests work well in scenarios where there is little variation or surprise or combinatorial explosion in inputs, and are strictly test-time things. In contrast, post-conditions are useful in the opposite situation, where the variety of input means you need either to do exhaustive tests (testing internal invariants rather than external units) or to never disable the tests until after the system is mature (what the QA people call "quality-in-use").

And post-conditions work at a different scale than unit tests, e.g. at variable scale. The external "unit" of Schematron is the assertion: variables or contexts cannot be checked, except by adding assertions for the purpose. Which then means you need to put in place a mechanism to shield these assertion fails from the user, or to turn them on and off.

To me, just as you would not say (in general) that we don't need Schematron when we can use unit tests, I think we cannot say (in general) that post-conditions can be replaced by unit tests over tricky Schematron schemas.

As I mention, when you have combinatorial explosion (e.g. the standard case of rich text such as legal publishing) a large set of test cases gives a false sense of security. I have worked on several systems where we had to test against all previous inputs (tens of thousands of documents) and even then would find uncoped-with scenarios in the next incoming set of documents. (Typically where a new data source had been integrated way up the line, perhaps in a different country by a different team.)

  2. I don't think that writing functions to cope with exceptions may be always workable, because it introduces complication in the very place where the exceptions are: instead of a standard method that an IDE can integrate, every schema is potentially different. And there are certainly developers who understand XPaths well enough, but not function definition: Schematron needs to provide a value-add over XSLT to be viable.

My original thought was indeed to provide a function safe-number() which would provide more possibilities for handling NaN exceptions, but it risked being a band-aid. That being said, there might be some better approach to exceptions: which are in particular file-not-found exceptions and NaN exceptions, in my experience.

But I think we do need to check post-conditions as close to the variable declaration as possible. We want to easily see what the developer decided was not necessary to cope with when they wrote their XPaths: that they believed the value of one variable would have the same number of items as the value of the variable it was using as an input, for example.

For the syntax: I thought of allowing sch:let/sch:assert instead, only tested when the variable was used: it is fine by me too, but I thought @post-condition was less intrusive. Using an element or attribute here is not critical.

@as goes some way (even though it is perhaps more really needed for type coercion or to prevent taking of values) but it does not cope with co-occurrence constraints, which @post-condition does.

To put it another way, wherever any language is used for mission-critical operations with any complexity (either complex processing or widely-varying inputs) you need to add redundant checks at the level of granularity of the risk.

Regards Rick

AndrewSales commented 3 months ago

Then we disagree fundamentally about the purpose of unit testing. The idea of disabling the tests once the system is mature alarms me, since in my day job I am dealing with unpredictable, human-authored input that can vary greatly. We write our last test when we have fixed the last bug.

I have worked on several systems where we had to test against all previous inputs (tens of thousands of documents) and even then would find uncoped-with scenarios in the next incoming set of documents.

Me too, and I continue to. It is the way of things, which this proposal can't change.

I don't think what you propose is a bad idea, just that it doesn't solve the problem. I think it is a problem in any case that can only be mitigated. You open with the challenges of a complex XPath, but that post-condition XPath is only going to get more complex as it needs to accommodate more scenarios. It would provide a sense of security no more true than corresponding unit tests would. I would, as I say, address this with additional test cases to describe unforeseen scenarios as they arise, and amend the schema to reflect them as needed.

This is a partial fix for the problem that XPath functions can generate exceptions, but Schematron has no mechanism to cope.

Well, there is the if...then...else... error(...) approach, but the standard discourages the use of error(). Perhaps we need some runtime linkage that does allow user-defined exceptions to be handled by the implementation and consistently reported as SVRL...
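
A minimal sketch of that if...then...else... error(...) approach, assuming an XSLT 2 query binding; the attribute name some-code is taken from the earlier example, while the rule context, the error message and the assertion are illustrative assumptions:

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <!-- Bind the xs prefix so that xs:double can be used in XPath expressions -->
  <sch:ns prefix="xs" uri="http://www.w3.org/2001/XMLSchema"/>

  <sch:pattern>
    <sch:rule context="/*">
      <!-- Raise an explicit dynamic error when @some-code is missing or not numeric.
           The empty first argument of error() selects the default error code. -->
      <sch:let name="my-safe-number"
               value="if (@some-code castable as xs:double)
                      then xs:double(@some-code)
                      else error((), concat('some-code is not a number: ', @some-code))"/>
      <!-- Illustrative assertion that uses the checked value -->
      <sch:assert test="$my-safe-number ge 0">some-code must not be negative.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

How the raised error surfaces (aborted run, message, or something in the SVRL) is left to the implementation, which is the gap being discussed here.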

rjelliffe commented 3 months ago

A unit test says "given some specific input X expect specific output Y". An assertion says "for every possible A, some invariant B should hold." Not the same things.

An assertion simplifies coding and understanding by carving off situations that are not expected to occur. A unit test is a sanity check that some function has produced a plausible result.

Rick

AndrewSales commented 3 months ago

A unit test says "given some specific input X expect specific output Y". An assertion says "for every possible A, some invariant B should hold." Not the same things.

I'm well aware of the difference.

As I said above, I don't think this kind of assertion addresses the issue of unpredictable input.

Assertions in other languages can typically be enabled or disabled at execution time, and if enabled, will often halt processing. If we do have assertions, I think implementations ought to be configurable in this respect.

A common case I've come across is a runtime error where an atomic value was expected by a function, but a sequence was passed instead. This can occur also e.g. in message construction, with <value-of/>. Would we want assertions in such places too?

I think it would be good to refine the expected behaviour and prospective reporting of errors, if this is to be standardised.

I'd be interested in input from the wider community about this as a feature. XML Prague and the Schematron Users Meetup are around the corner, which is one suitable forum.

rjelliffe commented 3 months ago

On Wed, 15 May 2024, 04:52 Andrew Sales, @.***> wrote

As I said above, I don't think this kind of assertion addresses the issue of unpredictable input.

It probably depends on the kind of unpredictability, a la Donald Rumsfeld. But I still don't understand Andrew's point, sorry, unless he is saying a developer using this may not cover all cases, or be a matter of discipline: that's life, isn't it?

Assertions in other languages can typically be enabled or disabled at execution time, and if enabled, will often halt processing. If we do have assertions, I think implementations ought to be configurable in this respect.

Certainly.

A common case I've come across is a runtime error where an atomic value was expected by a function, but a sequence was passed instead. This can occur also e.g. in message construction, with <value-of/>. Would we want assertions in such places too?

I think having post-conditions on sch:let and sch:rule is enough, but if there was utility in having it elsewhere, I don't see it would do harm. In the case of sch:value-of, someone might want pre-conditions as well as post-conditions.

Static type checking of signatures within XPaths, like OxygenXml does, is a different issue, I think. That is where @as connects better.

I think it would be good to refine the expected behaviour and prospective reporting of errors, if this is to be standardised.

To an extent, yes. But the initial target users would be IDE and pipeline integrators (xproc, ant, etc).

As Andrew mentioned, it might be that @post-condition or @assert should not be a simple boolean, but e.g. allow error(). Or even allow it to generate a string on failure:

  @post-condition="if(*) then true()
        else 'Programming assumption not met: no child named-nodes.'"

I'd be interested in input from the wider community about this as a feature. XML Prague http://xmlprague.cz and the Schematron Users Meetup are around the corner, which is one suitable forum.

Good idea.

Rick

rjelliffe commented 3 months ago

Another approach, with different wins, would be to allow a new optional attribute on assertions: @to which is the subsystem or role that should be informed.

E.g. <sch:assert test=" parent::robin"
to="log" severity="fatal" role="developer_expectation">Oh dear</sch:assert>

Above, the assertion failure is logged, as well as going to SVRL.

 <sch:report test="preceding-sibling::coco" 
    to="mailto:fred@eg.com" severity="info"
    role="possible-version-violation">Found a preceding-sibling coco, most
    likely this is old data that may be sourced in error.</sch:report>

Above, the assertion text gets emailed.

 <sch:assert test=" count($input-paras) eq count($output-paras)" 
     to="svrl"
     severity="error"  role="issue-for-devops">

Above, the assertion failure just goes to the SVRL (overriding any sch:rule/@to). Optional.

So the @to can have strings to integrate into the workflow. The SVRL also gets @to.

Simple validity continues as no unsuccessful assertions or succeeded reports, regardless of @severity and @to.

I would provide some reserved @to's:

  log - message goes to log file
  error - message goes to std err
  out - message goes to std out (console)
  svrl - (default)
  owner - owner of process
  user - human
  mail:x - mail
  http:x - send to the URL in argument (ab)using HTTP GET
  post:x - send to URL x using HTTP POST - reserved
  put:x - send to URL x using HTTP PUT - reserved

AFAIK XSLT does not allow PUT and POST, and just has GET (e.g. document()). So a @to with "http:..." would send the assertion failure as an argument to GET, and the SVRL would include something from the HTTP response. This allows server-based interaction that bypasses the SVRL, but maintains an audit trail in the SVRL that the notification was received by the server.

Rick

AndrewSales commented 3 months ago

But I still don't understand Andrew's point, sorry, unless he is saying a developer using this may not cover all cases, or be a matter of discipline: that's life, isn't it?

I mean that the perceived utility of an assert will run out quickly for all but the most simple cases and most predictable input.

A real example from just the other day. I was testing out a new rule which worked in isolation but threw a divide-by-zero error when incorporated into the target schema. The cause was my test cases were toys that omitted otherwise required structures. The IDE was able to take me to the point of failure in the XSLT for the compiled schema.

Would an assert have helped me? Possibly, but I found the cause from my IDE anyway. Would I have wanted to put one everywhere in a sizeable schema where division was used? Probably not. If real-world input had caused this, I'd've added an extra condition to the relevant XPath and moved on. Here, I adjusted my test cases and moved on.

Static type checking of signatures within XPaths, like OxygenXml does, is a different issue, I think. That is where @as connects better.

Not static: dynamic. Using string() or concat() for example, where an argument is a sequence because there is unexpectedly more than one of something that needs reporting in the message generated.

To an extent, yes.

It would be critically important to be clear how this feature would affect validity, if at all. I'm not asking for all of this information here and now, I am just noting the need.

tgraham-antenna commented 3 months ago

A unit test says "given some specific input X expect specific output Y". An assertion says "for every possible A, some invariant B should hold." Not the same things.

I'm well aware of the difference.

As I said above, I don't think this kind of assertion addresses the issue of unpredictable input.

I suggest that using 'assertion' as an unqualified term here risks confusion with sch:assert (at least for me).

Assertions in other languages can typically be enabled or disabled at execution time, and if enabled, will often halt processing. If we do have assertions, I think implementations ought to be configurable in this respect.

IME, the [programming language] assertions that can be disabled at execution time tend to be put just before the ordinary code that tries to do something reasonable with the same invalid value (e.g., return early with a null value) so that the developer gets the rude shock where the problem occurs and the user gets let down gently.

(I'm not sure how well the reasonable return value idiom translates to Schematron, where there aren't visible return values as such, just the presence or absence of messages from sch:assert and sch:report.)

It seems to me that @rjelliffe wants the Schematron to (also) be able to deliver the rude shocks, while @AndrewSales would (mostly) leave it to the unit tests. Plus, I think there's general agreement that there will always be one more bug when some user somewhere tries something unexpected. (Antenna House Formatter once had a bug with Latin superscripts in Bulgarian text. Who could have predicted that?)

It might be that myriad unit tests could all fail to exercise something that could be caught by checking a value that is calculated within the sch:rule. (At this point I don't know why you would do anything other than <sch:assert role="debug">, or similar, for it.)

It might also be that a check within the sch:rule never fails anyway, maybe because the checked condition also fails earlier structural validation so the Schematron never sees those documents or because there's an error in the XPaths used in the sch:rule.

So there might be a place for both (though still not seeing the need for a lot of extra machinery for programming language-style debugging assertions).
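
As a rough sketch of the <sch:assert role="debug"> alternative, using the *[@id] example from the start of the thread; the role value "debug" and the message text are conventions assumed here, and filtering on the role would be up to the implementation or to post-processing of the SVRL:

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:pattern>
    <sch:rule context="*[@id]">
      <!-- Developer-facing check, marked with a role so it can be filtered out later -->
      <sch:assert test="string-length(normalize-space(@id)) ne 0" role="debug">Developer check: @id is present but empty.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Unlike the proposed @post-condition, a failure here still lands in the SVRL and still counts against simple validity unless something downstream ignores role="debug".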

A common case I've come across is a runtime error where an atomic value was expected by a function, but a sequence was passed instead. This can occur also e.g. in message construction, with <value-of/>. Would we want assertions in such places too?

I think it would be good to refine the expected behaviour and prospective reporting of errors, if this is to be standardised.

True, although the other approach is to let implementers try things and then standardise what succeeds.

I'd be interested in input from the wider community about this as a feature. XML Prague and the Schematron Users Meetup are around the corner, which is one suitable forum.

Indeed.

rjelliffe commented 3 months ago

On Thu, 16 May 2024, 08:56 Andrew Sales, @.***> wrote:

But I still don't understand Andrew's point, sorry, unless he is saying a developer using this may not cover all cases, or be a matter of discipline: that's life, isn't it?

I mean that the perceived utility of an assert will run out quickly for all but the most simple cases and most predictable input.

Sure. Or it may hit someone's sweetspot. If someone does not want to use them, they don't have to.

In my experience, potentially complicated input requires complicated XPaths. So developers leave out cases they expect will never occur. A way to make the subset they accept explicit could help maintenance, and prevent the variable and context XPaths from being obfuscated with terms that are not expected.

Moreover, the way to uncomplicate XPaths is to use chains of variables: so only having a @assert or @post-condition would encourage more chains. For example, a style guide could require that, unless impossible, all divisions should be done in a variable so that exceptions are properly caught and handled.

A real example from just the other day. I was testing out a new rule which worked in isolation but threw a divide-by-zero error when incorporated into the target schema. The cause was my test cases were toys that omitted otherwise required structures. The IDE was able to take me to the point of failure in the XSLT for the compiled schema.

Would an assert have helped me? Possibly, but I found the cause from my IDE anyway. Would I have wanted to put one everywhere in a sizeable schema where division was used? Probably not. If real-world input had caused this, I'd've added an extra condition to the relevant XPath and moved on. Here, I adjusted my test cases and moved on.

The decision to put in redundant checks, like assertions or post-conditions, would not be based on "would I have wanted to": no-one ever wants to do anything :-) It would be based on risk considerations: the more that some Schematron schema processes high-value high-risk information, or requires diagnosis by ops teams apart from an IDE, the more that adding redundant checks is appropriate.

I had a real example last week too: the DTD allows multiple /document/fragment/properties-section/properties but none of the documents have more than one fragment with a properties-section with the same name. But I smelled a rat. I would have liked to have excluded the case of multiple properties with the same name without palaver.

e.g. <sch:rule context="property" expect="not(preceding-sibling::property[@name = current()/@name])" >...

As I mentioned, another way to view this issue is as one of addressability: how do we make sure that messages go to the person or workflow or log that can deal with them. I think addressability (e.g. by email or message handling) is a feature of many pipeline/message systems but Schematron does not provide an integration point with them.

Static type checking of signatures within XPaths, like OxygenXml does, is a different issue, I think. That is where @as connects better.

Not static: dynamic. Using string() or concat() for example, where an argument is a sequence because there is unexpectedly more than one of something that needs reporting in the message generated.

I think you are agreeing with me. Static is a different issue.

To an extent, yes.

It would be critically important to be clear how this feature would affect validity, if at all. I'm not asking for all of this information here and now, I am just noting the need.

I think I said: not at all. It can be turned on and off as deemed necessary without affecting validity.

At user option, it could be implemented so that exceptions are caught and logged, and the validation proceeds.

Rick

rjelliffe commented 3 months ago

There may be another for the QLBs here: define what happens when there is an error. In my view, the best choice would be that a pattern that generates an error is aborted, by default. But other patterns are not affected. The SVRL would have some element to signal the pattern crashed. The definition for simple validity would be "unable to be validated."
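
One way an XSLT 3.0 based implementation might isolate a failing pattern is sketched below. It is a guess at an implementation strategy, not something the thread specifies: the mode names, the item/@total/@count input vocabulary, and the results and pattern-aborted elements are all hypothetical (in particular, pattern-aborted is not an SVRL element).

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:err="http://www.w3.org/2005/xqt-errors"
    exclude-result-prefixes="xs err">

  <!-- One mode per pattern; nodes without a matching rule are skipped silently -->
  <xsl:mode name="pattern-1" on-no-match="shallow-skip"/>
  <xsl:mode name="pattern-2" on-no-match="shallow-skip"/>

  <xsl:template match="/">
    <results>
      <!-- A dynamic error inside pattern-1 aborts only pattern-1 -->
      <xsl:try>
        <xsl:apply-templates select="/" mode="pattern-1"/>
        <xsl:catch errors="*">
          <xsl:message>Pattern pattern-1 aborted: <xsl:value-of select="$err:description"/></xsl:message>
          <!-- Hypothetical marker that the pattern crashed -->
          <pattern-aborted id="pattern-1"/>
        </xsl:catch>
      </xsl:try>
      <!-- Other patterns are still evaluated -->
      <xsl:apply-templates select="/" mode="pattern-2"/>
    </results>
  </xsl:template>

  <!-- Integer division raises a dynamic error when @count is 0 -->
  <xsl:template match="item" mode="pattern-1">
    <xsl:if test="xs:integer(@total) idiv xs:integer(@count) gt 10">
      <failed-assert location="{path()}"/>
    </xsl:if>
  </xsl:template>

  <xsl:template match="item[not(@total)]" mode="pattern-2">
    <failed-assert location="{path()}"/>
  </xsl:template>

</xsl:stylesheet>
```

With an input item that has count="0", the first pattern would be reported as aborted while the second still runs; whether the overall result is then "unable to be validated" is the definitional question raised above.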

Rick

rjelliffe commented 3 months ago

On Thu, 16 May 2024, 13:13 Tony Graham, @.***> wrote:

I suggest that using 'assertion' as an unqualified term here risks confusion with sch:assert (at least for me).

Yes. Maybe I should have left it as "post-condition".

(I'm not sure how well the reasonable return value idiom translates to Schematron, where there aren't visible return values as such, just the presence or absence of messages from sch:assert and sch:report.)

But sch:let, sch:rule/@context, sch:param/@value (and sch:value-of, and sch:pattern/@documents) do have values that could be tested.

It seems to me that @rjelliffe wants the Schematron to (also) be able to deliver the rude shocks, while @AndrewSales would (mostly) leave it to the unit tests. Plus, I think there's general agreement that there will always be one more bug when some user somewhere tries something unexpected. (Antenna House Formatter once had a bug with Latin superscripts in Bulgarian text. Who could have predicted that?)

Many users are not comfortable with XPaths. They are not confident that, for example, when they split a complex XPath into a chain of variables, they know where or if a mistake has occurred. The "if" is particularly unsettling: has there been no invalidity because an @context never fired? (Part of this can be addressed by counting rule firings in the SVRL, of course, if that is enabled.)

It might be that myriad unit tests could all fail to exercise something that could be caught by checking a value that is calculated within the sch:rule. (At this point I don't know why you would do anything other than <sch:assert role="debug">, or similar, for it.)

Yes, it may be role=debug is enough, if implementors provide a way to enable and disable them.

It might also be that a check within the sch:rule never fails anyway, maybe because the checked condition also fails earlier structural validation so the Schematron never sees those documents or because there's an error in the XPaths used in the sch:rule.

So there might be a place for both (though still not seeing the need for a lot of extra machinery for programming language-style debugging assertions).

I think the @assert or @post-condition can be quite simple to implement.

I am probably swinging around to allowing sch:expect elements with the same form as sch:assert, under e.g. let, rule and probably param. And with a @to attribute for addressing.

Rick

AndrewSales commented 6 days ago

Removing the 2025 label. @rjelliffe, if you could see your way to replacing all the instances here of ***@***.*** with what you intended before the next edition is in preparation, it would be appreciated.