Closed AndrewSales closed 8 months ago
The subject is conceptually and operationally distinct from the context.
Yes, in trivial cases, which are most cases, they will be the same. When you don't need the subject, don't use it: it is a zero-cost abstraction for users.
Take this example:

`<sch:rule context="table/row"> <sch:assert test="following-sibling::row or not(following-sibling::*)" subject="parent::table">A table should only contain rows after the first row.</sch:assert> </sch:rule>`

In this case, the assertion is stated in terms of tables. But the implementation in XPath was done, for convenience, using rows as the context. So the @subject attribute is provided to allow the implementation to return (in the SVRL location) an object that matches the text. This is because, in all cases, the text comes first in Schematron; or, at least, we want the deviser of the schema to be able to decide the text that is most meaningful to the users, to decide the object being located in the SVRL that is most meaningful for the user's systems, and to decide the XPath to implement those two things in a way that is most convenient for the developer.
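To make the effect concrete, here is a hedged sketch of a hypothetical input and the kind of SVRL an implementation might emit for the rule above. The `note` element and the `location` values are assumptions for illustration only; the exact path syntax written to `location` differs between implementations.

```xml
<!-- Hypothetical input: a note follows the only row, so the assertion fails on that row. -->
<table>
  <row/>
  <note/>
</table>

<!-- Illustrative SVRL with subject="parent::table": the location names the table. -->
<svrl:failed-assert xmlns:svrl="http://purl.oclc.org/dsdl/svrl"
    test="following-sibling::row or not(following-sibling::*)"
    location="/table[1]">
  <svrl:text>A table should only contain rows after the first row.</svrl:text>
</svrl:failed-assert>
<!-- Without @subject, the location would instead name the context node, e.g. /table[1]/row[1]. -->
```

The message text talks about the table, and with @subject the reported location does too.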
Consider the well-known flaw of DTDs (and XSD and RELAX NG) that a broken content model is reported with information about where the brokenness was detected, not necessarily at the point that the problem actually occurred.
Now without the @subject we force the developer to write this:
`<sch:rule context="table"> <sch:assert test="count(*[self::row or preceding-sibling::row]) = count(row)">A table should only contain rows after the first row. ...`
Now that is tolerable if the developer was lucky enough to have written it that way in the first place. Otherwise it adds an extra burden on them.
Let's contrast this with DTDs: in DTDs (and XSD and RELAX NG) the point where an error is detected may bear no resemblance to the point where the error occurred. That is useless and frustrating for users, and it is one of the long-running problems with grammar-based validation.
For example, take a content model for picture that says `(thumbnail, para+) | (para+, figure)`. The DTD validator will, if it finds a sequence [thumbnail, para, figure], complain that the figure is unexpected. But what if our assertion is this:
`<sch:rule context="picture"> <sch:report test="figure and thumbnail">A thumbnail is not needed when a picture has a figure.</sch:report> </sch:rule>`
In this case, the XPaths are very simple. But the rug does not match the curtain. So just as the DTD would fail providing the bad location of the figure, so that assertion would fail providing the bad location of the picture. By adding subject="thumbnail" to the sch:report, the SVRL can clearly point to the object that the developer wants the SVRL to locate, without having to recode all the other XPaths in potentially complex ways.
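A minimal sketch of that report with the attribute added, assuming the attribute meant is Schematron's `subject` and that the `sch` prefix is bound to the Schematron namespace:

```xml
<sch:rule context="picture">
  <!-- The context stays on picture; subject redirects the reported location to the thumbnail. -->
  <sch:report test="figure and thumbnail" subject="thumbnail">
    A thumbnail is not needed when a picture has a figure.
  </sch:report>
</sch:rule>
```

The test itself is untouched; only the node reported in the SVRL changes.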
The larger, deeper and more complex a document is, and the longer the rules are and the more complex the assertions are, the more chance there is that the @context is not information that is directly useful as a location in the SVRL. For example, take this:
`<sch:rule context="endnote"> <sch:p>Here are all the constraints on endnotes</sch:p> ... <sch:report test=".//figure[not(caption)][1]" subject="(.//figure[not(caption)])[1]">In an endnote, all figures in a chapter should have a caption.</sch:report> </sch:rule>`
So in this case, the schema developer has chosen that they only want to report the first of this error (which, for example, writers of gateway/firewall validators do to avoid unnecessary tests), and they want to group all the assertions relating to endnotes together into one rule (for whatever reason: their choice). But they want the SVRL to locate the offending figure directly, not just be told that there is something somewhere awry in the endnote.
I have found @subject only useful in quite complex schemas (from memory where I was validating an input document against an output document following a complex transformation that did a lot of re-structuring), but where it was useful it was very useful indeed.
Regards Rick
On Sat, Apr 22, 2023 at 11:17 PM Andrew Sales wrote:
We do not understand why we need both "rule context" and "subject". Are there any subjects that are not rule contexts? (See comments on 5.5.14.) If so, we propose to replace "subject" by "node" (a term from XPath).
Added a note to the definition of subject referring to the subject attribute and its usage.