Open ndw opened 9 years ago
The following declaration is for a generalized validation step. Much of what follows describes how this step applies to XML validation, but the actual validation performed is implementation-defined. Validating JSON data against a JSON Schema would be entirely plausible.
<p:declare-step type="p:validate">
<p:input port="source" primary="true"
content-types="application/octet-stream"/>
<p:input port="schema" sequence="true"
content-types="application/octet-stream"/>
<p:input port="models" sequence="true"
content-types="application/xml */*+xml text/*"/>
<p:output port="result" primary="true" sequence="true"/>
<p:output port="report" sequence="true"/>
<p:output port="validation-attempted" sequence="true"/>
<p:option name="assert-valid" select="'true'" as="xs:boolean"/>
<p:option name="group" select="''" as="xs:string"/>
<p:option name="phase" select="''" as="xs:string"/>
<p:option name="version" as="xs:string"/>
<p:option name="parameters" as="map(xs:QName,item())"/>
</p:declare-step>
The semantics of the p:validate step are that the source document is validated in an implementation-defined way. The schema and models ports exist only to provide suggestions to the implementation.
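As a rough illustration of how the proposed step might be invoked, the sketch below assumes XProc 3.0 syntax and a hypothetical RELAX NG schema file named schema.rng. It also assumes that the application/octet-stream content type in the declaration above is meant to admit any document. The step exists only as a proposal here, and an implementation would remain free to ignore the schema hint.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result" sequence="true"/>

  <!-- p:validate is the step proposed in this issue, not an existing library step -->
  <p:validate assert-valid="false">
    <!-- schema.rng is a hypothetical file name; the port only carries a suggestion -->
    <p:with-input port="schema" href="schema.rng"/>
    <!-- no model documents are supplied in this sketch -->
    <p:with-input port="models">
      <p:empty/>
    </p:with-input>
  </p:validate>
</p:declare-step>
```

With assert-valid set to false, the pipeline would pass the document through whether or not it is valid, leaving the report and validation-attempted ports available for inspection.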
There are several possible outputs:
- If the assert-valid option is false and validation failed, then the original document is returned on the result port.
- If validation succeeded, the document is returned on the result port. In this case, the validation-attempted port should document the validation or validations that were attempted.
- If validation failed and the assert-valid option is true, then nothing appears on the output port and an error is raised.

ISSUE: In the case where this error is caught by p:catch, (how) can the validation-attempted and report ports be read?
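The open question above can be made concrete with a sketch. Assuming XProc 3.0 syntax, a p:catch can read its own error port, but the output ports of the failed p:validate step are not in scope, so it is unclear where the report would come from. The names v and caught are illustrative, and both hint ports are left empty for brevity.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result" sequence="true"/>

  <p:try>
    <!-- proposed step; raises an error when the document is invalid -->
    <p:validate name="v" assert-valid="true">
      <p:with-input port="schema"><p:empty/></p:with-input>
      <p:with-input port="models"><p:empty/></p:with-input>
    </p:validate>
    <p:catch name="caught">
      <!-- only the error port of the catch is readable here;
           report@v and validation-attempted@v are out of reach -->
      <p:identity>
        <p:with-input pipe="error@caught"/>
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>
```

One way to resolve the issue would be to require that the report documents be carried on the error port, but that is a design choice this sketch does not settle.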
The output on the report port depends on the validation attempted. For Schematron validation, a report format is defined. For other kinds of validation, the report is implementation-defined.
Although the step is for generalized validation, it does have a couple of options designed to support a specific XML scenario: the XML Model Processing Instruction. In the absence of other information, implementations should use the XML Model PI to determine what kind of validation to perform on XML documents.
The group and phase options provide the corresponding values as discussed in the XML Model PI spec.
The models input port and the validation-attempted output port use XML documents to describe the desired validation in the former case and the validations attempted in the latter. The following c:model element definition should be supported.
<c:model
href? = anyURI
type? = string
schematypens? = anyURI
charset? = string
title? = string
group? = string
phase? = string
/>
Additional variations on c:model are allowed, as are entirely different vocabulary elements as appropriate.
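For illustration, a c:model document that mirrors an xml-model processing instruction for a RELAX NG schema might look like the following. The file name schema.rng is hypothetical, and the c prefix is assumed to be bound to the usual XProc step vocabulary namespace, which the proposal does not state explicitly.

```xml
<!-- Equivalent PI in a document prolog:
     <?xml-model href="schema.rng" type="application/xml"
                 schematypens="http://relaxng.org/ns/structure/1.0"?> -->
<c:model xmlns:c="http://www.w3.org/ns/xproc-step"
         href="schema.rng"
         type="application/xml"
         schematypens="http://relaxng.org/ns/structure/1.0"/>
```

The same element could presumably appear on the validation-attempted port to record that this schema was the one actually applied.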
When RELAX NG validation is selected, the following parameters should be recognized: dtd-attribute-values and dtd-id-idref-warnings.
When XML Schema validation is selected, the following parameters should be recognized: use-location-hints, try-namespaces, and mode.
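A sketch of how these parameters might be passed, assuming XProc 3.0 syntax and a hypothetical schema file schema.xsd. The values shown (booleans, and 'strict' for mode) are assumptions borrowed from the corresponding options of p:validate-with-xml-schema; the proposal itself does not specify value types.

```xml
<p:validate xmlns:p="http://www.w3.org/ns/xproc">
  <!-- schema.xsd is a hypothetical file name -->
  <p:with-input port="schema" href="schema.xsd"/>
  <p:with-input port="models"><p:empty/></p:with-input>
  <!-- keys are xs:QName values, matching map(xs:QName,item()) in the declaration -->
  <p:with-option name="parameters"
                 select="map {
                           QName('', 'use-location-hints'): false(),
                           QName('', 'try-namespaces'): false(),
                           QName('', 'mode'): 'strict'
                         }"/>
</p:validate>
```

Parameters that do not apply to the validation actually selected would presumably be ignored, though the proposal leaves that open.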
If an NVDL schema appears on the models port, NVDL validation should be attempted.
This was discussed at the 25 Feb 2015 meeting, http://www.w3.org/XML/XProc/2015/02/25-minutes (the issue, that is, not the proposal)
An output port with basic information about the validation when assert-valid="false" would be useful, such as the total number of assertions and the number of assertions that failed, were skipped, raised warnings, or succeeded. Currently a Schematron validation succeeds if count(//svrl:failed-assert) + count(//svrl:successful-report) = 0, and this XPath is different for other kinds of validation. Some metadata about the validation, when available, might also be useful to include in such a document, such as a name (/sch:schema/sch:title) and the base URI of the source document. Maybe just something like:
<c:result name="Test Name" tests="18" skipped="7" errors="5" warnings="3"/>
From Gerrit Imsieke:
Others have already asked for unified report ports for the validation steps p:validate-with-relax-ng and p:validate-with-xml-schema. While we see that it might not be easy to change the signature of the existing standard library steps, here’s a fresh approach that also saves a lot of verbosity.
It builds upon the xml-model processing instruction that may be prepended to an XML document (http://www.w3.org/TR/xml-model/).
We could either add a step p:validate-according-to-xml-models that executes each validation and creates a sequence of c:errors and svrl:schematron-output documents on the report port.
But as we strive for terseness of expression, we may add an attribute use-xml-models="assert-valid|report-only|none" to input and output ports. (p:input: both declarations and connections, where the attribute value on connections has precedence).
If the attribute is on a step’s input port and its value is 'report-only', it will add a port 'report' (sequence=true) to the readable ports within the step. Alternatively, the port could be named 'error', to avoid an additional name for ports that magically spring into existence.
If the attribute value is 'assert-valid' and if the step is within a p:try/p:group, it will add these report documents to the error port of subsequent p:catch instructions.
This will greatly reduce verbosity by eliminating the need to spell out input/output validation steps explicitly. It is syntactic sugar that may be expanded to long-form explicit validation instructions (by means of XSLT transformation, for example).
If there are no xml-model PIs, no validation will occur.
xml-model-based validation should support RELAX NG, RELAX NG compact syntax, XSD in different versions, ISO Schematron, NVDL, and DTD.
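To make the shorthand concrete, here is a hedged sketch of what a declaration using the proposed attribute might look like in XProc 3.0 syntax. The use-xml-models attribute is only proposed in this comment, and the way the implicit report port is addressed (report@main) is an assumption made for illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0" name="main">
  <!-- use-xml-models is the attribute proposed in this comment, not part of XProc -->
  <p:input port="source" use-xml-models="report-only"/>
  <p:output port="result" primary="true"/>
  <!-- assumes the implicit port is named report and is readable as report@main -->
  <p:output port="validation-report" sequence="true" pipe="report@main"/>

  <!-- the document passes through unchanged; validation happens implicitly on input -->
  <p:identity/>
</p:declare-step>
```

In long form, the same pipeline would need an explicit validation step in front of p:identity, with its report output wired to validation-report by hand.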
Because prepending xml-model PIs to documents is a bit cumbersome, there should be an optional step p:prepend-xml-model like this: