Open Arithmeticus opened 9 months ago
(a) Yes, I think it would be useful.
(b) I think my preference would be to use `fn:serialize` with `method="canonical-xml"`. An alternative is to use `method="xml" canonical="yes"`, but this has the disadvantage that there are many interactions with other serialization options, e.g. `indent`, `cdata-section-elements`, and `omit-xml-declaration`.
Note, if you want to experiment, Saxon already offers `method="xml" saxon:canonical="yes"`: see https://www.saxonica.com/documentation12/index.html#!extensions/output-extras/serialization-parameters. I tested this against the canonicalizer offered by XOM.
(c) There are certainly users who would want XML Signature for document signing, rather than just canonicalisation.
I also think this would be useful.
> (b) I think my preference would be to use `fn:serialize` with `method="canonical-xml"`. An alternative is to use `method="xml" canonical="yes"`, but this has the disadvantage that there are many interactions with other serialization options, e.g. `indent`, `cdata-section-elements`, and `omit-xml-declaration`.
+1 for adding a custom method, and we should raise an error if the input is not a single node.
In thinking about discussion that might happen on this issue at today's CG meeting, I noted to myself:
- The options of `fn:serialize` express parameters one might like to adjust in canonical serialization.
- The options of `fn:deep-equal` express even more parameters one might like to adjust in canonical serialization, including namespace prefixes and characters.
- `undeclare-prefixes` versus `namespace-prefixes`.
- `namespace-prefixes` in the map options for `fn:deep-equal` runs against CX1.1's resolution.
Ideas for a way forward to avoid repetition in the specs:
- Identify the options of `fn:deep-equal` and `fn:serialize` that should apply to canonical serialization.
- Where `fn:deep-equal` and `fn:serialize` conflict, let the latter prevail (in XQFO 3.1 `fn:deep-equal` lacks an options map).
I am uncertain what edits might be needed to the serialization specs, particularly for those options in `fn:deep-equal` that have no `fn:serialize` counterpart, e.g., schema-aware adjustments, processing-instructions. It may turn out we wish not to include some of these in a common options map. Let's discuss.
I would be happy with either:
- `method=xml-c14n`
- `method=xml canonical=yes`
As regards whether to add `canonical` or not, I guess we should ask whether this would be relevant to other methods such as HTML, JSON, or perhaps any future method we might envisage (YAML, or CSV anyone?).
This issue picks up suggestions from #779 regarding canonical serialization, and solicits input from the community group on whether such a function is desirable, and what it might look like.
In the context of #779, the idea was that two XML documents with different physical representations, but semantically equivalent, could be serialized to a canonical form, with a hash value computed for each to confirm identity. Of course, with canonical serialization, a simple string comparison would be sufficient, without any hashing.
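As a rough illustration of that idea outside XPath, the following Python sketch canonicalizes two physically different but equivalent documents and compares them both as strings and by hash. It relies on the standard library's `xml.etree.ElementTree.canonicalize`, which implements C14N 2.0, not CX1.1, so treat it as an analogy only. The helper name `canonical_digest` and the sample documents are hypothetical.

```python
import hashlib
from xml.etree.ElementTree import canonicalize


def canonical_digest(xml_text: str) -> str:
    """Return the SHA-256 hex digest of the canonical form of xml_text."""
    canonical = canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, different physical representation: XML declaration,
# attribute order, quote style, and empty-element syntax all differ.
doc_a = '<?xml version="1.0"?><root b="2" a=\'1\'><child></child></root>'
doc_b = '<root a="1" b="2"><child/></root>'
# A document with a genuinely different attribute value.
doc_c = '<root a="1" b="3"><child/></root>'

same_text = canonicalize(doc_a) == canonicalize(doc_b)
same_hash = canonical_digest(doc_a) == canonical_digest(doc_b)
differs = canonical_digest(doc_a) != canonical_digest(doc_c)

print(same_text, same_hash, differs)
```

The string comparison alone already establishes equivalence. The hash is only worthwhile when the canonical form is too large to store or transmit.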
XML Signature was suggested as one approach, with some hesitation. I would like to suggest, instead, that we look to implement Canonical XML Version 1.1 (herein CX1.1), perhaps with map options that calibrate how CX1.1 is implemented. I have no experience using CX1.1, so user input is welcome.
Another point of discussion is whether this merits a new function, e.g., `fn:canonical-serialize`, or should be built upon `fn:serialize`. A problem with the latter option is that such an approach makes no sense without the `method` option specified as `xml`. Another approach would be to go deeper, into the serialization spec, and expand the `xml` method to include a canonical option.
I believe that this function would be extremely useful. When preparing test suites, output could be saved as secondary documents in canonical XML, any subsequent regression tests could convert their comparanda to canonical XML, and very precise node-wise comparisons could be made.
I look forward to everyone's input.