Open lweberk opened 7 years ago
@lweberk sorry for the late response; what is c14n? I don't think I heard about anything like this with regard to XML.
Canonicalization: a canonical representation in memory, necessary for things like digitally signing and verifying XML documents.
https://www.w3.org/TR/xml-c14n/ (c14n) https://www.w3.org/TR/xml-exc-c14n/ (c14n-exc) https://www.w3.org/TR/xmldsig-core/ (dsig)
Looks interesting, but I'm afraid this is a bit out of scope of the XML parser. I may consider supporting it (although it is not clear to me yet from skimming these specs whether it really requires support from the parser or if it could be done as postprocessing), but definitely not in the nearest future.
Agreed. It should be post-processing, maybe in another crate hooked in as an optional feature. I can imagine it's a question of taking the inner representation and generating a signable/verifiable buffer to operate on. As such, it might be possible to stay non-invasive all the way through, creating only a one-way dependency on xml-rs.
I'd like to take a dig at it, but I'm also a little afraid of the scope of the task and my lack of experience in both XML and Rust as well as my current time constraints.
I'll try to put time into this, no promises on when and whether it will birth something. So if anyone wants to take this up please drop a line in this issue to query on the status and to potentially coordinate efforts.
Was any progress made on this?
Unfortunately not, sorry.
Cool. We might produce funding for this eventually, but not sure yet.
For now, it seems rust-libxml provides c14n:
https://github.com/KWARC/rust-libxml
As in src/bindings.rs:
pub fn xmlC14NExecute
Those are generated bindings from the libxml2 headers. I'm just about to build bindings for xmlsec, which will require proper wrapping of canonicalization in rust-libxml. I'll post progress here once it hits upstream.
Sorry for the delay in response - unfortunately, for quite a long time I haven't had much capacity to work on personal projects :(
My understanding is that canonicalization is computing some well-defined representative of an equivalency class of a particular XML document, so basically it can be implemented as an option on an XML writer mechanism which ensures that the produced result is "canonical".
This does sound like a feature for this library, but unfortunately, as I said, I don't really have capacity to work on it right now. Whatever little time I have to work on xml-rs, I want to invest first into reimplementing the parser for it to be more performant and more in line with the modern Rust API practices (the parser-rearchitecture branch). But if someone is willing to contribute an implementation of c14n here, I will be very glad to review and accept it.
Bad news, but good for me. Xmlsec handles the buffer creation and canonicalization into it opaquely. Which means I can avoid having to wrap c14n directly in libxml. For those depending on it directly, my apologies.
@lweberk it seems discussion should, therefore, continue in an xmlsec wrapping project in rust. Kindly provide the link so we can step over there? I have some questions regarding wrapping xmlsec, mostly that it has too many dependencies which might make the resulting rust less portable. But let's discuss that in a separate project.
@netvl to put the c14n in context, I think most people use it to verify xmldsig (XML Digital Signature).
In that context, the digest of the XML has to be computed over the canonicalised form. It doesn't, however, require the XML files to be written in that form, only that the digest be calculated over that form. That means the full canonicalised XML can exist in memory only, both when making and when verifying signatures. (If your digest code is smart, it might not even need to exist in memory fully, only a part at a time.) In practice, canonicalised XML is perhaps better not written out anyway, since people dislike tools messing with their way of formatting XML. Therefore I'm not sure it is best as "an option on an XML writer mechanism".
Also, signatures are verified more often than created. For example, a music app might want to verify that the album information it received is properly signed by the publisher, without ever signing an album itself. Hence it makes a good deal of sense to have this feature in an XML library designed for read-only situations.
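To illustrate the point about the canonical form never needing to exist fully in memory, here is a minimal std-only sketch. Everything in it is an assumption made for illustration: `write_canonical` is a hypothetical serializer that emits canonical bytes piece by piece, and `Fnv1a` is a non-cryptographic stand-in for a real digest such as SHA-256 (xmldsig would need a real one). The only idea shown is that the digest can be an `io::Write` sink, so the serializer feeds it directly.

```rust
use std::io::{self, Write};

/// FNV-1a (64-bit). NOT cryptographic; a stand-in for a real digest
/// so that this sketch needs nothing beyond the standard library.
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV offset basis
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

// Making the digest a `Write` sink lets any serializer stream into it.
impl Write for Fnv1a {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        for &b in buf {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV prime
        }
        Ok(buf.len())
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Hypothetical canonical serializer: emits the canonical bytes in
/// pieces to whatever sink it is given.
fn write_canonical<W: Write>(sink: &mut W) -> io::Result<()> {
    sink.write_all(b"<album>")?;
    sink.write_all(b"<title>Example</title>")?;
    sink.write_all(b"</album>")?;
    Ok(())
}

/// Digest computed while serializing; no full buffer is ever built.
fn digest_streamed() -> u64 {
    let mut d = Fnv1a::new();
    write_canonical(&mut d).unwrap();
    d.finish()
}

/// Same digest computed the naive way, over a complete in-memory buffer.
fn digest_buffered() -> u64 {
    let mut buf = Vec::new();
    write_canonical(&mut buf).unwrap();
    let mut d = Fnv1a::new();
    d.write_all(&buf).unwrap();
    d.finish()
}

fn main() {
    assert_eq!(digest_streamed(), digest_buffered());
    println!("{:016x}", digest_streamed());
}
```

Both paths give the same value, which is why a read-only library could in principle verify a signature without ever writing or buffering the canonical document.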
@colourful-land ~I'm currently writing tests, examples and debugging/leaktesting the wrapper. I pushed a skeleton in the meantime so we have a place for discussion. Code will follow within a week once some degree of workability is guaranteed. If you want to have a look at the WIP, open an issue over there.~ Link: xmlsec
@netvl I'll leave it up to you to leave the issue open or close for indefinite deferral. For me it is closed, since we are falling back to libxml2/xmlsec.
I have a case where there is no libxml available. For reasonable XML that you would want to sign, c14n is not that difficult (e.g. just remove all the parts that make C14N a headache; strongly opinionated comment, of course).
In my case, the XML I want to sign is only missing the correct attribute order to be canonicalized (with the default C14N parameters). And the attribute order is handled by this library. Could we implement some version of e.g. this for starters?
EDIT: Also, allow passing an option so that empty elements are serialized as start-end tag pairs, i.e. <a></a> instead of <a />
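For reference, the two pieces asked for here (attribute ordering and start-end tag pairs) could look roughly like the std-only sketch below. The `Attr` type and `canonical_empty_element` function are hypothetical, not part of xml-rs. The ordering follows my reading of the C14N spec: attributes sort by namespace URI first and local name second, so attributes without a namespace come first. Namespace declarations, which C14N emits before regular attributes, are deliberately left out.

```rust
/// Hypothetical in-memory attribute model (not an xml-rs type).
struct Attr<'a> {
    prefix: &'a str, // empty = no prefix
    ns_uri: &'a str, // empty = no namespace
    local: &'a str,
    value: &'a str,
}

/// Escape an attribute value the way C14N prescribes.
fn escape_attr(v: &str) -> String {
    let mut out = String::with_capacity(v.len());
    for c in v.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '"' => out.push_str("&quot;"),
            '\t' => out.push_str("&#x9;"),
            '\n' => out.push_str("&#xA;"),
            '\r' => out.push_str("&#xD;"),
            _ => out.push(c),
        }
    }
    out
}

/// Serialize an element with no children: attributes sorted by
/// (namespace URI, local name), and always a start-end tag pair.
fn canonical_empty_element(name: &str, attrs: &[Attr]) -> String {
    let mut sorted: Vec<&Attr> = attrs.iter().collect();
    sorted.sort_by(|a, b| (a.ns_uri, a.local).cmp(&(b.ns_uri, b.local)));

    let mut out = format!("<{}", name);
    for a in sorted {
        let qname = if a.prefix.is_empty() {
            a.local.to_string()
        } else {
            format!("{}:{}", a.prefix, a.local)
        };
        out.push_str(&format!(" {}=\"{}\"", qname, escape_attr(a.value)));
    }
    // Never the `<a />` short form.
    out.push_str(&format!("></{}>", name));
    out
}

fn main() {
    let attrs = [
        Attr { prefix: "b", ns_uri: "http://example.org/b", local: "id", value: "1" },
        Attr { prefix: "", ns_uri: "", local: "z", value: "a<b" },
        Attr { prefix: "", ns_uri: "", local: "a", value: "x" },
    ];
    // prints: <e a="x" z="a&lt;b" b:id="1"></e>
    println!("{}", canonical_empty_element("e", &attrs));
}
```

This covers only the "reasonable XML" subset described above; full C14N also has rules for namespace declarations, text nodes, comments and processing instructions.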
My understanding is that https://docs.rs/xml-rs/0.8.4/xml/struct.EmitterConfig.html#structfield.normalize_empty_elements should cover that last bit.
@yaleman thanks; your suggestion does give the desired output in my case (although it is not clear from the syntax whether the input gets serialized as <a></a> when it already has <a />).
For the sorted attributes part, I am now using xmltree-rs with the "attribute-sorted" feature, which works for my use case.
I'm not planning to support canonicalization directly in the library. I hope it can be implemented by others just by using the public API. If this can't be done due to some missing features or config options, please file bugs with specific requests.
In case anyone else needs a quick&dirty, stupid simple canonicalization function, I wrote a safe wrapper for the libxml functions.
https://crates.io/crates/xml_c14n
Personally I don't use c14n, so I don't plan to work on it. However, I'm happy to receive PRs moving the library towards it.
Hi,
just wanted to check whether there were plans and/or interest in this.
~lwk