XML-Security / signxml

Python XML Signature and XAdES library
https://xml-security.github.io/signxml/
Apache License 2.0

Unable to sign/verify signed SOAP documents #133

Open Helbrax opened 5 years ago

Helbrax commented 5 years ago

We have a Java and a C# application that sign XML documents for inter-process communication and also send the documents outside our network to a third-party vendor. I am trying to reproduce the same functionality in Python, with limited success.

Test xml

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <tns:Body id="Body" xmlns:tns="http://schemas.xmlsoap.org/soap/envelope/">test</tns:Body>
</soap:Envelope>

Both the Java and .NET versions of the code behave in a similar manner (pseudocode below):

xml = loadxml("file.xml")
soap-header =  new element("Header", "http://schemas.xmlsoap.org/soap/envelope/")
soap-signature = new element("Signature", "http://schemas.xmlsoap.org/soap/security/2000-12")
soap-header.append(soap-signature)
xml.insert(0, soap-header)
signer = new signer(soap-header) #IMPORTANT
signer.addreference("#Body")
signer.canonmethod = "http://www.w3.org/TR/2001/REC-xml-c14n-20010315"
signer.digest = SHA1
signer.signature = RSA-SHA1
signature = signer.sign(x509cert, password)
soap-signature.append(signature)

The document now looks like this (without the Body/Envelope tags):

    <SOAP:Header>
        <SOAP-SEC:Signature>
            <ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
                <ds:SignedInfo>
                    <ds:CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315" />
                    <ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1" />
                    <ds:Reference URI="#Body">
                        <ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
                        <ds:DigestValue>digest is here</ds:DigestValue>
                    </ds:Reference>
                </ds:SignedInfo>
                <ds:SignatureValue>base64 sig here</ds:SignatureValue>
                <ds:KeyInfo>
                    <ds:X509Data>
                        <ds:X509IssuerSerial>
                            <ds:X509IssuerName>cert name</ds:X509IssuerName>
                            <ds:X509SerialNumber>cert serial</ds:X509SerialNumber>
                        </ds:X509IssuerSerial>
                    </ds:X509Data>
                </ds:KeyInfo>
            </ds:Signature>
        </SOAP-SEC:Signature>
    </SOAP:Header>

A SOAP header/signature is added to the document, the body is signed, and the resulting signature is added to the document. Java, C#, and our vendor's system all agree on the signatures, and all validate successfully. Nothing I sign in Python validates in C#/Java, and nothing I sign in C#/Java validates in Python.

I think the core of the issue is that the signer in C#/Java takes a signing context (the SOAP header instead of the root of the document). Using new signer(soap-header) instead of new signer(xml) changes both the body digest and the signature, because the canonical form changes due to the context scoping. I am unclear whether this is vendor-specific or a requirement of the SOAP security extension, as it relates to this line in the specification:

Note that XML Canonicalization [XML-C14N] of and other signed resources MUST each be done within its own context. This means, among other things, that the Canonical form [XML-C14N] of always inherits the namespace declarations for SOAP-ENV and SOAP-SEC

Unless I am just missing it or have a fundamental misunderstanding of the spec (probably the latter), I need to provide a context to the signature, and I see no way to do that in the library.

Here is my python code:

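from xml.etree import ElementTree

import signxml
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.serialization import pkcs12

cert_bytes = open("cert.pfx", "rb").read()
(private_key, certificate, additional_certificates) = pkcs12.load_key_and_certificates(data=cert_bytes, password="password".encode(), backend=default_backend())
data = open("file.xml", encoding='UTF-8').read()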
root = ElementTree.fromstring(data)
soap_header = ElementTree.Element("SOAP:Header", attrib={"xmlns:SOAP": "http://schemas.xmlsoap.org/soap/envelope/", "xmlns:SOAP-SEC": "http://schemas.xmlsoap.org/soap/security/2000-12"})
soap_security = ElementTree.SubElement(soap_header, "SOAP-SEC:Signature")
root.insert(0, soap_header)
body = root.find("{http://schemas.xmlsoap.org/soap/envelope/}Body")
signer = signxml.XMLSigner(method=signxml.methods.detached,
                           signature_algorithm='rsa-sha1',
                           digest_algorithm='sha1',
                           c14n_algorithm='http://www.w3.org/TR/2001/REC-xml-c14n-20010315')
signed_info = signer.sign(data=body,
                          key=private_key,
                          cert=[certificate],
                          reference_uri="#Body")
signed_info_xml = (ElementTree.tostring(signed_info).decode())
signature = ElementTree.fromstring(signed_info_xml)
soap_security.append(signature)
ElementTree.ElementTree(root).write(file_or_filename="signed.xml", encoding="utf-8", method="xml")
signed_document = ElementTree.fromstring(open("signed.xml").read())
# ds:Signature is nested under Header/SOAP-SEC:Signature, so search all descendants.
signature = signed_document.find('.//{http://www.w3.org/2000/09/xmldsig#}Signature')
verify_cert = open("publickey.pem").read()
verified_data = signxml.XMLVerifier().verify(x509_cert=verify_cert, data=signature).signed_xml

kislyuk commented 5 years ago

Is it possible to extract the canonicalization (string to sign) from the C# or Java implementation? I'd like to help, but I can't really help you without having a test case here, including an example document, cert, key, and canonicalization/string to sign.

Helbrax commented 5 years ago

The test.xml example above is what I was using as my test case. Any X.509 cert should work, as the issue doesn't appear to be related to the certs themselves, since Java, C#, and the vendor all agree on the signature. I'll attempt to step into the C# code to see what the canonical body looks like. Unfortunately, I can only view the Java code and can't edit it.

Helbrax commented 5 years ago

Using the test XML provided, this is what I get from the C# debug log:

System.Security.Cryptography.Xml.SignedXml Information: 3 : [SignedXml#02aeb54d, BeginSignatureComputation] Beginning signature computation.
System.Security.Cryptography.Xml.SignedXml Verbose: 3 : [SignedXml#02aeb54d, BeginSignatureComputation] Using context: <SOAP:Header xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-SEC="http://schemas.xmlsoap.org/soap/security/2000-12"><SOAP-SEC:Signature /></SOAP:Header>
System.Security.Cryptography.Xml.SignedXml Verbose: 11 : [SignedXml#02aeb54d, SigningReference] Hashing reference Reference#00245fb7, Uri "#Body", Id "", Type "" with hash algorithm "http://www.w3.org/2000/09/xmldsig#sha1" (SHA1CryptoServiceProvider).
System.Security.Cryptography.Xml.SignedXml Verbose: 8 : [Reference#00245fb7, ReferenceData] Transformed reference contents: <tns:Body xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:tns="http://schemas.xmlsoap.org/soap/envelope/" id="Body">test</tns:Body>
System.Security.Cryptography.Xml.SignedXml Information: 7 : [SignedXml#02aeb54d, NamespacePropagation] Propagating namespace xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/".
System.Security.Cryptography.Xml.SignedXml Information: 7 : [SignedXml#02aeb54d, NamespacePropagation] Propagating namespace xmlns:SOAP-SEC="http://schemas.xmlsoap.org/soap/security/2000-12".
System.Security.Cryptography.Xml.SignedXml Information: 7 : [SignedXml#02aeb54d, NamespacePropagation] Propagating namespace xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/".
System.Security.Cryptography.Xml.SignedXml Information: 0 : [SignedXml#02aeb54d, BeginCanonicalization] Beginning canonicalization using "http://www.w3.org/TR/2001/REC-xml-c14n-20010315#WithComments" (XmlDsigC14NWithCommentsTransform).
System.Security.Cryptography.Xml.SignedXml Verbose: 0 : [SignedXml#02aeb54d, BeginCanonicalization] Canonicalization transform is using resolver System.Xml.XmlSecureResolver and base URI "".
System.Security.Cryptography.Xml.SignedXml Verbose: 5 : [SignedXml#02aeb54d, CanonicalizedData] Output of canonicalization transform: <SignedInfo xmlns="http://www.w3.org/2000/09/xmldsig#" xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-SEC="http://schemas.xmlsoap.org/soap/security/2000-12" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"><CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315#WithComments"></CanonicalizationMethod><SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"></SignatureMethod><Reference URI="#Body"><DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"></DigestMethod><DigestValue>S69aGcA6LmGyg7mw//69y07fUSg=</DigestValue></Reference></SignedInfo>

If I run this in Python, I get the same digest as C#:

from base64 import b64encode
from hashlib import sha1

foo = """<tns:Body xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:tns="http://schemas.xmlsoap.org/soap/envelope/" id="Body">test</tns:Body>"""
body_hasher = sha1()
body_hasher.update(foo.encode('UTF-8'))
digest = b64encode(body_hasher.digest()).decode()

print(digest)

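If I run this code in Python, I get the same signature value as C#:

from base64 import b64encode
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.primitives.serialization import pkcs12
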
foo = """<SignedInfo xmlns="http://www.w3.org/2000/09/xmldsig#" xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-SEC="http://schemas.xmlsoap.org/soap/security/2000-12" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"><CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315#WithComments"></CanonicalizationMethod><SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"></SignatureMethod><Reference URI="#Body"><DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"></DigestMethod><DigestValue>S69aGcA6LmGyg7mw//69y07fUSg=</DigestValue></Reference></SignedInfo>"""
with open("cert.pfx", "rb") as cert_file:
    cert_bytes = cert_file.read()
(private_key, certificate, additional_certificates) = pkcs12.load_key_and_certificates(data=cert_bytes, password="supersecretpassword".encode(), backend=default_backend())
sig = private_key.sign(data=foo.encode('UTF-8'), padding=padding.PKCS1v15(), algorithm=hashes.SHA1())

print(b64encode(sig))

Helbrax commented 5 years ago

I think the issue has to do with namespace propagation. I may be inspecting the wrong part of the code in the signxml package, but it looks like the duplicate namespaces are not included in the Python version of the canonicalization, while they are in the .NET/Java version.

<ns0:Body xmlns:ns0="http://schemas.xmlsoap.org/soap/envelope/" id="Body">test</ns0:Body>

vs

<tns:Body xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:tns="http://schemas.xmlsoap.org/soap/envelope/" id="Body">test</tns:Body>
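
A small sketch of one way the first form can arise, assuming the element passed through Python's standard-library xml.etree.ElementTree (as in the Python code earlier in this thread). ElementTree keeps only "{uri}local" tag names and generates fresh prefixes on output, so the original soap/tns prefixes cannot survive a round trip. This only illustrates that behaviour; it does not establish that it is the sole cause of the mismatch.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
DOC = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">\n'
    '    <tns:Body id="Body" xmlns:tns="http://schemas.xmlsoap.org/soap/envelope/">test</tns:Body>\n'
    '</soap:Envelope>'
)

root = ET.fromstring(DOC)
body = root.find(f"{{{SOAP_NS}}}Body")

# ElementTree stores tags as "{uri}local", so the soap/tns prefixes are already
# gone here; the serializer invents a prefix of its own when writing the element.
serialized = ET.tostring(body, encoding="unicode").strip()
print(serialized)
```

Parsing with lxml instead (as the later examples in this thread do) keeps the original prefixes and namespace declarations on the tree.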

The Envelope root node of the document declares a namespace prefix for SOAP, and the Body shares the same namespace under a different prefix. In .NET/Java, both are propagated down. In Python, only one is. I can see in the specification that

Superfluous namespace declarations are removed from each element.

However, as far as I can tell, this only applies to default namespaces or to namespaces that share the same prefix. It appears that identical namespaces with differing prefixes are not considered superfluous.
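
As a rough illustration of that reading, the sketch below lists the prefix declarations that are in scope at the Body element of the test document, using only the standard library. It does not perform canonicalization; it only shows that soap and tns are distinct in-scope prefixes bound to the same URI, which is consistent with both appearing in the C# output above. The helper name in_scope_prefixes is made up for this example.

```python
import io
import xml.etree.ElementTree as ET

DOC = b"""<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <tns:Body id="Body" xmlns:tns="http://schemas.xmlsoap.org/soap/envelope/">test</tns:Body>
</soap:Envelope>"""


def in_scope_prefixes(xml_bytes, target_id):
    """Hypothetical helper: return {prefix: uri} in scope at the element with this id."""
    scope = []  # stack of (prefix, uri) declarations that are currently open
    events = ("start-ns", "start", "end-ns")
    for event, payload in ET.iterparse(io.BytesIO(xml_bytes), events=events):
        if event == "start-ns":
            scope.append(payload)   # declaration opens just before its element starts
        elif event == "end-ns":
            scope.pop()             # declaration closes after its element ends
        elif payload.get("id") == target_id:
            return dict(scope)
    return {}


prefixes = in_scope_prefixes(DOC, "Body")
print(sorted(prefixes))
```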

nitharios commented 4 years ago

@Helbrax were you able to come up with a workaround for this? We believe this is the issue we are running into now.

Helbrax commented 4 years ago

@nitharios Unfortunately, no. After my last comment I moved on to other things while waiting for a response. After a few days, the decision was made to just use the C# code we had (we are running it in AWS Lambdas), since we knew it worked and we already had it. I filed this away to come back to out of professional curiosity, but "life and stuff" happened and I haven't had a chance to look at it at all.

kislyuk commented 4 years ago

I finally got around to digging into this and finding the root cause for your issue.

The issue you ran into - namespace prefixes being dropped from the canonicalized serialization of the signed node - is the reason "InclusiveNamespaces PrefixList" (https://www.w3.org/TR/xml-exc-c14n/#def-InclusiveNamespaces-PrefixList) is in the canonicalization spec. For namespace prefixes that are not in use in the subtree to be propagated in the subtree, the prefixes must be supplied to the canonicalization method as a parameter.

It's not clear from your description whether the C# software you use obeys this, or if it just implicitly adds prefixes without setting the <ec:InclusiveNamespaces PrefixList="" xmlns:ec="http://www.w3.org/2001/10/xml-exc-c14n#"/> node to the transforms node as described in the spec. Since you said signxml had trouble verifying the signature generated by your C# software, I suspect the latter.

SignXML currently only supports parsing "InclusiveNamespaces PrefixList" when verifying, not when signing. It will be easy for me to add another option to sign() to be able to pass in inclusive_ns_prefixes, but what will be hard is making the option discoverable when it's needed.

Helbrax commented 2 years ago

So I've gotten back around to playing with this.

Using the most up-to-date version of the module and lxml, I have made some progress but am still having issues getting it to work properly.

Any document I sign in C#/Java can now be verified correctly by signxml, and the digest for the Reference element (#Body) is accurate in any document signed by C#/Java and Python.

However, if I sign the same document in Python, it won't validate in C#/Java OR Python.

Looking at the canonical form of the signature element that is generated when signing, the one produced by C#/Java is different from the one produced by Python.

I've tried using signature_inclusive_ns_prefixlist, but I think there is a bug in it. The emitted prefix list is "S O A P" instead of "SOAP". This doesn't happen for payload_inclusive_ns_prefixlist. Looking at the code, both do " ".join(prefixes), but the payload value is a list of prefixes (["soap", "foo", "bar"]) while the signature value is a string ("soap foo bar").
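
The symptom described here can be reproduced in plain Python, independent of signxml: str.join iterates over its argument, so a list is joined prefix by prefix while a string is joined character by character. The normalize_prefixes helper below is hypothetical and only sketches one way to accept both forms; it is not part of signxml.

```python
# Joining a list puts spaces between prefixes; joining a str puts them between characters.
joined_from_list = " ".join(["SOAP"])
joined_from_str = " ".join("SOAP")

print(joined_from_list)  # SOAP
print(joined_from_str)   # S O A P


def normalize_prefixes(prefixes):
    """Hypothetical helper: accept a list of prefixes or one space-separated string."""
    if isinstance(prefixes, str):
        prefixes = prefixes.split()
    return " ".join(prefixes)


print(normalize_prefixes("soap foo bar"))
print(normalize_prefixes(["soap", "foo", "bar"]))
```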

Example XML to sign

test data

Python signature (during canonical)

4SM335FSsimtsoJsYINUFUjWvDQ=

C# signature (during canonical)

4SM335FSsimtsoJsYINUFUjWvDQ=

When signed.xml is generated by C#/Java, the following succeeds. However, if signed.xml is generated by Python, it fails.

verified_data = XMLVerifier().verify(x509_cert=base64_cert, data=open("signed.xml", encoding='UTF-8').read()).signed_xml

Code to generate and validate sig in Python

cert_bytes = open("test.pfx", "rb").read()
(private_key, certificate, additional_certificates) = pkcs12.load_key_and_certificates(data=cert_bytes, password="testpassword".encode(), backend=default_backend())
data = open("file.xml", encoding='UTF-8').read()
root = etree.fromstring(data)
signer = signxml.XMLSigner(method=m.detached,
                           signature_algorithm='rsa-sha1',
                           digest_algorithm='sha1',
                           c14n_algorithm='http://www.w3.org/TR/2001/REC-xml-c14n-20010315#WithComments')
signed_info = signer.sign(data=root,
                          key=private_key,
                          cert=[certificate],
                          reference_uri="#Body")
soap_security_tag = root.find('{http://schemas.xmlsoap.org/soap/envelope/}Header/{http://schemas.xmlsoap.org/soap/security/2000-12}Signature')
soap_security_tag.append(signed_info)
etree.ElementTree(root).write("signed.xml")
verified_data = signxml.XMLVerifier().verify(x509_cert=base64_cert, data=open("signed.xml", encoding='UTF-8').read()).signed_xml
print(verified_data)

kislyuk commented 2 years ago

Yeah, the problem that you're facing here is that you're generating a detached signature, then manually attaching it in the middle of your document. This is fine if you're using exclusive canonicalization, but it's incompatible with the default canonicalization method (http://www.w3.org/2006/12/xml-c14n11) as well as the one you specified (http://www.w3.org/TR/2001/REC-xml-c14n-20010315#WithComments). This is by no means trivial to diagnose, so I'll add an error in that codepath as it's bound to result in an invalid signature.

The following code works for me with your test payload:

import signxml
from lxml import etree
data = open("issue133/payload.xml", encoding='UTF-8').read()
root = etree.fromstring(data)
signer = signxml.XMLSigner(method=signxml.methods.detached,
                           signature_algorithm='rsa-sha1',
                           digest_algorithm='sha1',
                           c14n_algorithm='http://www.w3.org/2001/10/xml-exc-c14n#WithComments')
with open("issue133/example.key", "rb") as fh:
    private_key = fh.read()
with open("issue133/example.pem", "rb") as fh:
    certificate = fh.read()
signed_info = signer.sign(data=root,
                          key=private_key,
                          cert=[certificate],
                          reference_uri="#Body")
soap_security_tag = root.find('{http://schemas.xmlsoap.org/soap/envelope/}Header/{http://schemas.xmlsoap.org/soap/security/2000-12}Signature')
soap_security_tag.append(signed_info)
etree.ElementTree(root).write("issue133/signed.xml")
with open("issue133/signed.xml") as fh:
    verified_data = signxml.XMLVerifier().verify(x509_cert=certificate,
                                                 data=fh.read(),
                                                 validate_schema=False).signed_xml
print(verified_data)