Open fumieval opened 10 months ago
~My first thought (without having looked into the details) might be that it's using some hashing or encryption method that this library hasn't implemented. If so, that should be easy to add since it would just be a case of detecting it and actioning it.~
Ah, hmm, that doesn't seem to be the case. It does look more like the problem is somewhere in how we process the XML. If I take the base64-decoded XML, manually remove the signature, run it through canonicalisation, and then calculate the hash, I end up with 8e5dafc247099a78df0322f84b23c4492e92756d2f20a2229837ce5e7b437dbf
as contained within the signature.
If we diff the canonicalised XML we can clearly see what the difference is:
diff --git a/man-canon.xml b/lib-canon.xml
index 0d66c57..8203f66 100644
--- a/man-canon.xml
+++ b/lib-canon.xml
@@ -28,10 +28,10 @@
<ds:SignatureValue>
L48AXgwVFB/w7DyMdNsrLeLobBQYr95FWYoAHwT45cymHm8axBvzS2jHRBCzsWI94choy/xCI+XiVqx+IQcv+8hOIhDGbU7xUFQSFxYkU5DeqSE3znSom3+8WgF29B2pINaRBFZmGzUXGHCA+eYY7rGhss2YewGNiDiHFfpMODjT8BhJBZ5qa+A+K1rkZPwlo7FxlOhG0Cp4IyFJCSP6y/sf2vKJWQPX4rexHUmwcoIPyZJIUj+4bzuM/zq+DV5T1nxXhbYOBAdkUAHB2vEqaqMrA7a27FwShcfhMhfWB7mxrXhCz9YSINR1NUCYxkekKuWHEnkDDEhcem5PFzB02g==</ds:SignatureValue>
<KeyInfo xmlns="http://www.w3.org/2000/09/xmldsig#">
- <ds:X509Data>
- <ds:X509Certificate>
- MIIDMzCCAhugAwIBAgIVANMNnkEBAZZT21ecmEs11XdA22uCMA0GCSqGSIb3DQEBCwUAMFMxCzAJBgNVBAYTAkpQMRwwGgYDVQQKDBNHTU8gR2xvYmFsU2lnbiBLLksuMRMwEQYDVQQLDApUcnVzdExvZ2luMREwDwYDVQQDDAhoZXJwLWluYzAeFw0yMzA4MTQwNTI3MzhaFw0zMzA4MTQwNTI3MzhaMFMxCzAJBgNVBAYTAkpQMRwwGgYDVQQKDBNHTU8gR2xvYmFsU2lnbiBLLksuMRMwEQYDVQQLDApUcnVzdExvZ2luMREwDwYDVQQDDAhoZXJwLWluYzCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAL4PFYG7j0DKWgMkQ08Nz9z2f/WTFh6xUi3Pk5jHyIPQbwTW0TdPfXV8Gj5gl//nH+KMILS8WJHnY9RDUK0hkqUBAZLiX/HUDMcwOZ+8OPZSl/imtJTtoTEE6hBJEiT0sgtxS0bOI1TWdhnWiNMDO+Acp+K+0m8AGz9BgOyUT0TJN9wstyfpgASEK2Oy6VNhYfeCQ4QU47aZDhq/1Ei01wWLlIWZ4uZl9uzcROnKsmEmbkT+uAf5tcLG9PjR+XB58AiMvpufhvjsfnp9qVBTiTmgup9zbYfhzkNTKmr1Axi/FES5j7lfroWCSbr4dDw4S8wxTuUlF1v77PQ+/DFXVWMCAwEAATANBgkqhkiG9w0BAQsFAAOCAQEAc4EQUPj2jCUjOJZX4UGvvxu8jA+vjcC5armcZTSno+4AZ73JbZRU1NwxqzY+13W+m9DKN8bj6gRTHQeWptawrigABVROyNaYNeI+2wXcdf2eSqaS7pMSYKEzPAeSEk1ZPwh1/OvmmZLHhvSVaVxuou32x1NM6UaXIQlUUsS8x3NDETfTxDzSisusF5aCJxu9ONXCLUd3s75xn8zXPv1OluEJ1pcq6xHwaoDvwXV2l5eQdWvbZCulKQAy8dnUx0JOBFQg6UamxlKOKKNdOa6FU7M8wpZx0cNbCEME+KPthUFIIxHIvv0mOjPfJFzjCKicUHsVpdtQixe1L9gyLEtKTg==</ds:X509Certificate>
- </ds:X509Data>
+ <X509Data>
+ <X509Certificate>
+ MIIDMzCCAhugAwIBAgIVANMNnkEBAZZT21ecmEs11XdA22uCMA0GCSqGSIb3DQEBCwUAMFMxCzAJBgNVBAYTAkpQMRwwGgYDVQQKDBNHTU8gR2xvYmFsU2lnbiBLLksuMRMwEQYDVQQLDApUcnVzdExvZ2luMREwDwYDVQQDDAhoZXJwLWluYzAeFw0yMzA4MTQwNTI3MzhaFw0zMzA4MTQwNTI3MzhaMFMxCzAJBgNVBAYTAkpQMRwwGgYDVQQKDBNHTU8gR2xvYmFsU2lnbiBLLksuMRMwEQYDVQQLDApUcnVzdExvZ2luMREwDwYDVQQDDAhoZXJwLWluYzCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAL4PFYG7j0DKWgMkQ08Nz9z2f/WTFh6xUi3Pk5jHyIPQbwTW0TdPfXV8Gj5gl//nH+KMILS8WJHnY9RDUK0hkqUBAZLiX/HUDMcwOZ+8OPZSl/imtJTtoTEE6hBJEiT0sgtxS0bOI1TWdhnWiNMDO+Acp+K+0m8AGz9BgOyUT0TJN9wstyfpgASEK2Oy6VNhYfeCQ4QU47aZDhq/1Ei01wWLlIWZ4uZl9uzcROnKsmEmbkT+uAf5tcLG9PjR+XB58AiMvpufhvjsfnp9qVBTiTmgup9zbYfhzkNTKmr1Axi/FES5j7lfroWCSbr4dDw4S8wxTuUlF1v77PQ+/DFXVWMCAwEAATANBgkqhkiG9w0BAQsFAAOCAQEAc4EQUPj2jCUjOJZX4UGvvxu8jA+vjcC5armcZTSno+4AZ73JbZRU1NwxqzY+13W+m9DKN8bj6gRTHQeWptawrigABVROyNaYNeI+2wXcdf2eSqaS7pMSYKEzPAeSEk1ZPwh1/OvmmZLHhvSVaVxuou32x1NM6UaXIQlUUsS8x3NDETfTxDzSisusF5aCJxu9ONXCLUd3s75xn8zXPv1OluEJ1pcq6xHwaoDvwXV2l5eQdWvbZCulKQAy8dnUx0JOBFQg6UamxlKOKKNdOa6FU7M8wpZx0cNbCEME+KPthUFIIxHIvv0mOjPfJFzjCKicUHsVpdtQixe1L9gyLEtKTg==</X509Certificate>
+ </X509Data>
</KeyInfo>
</ds:Signature>
<Subject>
This difference, however, is already present in the pre-canonicalised XML. The problem is therefore somewhere in the code that parses the XML and removes the signature.
The removal of the namespace in that one place is surprising since we set psRetainNamespaces = True
in parseSettings
. I assume that the xml-conduit
library thinks it is OK to remove since the KeyInfo
element doesn't specify the namespace (correctly?). Compare e.g. Okta's vs TrustLogin's:
diff --git a/keyinfo-okta.xml b/keyinfo-trustlogin.xml
index e1a546f..baf6e16 100644
--- a/keyinfo-okta.xml
+++ b/keyinfo-trustlogin.xml
@@ -1,4 +1,4 @@
-<ds:KeyInfo>
+<KeyInfo xmlns="http://www.w3.org/2000/09/xmldsig#">
<ds:X509Data>
</ds:X509Data>
-</ds:KeyInfo>
+</KeyInfo>
I have had a quick look to see at what point we lose the ns
prefix and it seems like the Document
values we have still have it, but it disappears when we render the Document
back to a ByteString
with:
let renderedXml = XML.renderLBS def docMinusSignature
I am guessing that renderLBS
discards namespaces like parseLBS
does by default, except that there isn't an equivalent of the psRetainNamespaces
setting (as far as I can see).
Ohhh I see, nice find!
Chiming in from https://github.com/snoyberg/xml/issues/190 .
It seems to me that both the library-generated and the manually-generated XMLs have the same semantics:
My understanding of the specification for XML namespaces is that:
<Signature>
tag binds the namespace http://www.w3.org/2000/09/xmldsig#
to the prefix ds
; this binding is available within the scope of that tag<KeyInfo>
tag binds the same namespace http://www.w3.org/2000/09/xmldsig#
to no prefix, effectively making it the default namespace within its scope ; as a consequence, children elements are implicitly considered to be within that default namespace, and the prefix ds
is redundant (highlighted in blue in the above screenshot)Am I mistaken ?
Relevant part of the specification:
The scope of a default namespace declaration extends from the beginning of the start-tag in which it appears to the end of the corresponding end-tag, excluding the scope of any inner default namespace declarations. In the case of an empty tag, the scope is the tag itself.
A default namespace declaration applies to all unprefixed element names within its scope. Default namespace declarations do not apply directly to attribute names; the interpretation of unprefixed attributes is determined by the element on which they appear.
If there is a default namespace declaration in scope, the expanded name corresponding to an unprefixed element name has the IRI of the default namespace as its namespace name. If there is no default namespace declaration in scope, the namespace name has no value. The namespace name for an unprefixed attribute name always has no value.
@k0ral thanks for chiming in!
Unfortunately, the equivalent semantics of the two documents are irrelevant here. The part of the SAML2 standard we are implementing here works roughly like this:
xml-conduit
library), normalise the resulting document (using c14n), and compute a hash. That hash must match the hash contained in the signature.Therefore it is critical to the successful implementation of the SAML2 standard that we arrive with exactly the XML document that the identity provider used to compute the hash, because otherwise we end up with a different hash and the validation process fails.
TL;DR: We care about textual equivalence of XML documents, not semantic equivalence.
If I understand correctly, the (exclusive ?) c14n standard does not have the following property:
If A and B are semantically-equal (but possibly textually-distinct) XML documents, then c14n(A) is textually-equal to c14n(B).
Is that correct ?
According to @mbg the problem occurs within renderLBS
, so c14n itself kept the redundant namespace.
If A and B are semantically-equal (but possibly textually-distinct) XML documents, then c14n(A) is textually-equal to c14n(B).
Hence I think this does not hold because of the observation above.
@fumieval We call renderLBS
before we pass the XML to c14n since the c14n implementation takes the textual representation as input. (That's a little inefficient of course since it has to re-parse the XML document at that point, but there isn't a good c14n implementation in pure Haskell so we have to throw it at libxml
's implementation.)
If, as @k0ral notes, c14n-exclusive is supposed to produce equivalent output for semantically equivalent XML documents, then there are two possibilities as far as I can see:
renderLBS
produces a textual representation of the document that's not semantically equivalent to the manually modified document)libxml
is wrong But I'll look into this more to establish more clearly which is the case.
Summary
I was trying a new Identity Provider called TrustLogin, and realised that it fails to validate the digest.
It passes other validation tools like https://samltool.io/ so I suspect something is not quite right in the canonicalisation phase, but I didn't manage to figure out why. Any ideas?
Apparently it is able to validate an assertion (using #45), but fails to validate an entire response.
Checklist
@since
annotations.