python / cpython

The Python programming language
https://www.python.org
Other
62.42k stars 29.97k forks source link

SSLSocket.getpeercertchain() #62433

Open tiran opened 11 years ago

tiran commented 11 years ago
BPO 18233
Nosy @jcea, @pitrou, @tiran, @ashemedai, @njsmith, @mmaker, @dstufft, @dsoprea, @miki725, @mmasztalerczuk, @chet, @joernheissler, @chrisburr, @JustAnotherArchivist, @jamespo
PRs
  • python/cpython#17938
  • python/cpython#25467
  • Dependencies
  • bpo-18147: SSL: diagnostic functions to list loaded CA certs
  • Files
  • ssl_peerchertchain.patch
  • ssl_peerchertchain2.patch
  • ssl_peercertchain3.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/tiran' closed_at = None created_at = labels = ['expert-SSL', 'type-feature', '3.10'] title = 'SSLSocket.getpeercertchain()' updated_at = user = 'https://github.com/tiran' ``` bugs.python.org fields: ```python activity = actor = 'jamespo' assignee = 'christian.heimes' closed = False closed_date = None closer = None components = ['SSL'] creation = creator = 'christian.heimes' dependencies = ['18147'] files = ['30611', '30622', '45057'] hgrepos = [] issue_num = 18233 keywords = ['patch'] message_count = 32.0 messages = ['191287', '191352', '193424', '199434', '199443', '199446', '199447', '203187', '203189', '203190', '205734', '244312', '244313', '244314', '278494', '280348', '293570', '293577', '293580', '293590', '293778', '301525', '324917', '357540', '361076', '361085', '361093', '361095', '361120', '372627', '372673', '391916'] nosy_count = 20.0 nosy_names = ['jcea', 'pitrou', 'christian.heimes', 'asmodai', 'njs', 'maker', 'Hiroaki.Kawai', 'underrun', 'zwol', 'dstufft', 'dsoprea', 'miki725', 'mmasztalerczuk', 'chet', 'joernheissler', 'chaen', 'chrisburr', 'kwatsen', 'JustAnotherArchivist', 'jamespo'] pr_nums = ['17938', '25467'] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'enhancement' url = 'https://bugs.python.org/issue18233' versions = ['Python 3.10'] ```

    tiran commented 11 years ago

    The patch implements a method getpeercertchain() on a SSLSocket. It returns the peer's certificate chain from the leaf cert to the root cert if available. It wraps SSL_get_peer_cert_chain().

    SSL_get_peer_cert_chain() doesn't have to pull any additional data from the peer. The information is already exchanged for cert validation.

    tiran commented 11 years ago

    As expected it is much harder to get the full certification chain from OpenSSL than I initially expected. SSL_get_peer_cert_chain() doesn't return the root CA's certificate. The new patch introduces a validation mode and uses X509_verify_cert(*X509_STORE_CTX) + X509_STORE_CTX_get1_chain() to build a full chain.

    tiran commented 11 years ago

    From Rietveld review:

    --- http://bugs.python.org/review/18233/diff/8422/Modules/_ssl.c#newcode1203 Modules/_ssl.c:1203: chain = X509_STORE_CTX_get1_chain(store_ctx); This isn't appropriate for this method. Specifically, you are asking for the peer cert chain, which purposefully does not include root CA certs that you trust. What you are giving here a complete validate chain from a peer cert to a trusted root. This is a valuable piece of information, but should be returned via another method (perhaps exposed in python as get1chain in SSLContext). But this method should always return the result of SSL_get_peer_cert_chain if a peer cert chain is available. ---

    You are making a good point. I'm either going to split it up into two function or provide a way to look up a cert by issuer.

    19b501b3-61b4-4c31-8f12-6352888b579d commented 10 years ago

    I was about to submit a feature request to add exactly this. The [second] patch works like a charm. When are you going to land on a particular resolution so that it can get committed in?

    Dustin

    pitrou commented 10 years ago

    The patch needs a test, a proper doc, and reviewing.

    pitrou commented 10 years ago

    Sorry for the incorrect answer. I just noticed there was a test in the patch! Further looking at it, I notice the new function is returning a tuple. Wouldn't it be better to return a list here?

    19b501b3-61b4-4c31-8f12-6352888b579d commented 10 years ago

    My two-cents is to leave it a tuple (why not?).

    Dustin

    tiran commented 10 years ago

    I'd rather return a list or tuple of X509 objects but bpo-18369 won't be ready for 3.4. Ideas?

    pitrou commented 10 years ago

    @Dustin

    My two-cents is to leave it a tuple (why not?).

    Because tuples are more used for struct-like data. Here we are returning an unknown number of homogenous objects, which generally calls for a list.

    @Christian

    I'd rather return a list or tuple of X509 objects but bpo-18369 won't be ready for 3.4. Ideas?

    Unless the feature is really important to have right now, I think it's ok to defer it until we have X509 objects.

    tiran commented 10 years ago

    It's just nice to have for debugging and extended verification.

    e3bb7a8a-6d23-40c5-b27f-44cffc49d48a commented 10 years ago

    I could really use this sooner than later... and sometimes having a full-featured (or even secure) interface is not what you want.

    Consider zmap and masscan etc and ssl mapping (similar to what the EFF did a couple years back - https://www.eff.org/observatory - but with the full chain instead of just the cert). The desire here would be low level, low overhead, no validation on the fly: All you want is the cert chain.

    There are plenty of research and security applications where a simple wrapper around OpenSSL that returns DER bytes would be desirable. Please reconsider this patch for inclusion in 3.4 ...

    c379e6df-6daa-45d5-a8ee-f828d234d3ca commented 9 years ago

    Given that cryptography.io is fast becoming the solution for dealing with X.509 certificates on Python, I would like to add my vote to add my vote for this feature. Right now, getting the full chain in DER is what I am missing to complete a task at work.

    19b501b3-61b4-4c31-8f12-6352888b579d commented 9 years ago

    Forget it. This project is dead.

    Dustin On May 28, 2015 11:58 AM, "Jeroen Ruigrok van der Werven" \< report@bugs.python.org> wrote:

    Jeroen Ruigrok van der Werven added the comment:

    Given that cryptography.io is fast becoming the solution for dealing with X.509 certificates on Python, I would like to add my vote to add my vote for this feature. Right now, getting the full chain in DER is what I am missing to complete a task at work.

    ---------- nosy: +asmodai


    Python tracker \report@bugs.python.org\ \http://bugs.python.org/issue18233\


    19b501b3-61b4-4c31-8f12-6352888b579d commented 9 years ago

    Disregard. I thought this was something else.

    6dca6041-e2de-4fba-9f7d-a1e687f3d9db commented 7 years ago

    Hello :)

    I'm not sure why patches created by christian.heimes is not merged to python, but because last patch was created in 2013, I've created a new version of this patch.

    What do you think about it?

    6dca6041-e2de-4fba-9f7d-a1e687f3d9db commented 7 years ago

    ping! :) Could someone look at my changes? :)

    0d566da0-61f3-4970-b2d0-eb7b3c639ea9 commented 7 years ago

    Is this dead at this point? Just stumbled upon it, and I'm hopeful that maybe there's still a chance, since it's still open. :)

    tiran commented 7 years ago

    The ticket is dead for a very good reason. Past me was not clever enough and didn't know about the difference between the cert chain sent by the peer and the actual trust chain. The peer's cert chain is not trustworthy and must *only* be used to build the actual trust chain. X.509 chain trust chain construction is a tricky business.

    Although I thought that peer cert chain is a useful piece of information, it is also dangerous. It's simply not trustworthy. In virtually all cases you want to know the chain of certificates that leads from a local trust anchor to the end-entity cert. In most cases it just happens to be the same (excluding root CA). But that's not reliable.

    19b501b3-61b4-4c31-8f12-6352888b579d commented 7 years ago

    Thanks for expounding on this, Christian. Assuming your assertions are correct, this makes perfect sense.

    Can anyone listening close this?

    On May 12, 2017 17:45, "Christian Heimes" \report@bugs.python.org\ wrote:

    Christian Heimes added the comment:

    The ticket is dead for a very good reason. Past me was not clever enough and didn't know about the difference between the cert chain sent by the peer and the actual trust chain. The peer's cert chain is not trustworthy and must *only* be used to build the actual trust chain. X.509 chain trust chain construction is a tricky business.

    Although I thought that peer cert chain is a useful piece of information, it is also dangerous. It's simply not trustworthy. In virtually all cases you want to know the chain of certificates that leads from a local trust anchor to the end-entity cert. In most cases it just happens to be the same (excluding root CA). But that's not reliable.

    ----------


    Python tracker \report@bugs.python.org\ \http://bugs.python.org/issue18233\


    0d566da0-61f3-4970-b2d0-eb7b3c639ea9 commented 7 years ago

    Oh yeah, definitely not trustworthy at all. In my case, I am not processing the peer chain to actually verify trust, but I am still interested in inspecting the chain.

    Dangerous or not, and regardless of what almost all people should *actually* be doing, SSL_get_peer_cert_chain exists for a reason, just like SSL_get_peer_certificate exists for a reason. If Python includes a standard SSL library, it should be transparent in the interface it offers, for the mere reason that the library becomes more powerful.

    If the overall consensus is that the library should protect most people against common pitfalls and security mistakes, then I guess that's the route to continue on. However, I would be disappointed that we would be blacklisting the exposure of underlying library features based on the mere belief that people don't understand them enough!

    11ae746e-d782-4f33-a097-758c9f315fb2 commented 7 years ago

    Hi, I'd like to see this feature too.

    My use case is a monitoring script to check the life time of the server certificate, including the chain. I would prefer to have a wrapper around SSL_get_peer_cert_chain. I understand that this is *not* a verified chain. That's okay.

    openssl-1.1 added a new function SSL_get0_verified_chain which may be safer for most applications. Is there any real difference to X509_STORE_CTX_get1_chain?

    If you're worried about people misusing these functions, add a warning in the docs and point them to "get_peer_verified_chain"?

    tiran commented 7 years ago

    Yes, from an application perspective there is an import difference between X509_STORE_CTX_get1_chain() and SSL_get0_verified_chain(). X509_STORE_CTX is a temporary object. It is only available during the handshake and while the trust chain is built and verified. Once the chain is verified, it is no longer available.

    SSL_get0_verified_chain() sounds like an actual good solution. Thanks for pointing it out.

    fad9a254-cf26-446b-b2d9-9750068aa533 commented 6 years ago

    There is another very valid use case which is even described by an RFC: https://www.ietf.org/rfc/rfc3820.txt

    And openssl supports this RFC.

    These proxy certificates are heavily used in the world of high energy physics computing, and having the get_peer_cert_chain exposed in python would allow to use standard tools.

    And any hope this would also be backported to the python 2.7 branch ?

    njsmith commented 4 years ago

    There's another important use case for this, that hasn't been discussed here. If you want to use openssl for TLS + the system trust store to verify certificates, then you need to disable openssl's certificate validation, perform the handshake, and then extract the certificate chain that there peer sent and pass it to the system native APIs to validate.

    For this case, we don't need to do any validation or resolution on the chain – we just want to pull out the DER that the peer sent. AFAICT, the lack of this functionality is the one major blocker to using the system trust store with the 'ssl' module.

    51fd0af7-8993-4abf-91c3-35b477cb4794 commented 4 years ago

    I don't understand the concern issues being raised for this patch, and also may have a use-case not mentioned yet.

    For the concern issue, as I understand it, the ability to call getpeercert() or the proposed getpeercertchain() is only after the TLS session has been established. As such, the SSL socket already established that there exists a valid chain of trust. Thus these methods are primarily to provide visibility into what the peer passed *after* it had been authenticated, right?

    That said, the reason I want to access the entire certificate chain passed by the client (i.e., client cert auth) is in order to validate that the client's cert (which may include some intermediates) authenticates to a specific trust anchor, rather than the bag of trust anchors loaded into the SSLContext (via load_verify_locations()) in order to authenticate a multiplicity of clients, each potentially leading to a different trust anchor.

    Not having getpeercertchain() means that all no client cert may contain a chain, i.e., the clients only ever transmit the end-entity cert itself. This is unfortunate because the underlying SSL socket actually allows clients to send chains, it's just the lack being able to access the peercertchain in my code that seems to limit the solution.

    njsmith commented 4 years ago

    For the concern issue, as I understand it, the ability to call getpeercert() or the proposed getpeercertchain() is only after the TLS session has been established. As such, the SSL socket already established that there exists a valid chain of trust. Thus these methods are primarily to provide visibility into what the peer passed *after* it had been authenticated, right?

    The issue is that "the cert chain provided by the client" and "the cert chain that was validated as trustworthy" might be different. For example, say you have trust roots A and B, and I have a leaf certificate with a valid signature by B. If I pass the chain [A, leaf cert], then openssl's trust validation code might look at that and say "yep, this leaf cert is signed by root B, I trust root B, cool, the connection is good. Also there's a random A cert here, but whatever, that doesn't change anything".

    In this case, for your use case, you want to get back the chain [B, leaf cert], because that's the chain that was actually validated. If you instead got the chain the client gave you, then you might be fooled into thinking that this cert chained to root A, when it actually didn't, and make the wrong trust decision.

    That's why we need to be careful to distinguish between these two possible chains.

    51fd0af7-8993-4abf-91c3-35b477cb4794 commented 4 years ago

    It seems that we're talking about the same thing, but I want the cert-chain the peer sent without any smarts, exactly how OpenSSL's SSL_get_peer_cert_chain() works and, importantly, without stapling any root chain certs the client did not send itself (though it's okay if the client did, in which case those certs should be included).

    I'm not following your "I pass the chain [A, leaf cert]" comment, if leaf-cert is signed by B, then this should obviously fail. Maybe you meant to say that A and B are loaded into a bag and that validation test is [bag, leaf-cert]?

    Regardless, I don't think Python should coddle developers. Assuming the docs are accurate, competent developers with crypto-clue will be fine. Many crypto library docs encourage tourists to stay away. That said, if smarts are wanted, let's choose a name that doesn't overlap with the existing OpenSSL name...get_authed_cert_chain() ?

    But, please, can a "peer_cert_chain()" wrapping the OpenSSL call be release ASAP, buying time to ponder the merits of smart calls for another day?

    njsmith commented 4 years ago

    I'm not sure I agree about assuming that users will be able to work around these issues... I mean, nothing personal, I'm sure you're well-informed and maybe your code would work fine, but if you don't understand my example then how can you be entirely confident that you really understand all the risks that your code needs to watch out for?

    Anyway, the proposed PR exposes both methods, so that's not a problem; it's just waiting on someone with openssl expertise like Christian to review it.

    51fd0af7-8993-4abf-91c3-35b477cb4794 commented 4 years ago

    I agree that having both would be best, but there is a world of difference between a must-have (peer_cert_chain) and what seems to be a nice-to-have (authed_peer_cert_chain).

    My request for clarification was not that I don't understand bags, etc. (see my first message), but that I don't understand the concrete use case in mind. That is, when is it that the app-logic would differ because the EE cert validated using one path versus another?

    To explain the 'must-have' better, imagine one peer sending [A, B, C], where 'A' is the EE cert, and the other peer having TA [F, E, D], where 'F' is the self-signed root TA and 'D' is the Issuer that signed 'C'. The complete chain is [A-F] and this is what the SSL-level code will use during the handshake. But post-handshake, without peer_chain_cert(), there is NO WAY for the app-logic to create a valid chain. This is broken, for the reason mentioned in my first message.

    f996885c-52e8-4df9-9d2c-007fec093be8 commented 4 years ago

    I have yet another use case for the function implemented by this patch (i.e. retrieving the cert chain actually sent by the server, regardless of whether that gives a path to a trust anchor). I'm implementing a network forensics tool, and one of the situations it's supposed to detect is when a man-in-the-middle is attempting to substitute its own cert for a site's "legitimate" cert (yes, possibly having suborned a public CA in order to do so). To make all of the planned heuristics for this work correctly, I need to record exactly what came over the wire.

    If it would be useful for me to dust off the patch and/or implement the _other_ function that people requested (retrieve the chain that OpenSSL concluded was a valid chain to an accepted trust anchor) I can probably scare up time to do so in the next week or two. I imagine it's too late for 3.8 patch releases at this point, but assuming I did this, could it make 3.9?

    e13d14b5-4000-4a56-9d98-fcad0cf884a2 commented 4 years ago

    Hi Zack, I've already opened a PR that is loosely based on this patch. If you have time to give it a review I'd appreciate the extra set of eyes.

    https://github.com/python/cpython/pull/17938

    tiran commented 3 years ago

    New changeset 666991fc598bc312d72aff0078ecb553f0a968f1 by Christian Heimes in branch 'master': bpo-18233: Add internal methods to access peer chain (GH-25467) https://github.com/python/cpython/commit/666991fc598bc312d72aff0078ecb553f0a968f1

    sigmavirus24 commented 1 year ago

    @tiran let me know if I can lend a hand here. I'm also interested in finding a way to "backport" this to supported pythons even if as a PyPI module

    felixfontein commented 9 months ago

    This apparently has been superseded by https://github.com/python/cpython/issues/109109.