python / cpython

The Python programming language
https://www.python.org

multiprocessing module performs a time-dependent hmac comparison #58737

Closed 7f89768c-2f87-475b-af62-c847a950163e closed 12 years ago

7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago
BPO 14532
Nosy @loewis, @ncoghlan, @pitrou, @vstinner, @bitdancer, @hynek
Files
  • hmac-time-independent-v1.patch
  • hmac-time-independent-v2.patch
  • hmac-time-independent-v3.patch
  • hmac-time-independent-v4.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    ```python
    assignee = None
    closed_at =
    created_at =
    labels = ['library']
    title = 'multiprocessing module performs a time-dependent hmac comparison'
    updated_at =
    user = 'https://bugs.python.org/JonOberheide'
    ```

    bugs.python.org fields:

    ```python
    activity =
    actor = 'hynek'
    assignee = 'none'
    closed = True
    closed_date =
    closer = 'neologix'
    components = ['Library (Lib)']
    creation =
    creator = 'Jon.Oberheide'
    dependencies = []
    files = ['25186', '25197', '25262', '25414']
    hgrepos = []
    issue_num = 14532
    keywords = ['patch']
    message_count = 34.0
    messages = ['157809', '157837', '157981', '158012', '158021', '158032', '158033', '158034', '158035', '158038', '158040', '158044', '158045', '158075', '158083', '158103', '158105', '158129', '158131', '158133', '158134', '158170', '158656', '158745', '158747', '158963', '159668', '159747', '159759', '160537', '160598', '162587', '162772', '162774']
    nosy_count = 10.0
    nosy_names = ['loewis', 'ncoghlan', 'pitrou', 'vstinner', 'r.david.murray', 'neologix', 'python-dev', 'sbt', 'hynek', 'Jon.Oberheide']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue14532'
    versions = []
    ```

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    The multiprocessing module performs a time-dependent comparison of the HMAC digest used for authentication:

    def deliver_challenge(connection, authkey):
        import hmac
        assert isinstance(authkey, bytes)
        message = os.urandom(MESSAGE_LENGTH)
        connection.send_bytes(CHALLENGE + message)
        digest = hmac.new(authkey, message).digest()
        response = connection.recv_bytes(256)        # reject large message
        if response == digest:
            connection.send_bytes(WELCOME)
        else:
            connection.send_bytes(FAILURE)
            raise AuthenticationError('digest received was wrong')

    This comparison should be made time-independent so as not to leak information about the expected digest, which could allow an attacker to derive the full digest.

    More info on such timing attacks:

    http://rdist.root.org/2009/05/28/timing-attack-in-google-keyczar-library/
    http://rdist.root.org/2010/07/19/exploiting-remote-timing-attacks/
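
To illustrate the concern, here is a minimal sketch of an early-exit comparison, which is roughly how a byte-wise `==` can behave in principle. The helper name `early_exit_equal` and its step counter are hypothetical and exist only to make the data-dependent work visible; a real attacker would measure elapsed time, not a counter.

```python
def early_exit_equal(expected: bytes, guess: bytes):
    """Compare byte by byte, stopping at the first mismatch.

    Returns (is_equal, bytes_examined). The second value stands in for
    the running time an attacker would try to measure.
    """
    if len(expected) != len(guess):
        return False, 0
    examined = 0
    for e, g in zip(expected, guess):
        examined += 1
        if e != g:
            return False, examined
    return True, examined


expected = bytes.fromhex("a1b2c3d4")
# A guess with a longer correct prefix is examined further before failing.
print(early_exit_equal(expected, bytes.fromhex("00000000")))  # (False, 1)
print(early_exit_equal(expected, bytes.fromhex("a1b20000")))  # (False, 3)
```

The amount of work grows with the length of the correct prefix, which is the signal the linked articles describe exploiting.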

    e26428b1-70cf-4e9f-ae3c-9ef0478633fb commented 12 years ago

    I only looked quickly at the web pages, so I may have misunderstood.

    But it sounds like this applies when the attacker gets multiple chances to guess the digest for a *fixed* message (which was presumably chosen by the attacker).

    That is not the case here because deliver_challenge() generates a new message each time. Therefore the expected digest changes each time.

    79528080-9d85-4d18-8a2a-8b1f07640dd7 commented 12 years ago

    Therefore the expected digest changes each time.

    Exactly. Timing attacks (which aren't really new :-) depend on a constant digest to successively determine the characters composing the digest. Here, that won't work, because the digest changes every time.
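
The argument above can be sketched as follows. This is an illustration only: `MESSAGE_LENGTH = 20`, the `sha256` digest and the helper name `fresh_challenge` are assumptions made for the sketch, not values taken from the multiprocessing source.

```python
import hmac
import os

MESSAGE_LENGTH = 20  # assumed value, chosen only for this sketch


def fresh_challenge(authkey: bytes):
    """Return a new (message, expected_digest) pair, one per connection attempt."""
    message = os.urandom(MESSAGE_LENGTH)
    # digestmod must be given explicitly on current Python versions;
    # sha256 is an assumption here.
    return message, hmac.new(authkey, message, "sha256").digest()


authkey = b"secret key"
_, first = fresh_challenge(authkey)
_, second = fresh_challenge(authkey)
# Each attempt is checked against an unrelated digest, so prefix
# information learned from one attempt does not carry over to the next.
print(first != second)
```

Because the server picks the random message, an attacker cannot replay guesses against a fixed target digest.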

    vstinner commented 12 years ago

    if response == digest:

    can be replaced by:

        if sum(x^y for x, y in itertools.zip_longest(response, digest,
                                                     fillvalue=256)) == 0:

    I hope that zip_longest() does not depend too much on response and digest.
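
As a runnable version of that suggestion, a small sketch follows. The wrapper name `xor_sum_equal` is hypothetical, and the function only works for `bytes` inputs, since iterating over `bytes` yields integers.

```python
import itertools


def xor_sum_equal(response: bytes, digest: bytes) -> bool:
    """Compare two byte strings without stopping at the first mismatch."""
    # fillvalue=256 can never equal a real byte value (0..255), so a
    # length mismatch always contributes a nonzero term to the sum.
    pairs = itertools.zip_longest(response, digest, fillvalue=256)
    return sum(x ^ y for x, y in pairs) == 0


print(xor_sum_equal(b"abc", b"abc"))   # True
print(xor_sum_equal(b"abc", b"abd"))   # False
print(xor_sum_equal(b"abc", b"abcd"))  # False
```

Every pair is visited regardless of where the first difference occurs, although the loop length still depends on the longer input.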

    79528080-9d85-4d18-8a2a-8b1f07640dd7 commented 12 years ago

    if response == digest: can be replaced by:    if sum(x^y for x, y in itertools.zip_longest(response, digest, fillvalue=256)) == 0:

    Yeah, sure, but is it useful at all? The digest changes at every connection attempt, so this should not be exploitable.

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    Ah yeah, I suppose it's not exploitable in this case due to the challenge nonce.

    However, it might still be a good thing to fix, to set an example for other hmac module users (internal or external) that might not be in the same situation.

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    In fact, it'd probably be useful to have a time_independent_comparison() helper function somewhere in general.

    79528080-9d85-4d18-8a2a-8b1f07640dd7 commented 12 years ago

    I don't see the point of obfuscating the code to avoid a vulnerability to which the code is not even vulnerable, just so that it can be used as an example... There are *thousands* of ways to introduce security flaws, and the Python code base is not a security handbook ;-)

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    You call it obfuscating, I call it security correctness and developer education. Tomayto, tomahto. ;-)

    Anywho, your call of course, feel free to close.

    79528080-9d85-4d18-8a2a-8b1f07640dd7 commented 12 years ago

    You call it obfuscating, I call it security correctness and developer education. Tomayto, tomahto. ;-)

    Well, I'd be quick to switch to a more robust digest check algorithm if the current one had a flaw, but AFAICT, that's not the case (but I'm no security expert).

    Anywho, your call of course, feel free to close.

    Being a core Python developer doesn't mean I'm right ;-)

    I just don't think that "set an example for other hmac module users" is a good reason on its own to complicate the code, which is currently readable and - AFAICT - safe (complexity usually introduces bugs). Furthermore, I somehow doubt that hmac users will go and have a look at the multiprocessing connection challenge code when looking for an example.

    One thing that could definitely be interesting is to look through the code base and examples to see if a similar - but vulnerable - pattern is used, and fix such occurrences.

    e26428b1-70cf-4e9f-ae3c-9ef0478633fb commented 12 years ago

    I think it would be reasonable to add a safe comparison function to hmac. Its documentation could explain briefly when it would be preferable to "==".

    bitdancer commented 12 years ago

    It would also be reasonable to add a comment to the code mentioning why this particular (security) comparison is *not* vulnerable to a timing attack, which would serve the education purpose if someone does look at the code.

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    One thing that could definitely be interesting is to look through the code base and example to see if a similar - but vulnerable - pattern is used, and fix such occurrences.

    Based on some quick greps, I didn't see many internal users of hmac and the current users don't seem to use it in a scenario that would be at risk (eg. attacker supplied digest compared against an expected digest).

    Given that this issue has affected a lot of security-sensitive third-party code (keyczar, openid providers, almost every python web service that implements "secure cookies" [1] or other HMAC-based REST API signatures), I do like the idea of adding a warning in the relevant documentation as sbt proposed.

    The only reason I'd recommend _not_ putting a time_independent_comparison() function in the hmac module is that it's not really HMAC-specific. In practice, any fixed-length secrets should be compared in a time-independent manner. It just happens that HMAC verification is a pretty common case for this vulnerable construct. :-)

    [1] https://github.com/facebook/tornado/blob/master/tornado/web.py#L1981

    79528080-9d85-4d18-8a2a-8b1f07640dd7 commented 12 years ago

    Given that this issue has affected a lot of security-sensitive third-party code (keyczar, openid providers, almost every python web service that implements "secure cookies" [1] or other HMAC-based REST API signatures), I do like the idea of adding a warning in the relevant documentation as sbt proposed.

    This does sound reasonable, along with the addition of a comparison function immune to timing attacks to the hmac module (as noted, it's not specific to hmac, but it looks like a reasonable place to add it). Would you like to submit a patch (new comparison function with documentation and test)?

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    Will do!

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    Here's a v1. Works with both str and bytes types for Python 3.x.

    Not sure I'm completely happy with the docs, but I'd appreciate any feedback on them!

    vstinner commented 12 years ago

    +def time_independent_equals(a, b):
    +    if len(a) != len(b):
    +        return False

    This is not time independent. Is it an issue?

    + if type(a[0]) is int:

    It's better to write isinstance(a, bytes). You should raise a TypeError if a is not a bytes or str.

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    This is not time independent. Is it an issue?

    You're correct, the length check does leak the length of the expected digest as a performance enhancement (otherwise, your comparison runtime is bounded by the length of the attacker's input).

    Generally, exposing the length and thereby potentially the underlying cryptographic hash function (eg. 20 bytes -> hmac-sha1) is not considered a security risk for this type of scenario, whereas leaking key material certainly is. I considered including this nuance in the documentation and probably should.

    It's better to write isinstance(a, bytes). You should raise a TypeError if a is not a bytes or str.

    Ack, thanks.
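
For readers without access to the attached patches, the shape of the function under review can be sketched as below. This is a hypothetical reconstruction pieced together from the fragments quoted in this thread (the length check, the `isinstance` checks, the `TypeError`, and `result |= x ^ y`); it is not the code from any of the patch files.

```python
def time_independent_equals(a, b):
    """Hypothetical reconstruction, not the patch attached to this issue."""
    both_bytes = isinstance(a, bytes) and isinstance(b, bytes)
    both_str = isinstance(a, str) and isinstance(b, str)
    if not (both_bytes or both_str):
        raise TypeError("inputs must both be bytes or both be str")
    # Deliberate early return: this leaks the length of the inputs,
    # which the discussion above treats as acceptable.
    if len(a) != len(b):
        return False
    result = 0
    if both_bytes:
        for x, y in zip(a, b):
            result |= x ^ y            # bytes iterate as ints
    else:
        for x, y in zip(a, b):
            result |= ord(x) ^ ord(y)  # str iterates as 1-char strings
    return result == 0
```

The accumulator only ever gains bits, so the loop visits every position before the final `result == 0` test decides the outcome.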

    pitrou commented 12 years ago

    You could rewrite:

    result |= x ^ y

    as:

    result |= (x != y)

    Of course, this assumes that the "!=" operator is constant-time for 1-element strings.

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    You could rewrite:

    result |= x ^ y

    as:

    result |= (x != y)

    You could, but it's best not to introduce any conditional branching if at all possible. For reference, see:

    http://rdist.root.org/2009/05/28/timing-attack-in-google-keyczar-library/#comment-5783
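
The two accumulation styles being compared can be put side by side. The function names `accumulate_xor` and `accumulate_neq` are hypothetical; the sketch only shows that both give the same answer, and says nothing about whether `!=` branches inside the interpreter, which is the open question in this exchange.

```python
def accumulate_xor(a: bytes, b: bytes) -> int:
    result = 0
    for x, y in zip(a, b):
        result |= x ^ y     # pure integer arithmetic on the byte values
    return result


def accumulate_neq(a: bytes, b: bytes) -> int:
    result = 0
    for x, y in zip(a, b):
        result |= (x != y)  # bool is an int subclass: True is 1, False is 0
    return result


a, b = b"digest-1", b"digest-2"
# Both accumulators are zero exactly when every position matches.
print(accumulate_xor(a, a) == 0, accumulate_neq(a, a) == 0)  # True True
print(accumulate_xor(a, b) == 0, accumulate_neq(a, b) == 0)  # False False
```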

    e26428b1-70cf-4e9f-ae3c-9ef0478633fb commented 12 years ago

    Why not just

        def time_independent_equals(a, b):
            return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 0

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    Here's a v2 patch. Changes include checking the input types via isinstance, test cases to exercise the type checking, and a note documenting the leak of the input length.

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    v3 patch, based on feedback from the review here: http://bugs.python.org/review/14532/show

    79528080-9d85-4d18-8a2a-8b1f07640dd7 commented 12 years ago

    v3 patch, based on feedback from the review here: http://bugs.python.org/review/14532/show

    Looks good to me. One last thing (sorry for not bringing this up earlier): I don't like bikeshedding, but at least to me, time_independent_equals is a bit too long to type, and sounds reductive (we don't want to specifically avoid only timing attacks, but provide a way to compare digests securely). What do you (all) think of something shorter, like secure_compare, secure_equals, or something along those lines? Note that I'm not good at finding names, so if others are fine with the current one, I won't object ;-)

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    I have used the name "secure_compare" in the past for such a function. That said, I don't have strong feelings either way about the naming, so I'll yield to the others.

    79528080-9d85-4d18-8a2a-8b1f07640dd7 commented 12 years ago

    I have used the name "secure_compare" in the past for such a function. That said, I don't have strong feelings either way about the naming, so I'll yield to the others.

    I prefer this name too. Wait one day or two (to let others chime in if they want), and upload a new patch with that change :-)

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    Ok, patch v4 uploaded. Only change is the rename to "secure_compare".

    vstinner commented 12 years ago

    However, this generally is not a security risk.

    You should explain what you already said: it is not a risk because the length of a HMAC is fixed.

    7f89768c-2f87-475b-af62-c847a950163e commented 12 years ago

    You should explain what you already said: it is not a risk because the length of a HMAC is fixed.

    Well, that's not entirely accurate. Exposing the length of the HMAC can expose what underlying hash is being used (eg. HMAC-SHA1 has different length than HMAC-MD5). It's generally not considered a risk since exposing the algorithm being used shouldn't impact your security (unless you're doing it very wrong).
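
The point about length revealing the hash can be checked directly. The key and message below are placeholders, and MD5 is left out of the loop only because some builds disable it.

```python
import hmac

key, message = b"key", b"message"  # placeholder values for the sketch

lengths = {}
for name in ("sha1", "sha256", "sha512"):
    # The digest length is fixed by the hash function, not by the input.
    lengths[name] = len(hmac.new(key, message, name).digest())

print(lengths)  # {'sha1': 20, 'sha256': 32, 'sha512': 64}
```

An observer who learns the digest length therefore learns a short list of candidate algorithms, but nothing about the key.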

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 12 years ago

    New changeset ddcc8ee680d7 by Charles-François Natali in branch 'default': Issue bpo-14532: Add a secure_compare() helper to the hmac module, to mitigate http://hg.python.org/cpython/rev/ddcc8ee680d7

    79528080-9d85-4d18-8a2a-8b1f07640dd7 commented 12 years ago

    Committed. Jon, thanks for your patch and your patience!

    ncoghlan commented 12 years ago

    A comment above the length check referring back to this issue and the deliberate decision to allow a timing attack to determine the length of the expected digest would be handy.

    I was just looking at hmac.secure_compare and my thought when reading the source and the docstring was "No, it's not time-independent, you can still use a timing attack to figure out the expected digest length".

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 12 years ago

    I recommend reverting the addition of this function. It's not possible to implement a time-independent comparison function, as demonstrated in issues 14955 and 15061.

    pitrou commented 12 years ago

    How is it "not possible"? The implementation may be buggy, but it's possible to write a C version that does the right thing.
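
For readers arriving later: released Python versions (3.3 and newer) provide `hmac.compare_digest` for this purpose, which to my knowledge is what the `secure_compare()` helper discussed here became. A minimal usage sketch follows; the key, the 20-byte message length, the `sha256` digest and the `verify` helper are assumptions made for illustration.

```python
import hmac
import os

authkey = b"secret key"    # placeholder key
message = os.urandom(20)   # challenge chosen by the verifying side
expected = hmac.new(authkey, message, "sha256").digest()


def verify(response: bytes) -> bool:
    """Check a peer's response against the expected digest."""
    # compare_digest accepts two bytes-like objects (or two ASCII strs).
    return hmac.compare_digest(response, expected)


print(verify(expected))       # True
print(verify(expected[:-1]))  # False
```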