python / cpython

The Python programming language
https://www.python.org
Other
62.49k stars 30k forks source link

[security] CVE-2021-29921: ipaddress Should not reject IPv4 addresses with leading zeroes as ambiguously octal #80565

Closed 8f609f27-1bf9-4a5d-8519-ba5ad7c0cb8f closed 3 years ago

8f609f27-1bf9-4a5d-8519-ba5ad7c0cb8f commented 5 years ago
BPO 36384
Nosy @ncoghlan, @vstinner, @ericvsmith, @tiran, @ned-deily, @ambv, @mgorny, @Julian, @serhiy-storchaka, @zooba, @pablogsal, @miss-islington, @TV4Fun, @gcbirzan, @achraf-mer
PRs
  • python/cpython#12577
  • python/cpython#25099
  • python/cpython#25815
  • python/cpython#27801
  • python/cpython#27824
  • python/cpython#27825
  • python/cpython#27826
  • python/cpython#27827
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = created_at = labels = ['type-security', 'deferred-blocker', '3.8', '3.9', '3.10', 'library'] title = '[security] CVE-2021-29921: ipaddress Should not reject IPv4 addresses with leading zeroes as ambiguously octal' updated_at = user = 'https://github.com/TV4Fun' ``` bugs.python.org fields: ```python activity = actor = 'lukasz.langa' assignee = 'none' closed = True closed_date = closer = 'pablogsal' components = ['Library (Lib)'] creation = creator = 'Joel Croteau' dependencies = [] files = [] hgrepos = [] issue_num = 36384 keywords = ['patch', '3.8regression', '3.9regression'] message_count = 40.0 messages = ['338514', '338694', '339204', '339205', '339206', '339210', '339211', '339572', '389826', '389847', '390046', '390128', '390331', '390340', '390353', '390361', '392423', '392572', '392684', '392692', '392696', '392697', '394162', '394316', '394322', '394327', '394379', '394380', '394390', '399784', '399798', '399801', '399803', '399804', '399805', '399809', '399893', '399895', '399898', '399899'] nosy_count = 18.0 nosy_names = ['ncoghlan', 'vstinner', 'eric.smith', 'christian.heimes', 'ned.deily', 'pmoody', 'docs@python', 'lukasz.langa', 'mgorny', 'Julian', 'python-dev', 'serhiy.storchaka', 'steve.dower', 'pablogsal', 'miss-islington', 'Joel Croteau', 'gc2', 'achraf-mer'] pr_nums = ['12577', '25099', '25815', '27801', '27824', '27825', '27826', '27827'] priority = 'deferred blocker' resolution = 'fixed' stage = 'resolved' status = 'closed' superseder = None type = 'security' url = 'https://bugs.python.org/issue36384' versions = ['Python 3.8', 'Python 3.9', 'Python 3.10'] ```

    8f609f27-1bf9-4a5d-8519-ba5ad7c0cb8f commented 5 years ago

    I understand to a certain extent the logic in not allowing IPv4 octets that might ambiguously be octal, but in practice, it just seems like it creates additional parsing hassle needlessly. I have never in many years of working on many networked systems seen anyone use dotted octal format, it is actually specifically forbidden by RFC 3986 (https://tools.ietf.org/html/rfc3986#section-7.4), and it means that the ipaddress module throws exceptions on many perfectly valid IP addresses just because they have leading zeroes. Since the module doesn't support dotted octal or dotted hex anyway, this check seems a little pointless. If nothing else, there should be a way to disable this check by specifying that your IPs are in fact dotted decimal, otherwise it seems like it's just making you have to do extra parsing work or just write your own implementation.

    ericvsmith commented 5 years ago

    I agree that this is not a useful check.

    ncoghlan commented 5 years ago

    New changeset e653d4d8e820a7a004ad399530af0135b45db27a by Nick Coghlan (Joel Croteau) in branch 'master': bpo-36384: Remove check for leading zeroes in IPv4 addresses (GH-12577) https://github.com/python/cpython/commit/e653d4d8e820a7a004ad399530af0135b45db27a

    ncoghlan commented 5 years ago

    I've merged the change for Python 3.8 (thanks Joel!).

    I'm not sure whether to classify it as an enhancement or as an interoperability bug fix, though, so I've put the status to pending and added Ned to the nosy list to get his thoughts as the Python 3.7 RM.

    ned-deily commented 5 years ago

    ipaddress is behaving as documented:

    "The following constitutes a valid IPv4 address:

    A string in decimal-dot notation, consisting of four decimal integers in the inclusive range 0–255, separated by dots (e.g. 192.168.0.1). Each integer represents an octet (byte) in the address. Leading zeroes are tolerated only for values less than 8 (as there is no ambiguity between the decimal and octal interpretations of such strings). [...]"

    https://docs.python.org/3/library/ipaddress.html

    I can sort of understand imposing that restriction in a Python 2 world where leading zeros implied octal and Python 3 outright rejects such forms of integers to avoid the ambiguity. That said, there's no particular reason why the components of an IPv4 string acceptable to ipaddress *have* to follow the same rules so I'm +0 on making the change at all. It's a bit of a stretch to consider it a bug when it appears to be behaving as documented but I would expect such a change to fix more problems than causing them so I'm OK if you want to backport it.

    But, in any case, the documentation for 3.8 and/or 3.7 needs to be updated.

    serhiy-storchaka commented 5 years ago

    See also the article "Ping and FTP Resolve IP Address with Leading Zero as Octal" (https://web.archive.org/web/20061206211851/http://support.microsoft.com/kb/115388).

    This is still true in Windows 10.

    So it is safer to reject IPv4 addresses with leading zeros that can be ambiguously interpreted. Otherwise this can create a security hole.

    ericvsmith commented 5 years ago

    I think it should be 3.8 only, and the docs should be updated. Apologies for not catching that earlier: I searched via Google, which was a mistake.

    ncoghlan commented 5 years ago

    The recommended handling in the article that Serhiy mentions is to strip the leading zeroes, which the ipaddress module will still do - it's only being made more tolerant on input. That means it will become usable as a prefilter step (pass string with potentially leading zeroes to ipaddress, get string with no leading zeroes out).

    So that means the one part we missed is the docs update (together with a versionchanged note in the module docs themselves)

    tiran commented 3 years ago

    Serhiy was right, this is a security issue.

    The patch should not have landed in 3.8. At a bare minimum the patch should have been postponed until documentation was updated. Since 3.8 the ipaddresss does not behave as documented. A similar security issue in NPM was published two days ago, CVE-2021-28918.

    I proposed to not only revert the change, but also tighten the check for leading zeros so it behaves like glibc's inet_pton(). It refuses any IPv4 string with a leading zero.

    >>> socket.inet_pton(socket.AF_INET, "01.1.1.1")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: illegal IP address string passed to inet_pton
    >>> socket.inet_pton(socket.AF_INET, "1.1.1.01")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: illegal IP address string passed to inet_pton
    vstinner commented 3 years ago

    The patch should not have landed in 3.8. At a bare minimum the patch should have been postponed until documentation was updated. Since 3.8 the ipaddresss does not behave as documented. A similar security issue in NPM was published two days ago, CVE-2021-28918.

    Link: https://sick.codes/sick-2021-011

    ambv commented 3 years ago

    Deferred the blocker to the next regular release due to lack of activity in time for the current expedited releases.

    zooba commented 3 years ago

    (Copied from my comment on the PR, following the one where I said this was ready to go.)

    Withdrawing the readiness - @ambv and I would prefer to see this behind a flag (probably "strict" parsing), on by default for 3.10, and maybe on by default for 3.9/earlier.

    The main reasoning being that this isn't our vulnerability, but an inconsistency with other vulnerable libraries. The current fix is the best it can be, but it doesn't prevent the vulnerability, it just causes Python to break first. So it ought to be relatively easy to retain the flexible (though admittedly non-sensical) behaviour for those who currently rely on it.

    vstinner commented 3 years ago

    Withdrawing the readiness - @ambv and I would prefer to see this behind a flag (probably "strict" parsing), on by default for 3.10, and maybe on by default for 3.9/earlier.

    Last time we added a new parameter in a stable branch, it didn't go well: https://bugs.python.org/issue42967#msg387638

    zooba commented 3 years ago

    The important quote from the linked issue seems to be:

    Our new separator= parameter does not allow one to achieve the previous behavior if mixing and matching & And ; was intended to be allowed, as it is a single separator rather than a set of separators.

    So arguably, we added _the wrong_ parameter in that case, because it only allowed choosing between behaviours not including the "bad" behaviour. We should've added one that was "give me back the previous behaviour".

    In this case, having it off by default goes further to prevent breakage, and I wouldn't be opposed to a process level opt-in (e.g. a module-level flag), so that _applications_ have a way to force their dependencies to use the safer behaviour without needing to patch them. Similarly, a process level opt-out also seems good enough if we were to have it on by default.

    vstinner commented 3 years ago

    In this case, having it off by default goes further to prevent breakage

    PyYAML was unsafe by default: it allowed to execute arbitary Python code by default. It took years to change the default to "safe". I don't think that adding a parameter for opt-in for security is a good approach. An application can use ipaddress internally without being aware of using it, if it's done by a third party module. It's hard to prevent security vulnerabilities if people have to "opt-in" for security.

    I prefer to break code and force people to manually get back the old behavior. It's better to make 90% safe by default but make 10% of people unhappy.

    It's uncommon to pass IPv4 addresses with leading zeros.

    If you want to tolerate leading zeros, you don't have to modify the ipaddress for that, you can pre-process your inputs: it works on any Python version with or without the fix.

    >>> def reformat_ip(address): return '.'.join(part.lstrip('0') if part != '0' else part for part in address.split('.'))
    ... 
    >>> reformat_ip('0127.0.0.1')
    '127.0.0.1'

    Or with an explicit loop for readability:

    def reformat_ip(address):
        parts = []
        for part in address.split('.'):
            if part != "0":
                part = part.lstrip('0')
            parts.append(part)
        return '.'.join(parts)
    zooba commented 3 years ago

    I don't think that adding a parameter for opt-in for security is a good approach.

    I meant to have it set by default on 3.10, when we do not have to worry about breaking users.

    If it takes years for users to get to 3.10, we should reevaluate our release cycle, not whether we aggressively break maintenance releases.

    vstinner commented 3 years ago

    The CVE-2021-29921 was assigned to this vulnerability.

    c139cf46-fb7c-4c62-96da-220de2afbec1 commented 3 years ago

    If it takes years for users to get to 3.10, we should reevaluate our release cycle, not whether we aggressively break maintenance releases.

    I don't really understand how that would help. The problem is that users have major inertia for switching to newer Python versions. A part of it is that a lot of people just don't care about deprecation warnings, and don't fix stuff until it's actually broken. In the end, your projects are blocked from using new major Python version by broken dependencies with long release cycles.

    I can't imagine deliberately leaving 3.8 and 3.9 vulnerable when 3.10 isn't going to reach final release in the next half year. Gentoo stable is only switching to 3.9 next month. I'm pretty sure some of our (few) corporate users are still on 3.7 or earlier. Then, there are projects that literally include a vulnerable copy of Python 2.7 to get around distributions removing it.

    I dare say this has less breakage potential than the &/; change. It should be fixed on all affected versions. If you don't do that, distributions will have to patch it anyway, and this will only lead to incompatibility between different Python package vendors.

    ambv commented 3 years ago

    Due to the relative obscurity of the bug and potential disruption of the fix, I decided not to include it in 3.8.

    However, Michał's argument about 3.10 not being released for another five months is resonating with me and so we will be backporting the change to 3.9.5, to be released tomorrow. Victor's argument about opt-ins being a bad way to fix security also makes sense, although let me point out that we've made decisions the other way in the past as well, for instance with hash randomization.

    In any case, the issue will be solved in Python 3.10.0 Beta 1 and Python 3.9.5. Having the fixed behavior "in 3.9.5 and newer" makes for easy mechanical checks whether a given version is affected.

    ambv commented 3 years ago

    New changeset 60ce8f0be6354ad565393ab449d8de5d713f35bc by Christian Heimes in branch 'master': bpo-36384: Leading zeros in IPv4 addresses are no longer tolerated (GH-25099) https://github.com/python/cpython/commit/60ce8f0be6354ad565393ab449d8de5d713f35bc

    ambv commented 3 years ago

    New changeset 5374fbc31446364bf5f12e5ab88c5493c35eaf04 by Miss Islington (bot) in branch '3.9': bpo-36384: Leading zeros in IPv4 addresses are no longer tolerated (GH-25099) (GH-25815) https://github.com/python/cpython/commit/5374fbc31446364bf5f12e5ab88c5493c35eaf04

    tiran commented 3 years ago

    Łukasz, thanks for pushing the PR over the finish line!

    ned-deily commented 3 years ago

    Is there anything more to be done for this issue or can it be closed?

    pablogsal commented 3 years ago

    I'm closing this, if someone thinks something is missing, please, reopen

    vstinner commented 3 years ago

    I created https://python-security.readthedocs.io/vuln/ipaddress-ipv4-leading-zeros.html to track this vulnerability.

    Python 3.8 is left unchanged (accept leading zeros). Python 3.7 and older are not affected.

    eaf08195-02b1-4629-aae0-97d158c7f815 commented 3 years ago

    The timeline there is wrong. This issue's creation time isn't the disclosure time, it's when the bug was introduced. The disclosure was on 30th of May, when I emailed security@python.org and Christian Heimes commented here and made https://github.com/python/cpython/pull/25099. Even though Serhiy Storchaka commented that this could be a security issue back when the issue was new, the date would be 30th of March 2019, not 20th.

    vstinner commented 3 years ago

    George-Cristian Bîrzan: "The timeline there is wrong."

    Fixed: https://python-security.readthedocs.io/vuln/ipaddress-ipv4-leading-zeros.html#timeline

    The strange part is "2019-03-20 (-741 days): Python issue bpo-36384 reported by Joel Croteau".

    The problem is that this issue was "reused" for two different things: the initial change and the vulnerability.

    Maybe I can removed the reference to the bpo to remove it from the timeline (and put it in links).

    eaf08195-02b1-4629-aae0-97d158c7f815 commented 3 years ago

    I think the only thing I'd improve would be to mention that this issue is the one that introduced the bug, otherwise it looks a bit weird.

    vstinner commented 3 years ago

    I think the only thing I'd improve would be to mention that this issue is the one that introduced the bug, otherwise it looks a bit weird.

    Ok, done: https://python-security.readthedocs.io/vuln/ipaddress-ipv4-leading-zeros.html#timeline

    b091a86f-23d6-494b-bd32-92bbf7528181 commented 3 years ago

    Can we backport the security fix from this issue https://bugs.python.org/issue36384#msg392684 to version 3.8 The comment explicitly says that it was decided to not include in 3.8, I am not sure this is best, since it prevents using 3.8 because of this open vulnerability, and it does not seem to be a breaking change or too hard to port.

    ambv commented 3 years ago

    it prevents using 3.8 because of this open vulnerability

    What do you mean by this?

    Our understanding is that this is a low-severity CVE because in order for this to be a vulnerability, you'd have to have both:

    1. user access to IP address input; and
    2. control over two addresses sharing numerical representation with leading zeroes: the first resolving when leading zeroes are treated as octal numbers; the second resolving when leading zeroes are treated as decimal numbers.

    Access to both then allows you at best to circumvent IP address-based access control or denial of service. However, access to just 1. allows you to input any IP address to achieve the same goals.

    Hence low-severity.

    it does not seem to be a breaking change

    It is a bona fide breaking change. Any IP address configuration saved in files or databases which might have used leading zeroes would be rejected by 3.8.12. The same was true for 3.9.5 but since this release series has much higher exposure (still receiving binary installers and regular-cadence bugfixes), it was less controversial to include it.

    If you still feel this ought to be fixed in 3.8, please elaborate.

    b091a86f-23d6-494b-bd32-92bbf7528181 commented 3 years ago

    > it prevents using 3.8 because of this open vulnerability

    What do you mean by this?

    Our understanding is that this is a low-severity CVE because in order for this to be a vulnerability, you'd have to have both:

    1. user access to IP address input; and
    2. control over two addresses sharing numerical representation with leading zeroes: the first resolving when leading zeroes are treated as octal numbers; the second resolving when leading zeroes are treated as decimal numbers.

    Access to both then allows you at best to circumvent IP address-based access control or denial of service. However, access to just 1. allows you to input any IP address to achieve the same goals.

    Hence low-severity.

    Even though I agree with you assessment on the root cause of the issue itself, it is listed as critical in https://nvd.nist.gov/vuln/detail/CVE-2021-29921, which means most commercial scan tools will also flag python 3.8 as critical, and this could prevent users from going with python 3.8 on production. (our case too)

    > it does not seem to be a breaking change

    It is a bona fide breaking change. Any IP address configuration saved in files or databases which might have used leading zeroes would be rejected by 3.8.12. The same was true for 3.9.5 but since this release series has much higher exposure (still receiving binary installers and regular-cadence bugfixes), it was less controversial to include it.

    If you still feel this ought to be fixed in 3.8, please elaborate.

    IMHO I still think this should be solved in 3.8, otherwise there is really no other alternative but to upgrade to python 3.9 which is a hassle, since all 3.8.x are "critically vulnerable", had the CVE in https://nvd.nist.gov/vuln/detail/CVE-2021-29921 not been marked as critical, then we could have used python 3.8 knowing the two conditions you mentioned earlier.

    ambv commented 3 years ago

    I was unaware of the "CRITICAL" base score assigned by NIST to this. Alright, let's port this back then. There are a few things the PR will need.

    tiran commented 3 years ago

    "CRITICAL" is a ridiculous high assessment for this bug. Somebody ticked all the scary boxes in the CVSS form like "total loss of control".

    tiran commented 3 years ago

    The CVE was rated https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H&version=3.1, which is equivalent to a RCE with authentication bypass.

    I would rate the issue https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:N/AC:H/PR:N/UI:N/S:U/C:L/I:L/A:N&version=3.1, maybe A:L.

    ambv commented 3 years ago

    New changeset 03dd89d62413c4a92831ed1b36e2ae8983bcb2d4 by achraf-mer in branch '3.8': [3.8] bpo-36384: Leading zeros in IPv4 addresses are no longer tolerated (GH-25099) (GH-27801) https://github.com/python/cpython/commit/03dd89d62413c4a92831ed1b36e2ae8983bcb2d4

    ambv commented 3 years ago

    New changeset 6ebfe8da6331bfcf54057f6e22a6f353a5621d35 by Łukasz Langa in branch '3.8': [3.8] bpo-36384: [doc] Correct typos in CVE-2021-29921 fix description (GH-27825) https://github.com/python/cpython/commit/6ebfe8da6331bfcf54057f6e22a6f353a5621d35

    ambv commented 3 years ago

    New changeset 0fd66e46b2f472d0d206a185dc8892f4f0347cb6 by Łukasz Langa in branch 'main': bpo-36384: [doc] Mention CVE-2021-29921 fix in 3.8.12 (GH-27824) https://github.com/python/cpython/commit/0fd66e46b2f472d0d206a185dc8892f4f0347cb6

    miss-islington commented 3 years ago

    New changeset 1204dfc89cb3ed5e21dce32aed0339b7569fe1f9 by Miss Islington (bot) in branch '3.10': bpo-36384: [doc] Mention CVE-2021-29921 fix in 3.8.12 (GH-27824) https://github.com/python/cpython/commit/1204dfc89cb3ed5e21dce32aed0339b7569fe1f9

    ambv commented 3 years ago

    New changeset 98820250a3c9c131d3c2d57c4fc5260aebd8aa1d by Miss Islington (bot) in branch '3.9': bpo-36384: [doc] Mention CVE-2021-29921 fix in 3.8.12 (GH-27824) (GH-27827) https://github.com/python/cpython/commit/98820250a3c9c131d3c2d57c4fc5260aebd8aa1d