
DoS Vulnerability in socket.create_connection through malicious DNS responses #106283

Open NyanKiyoshi opened 1 year ago

NyanKiyoshi commented 1 year ago

Bug report

This issue was initially reported to Python Security Team, and they decided to make the issue public.

The socket.create_connection(address, timeout[, ...]) function can be abused to carry out DoS attacks if the address hostname resolves to many un-routable IP addresses.

The issue comes from the timeout parameter being applied in full to each IP address the function tries to connect to, rather than to the call as a whole. As a result, the function can fail to respect the provided timeout value (CPython code).

For example, if a hostname resolves to 20 un-routable IPs, a call with timeout=2 takes 40 seconds instead of two.

This was introduced in Python 2.6, and affects all versions released since then.

PoC

Attached is a minimal reproducible example with extra logging to help understand what Python is doing.

In the example, getaddrinfo() is monkey-patched to return IP_COUNT copies of a single un-routable address (10.0.0.0), so no real DNS server is needed:

Python code:

```python
import socket, sys, logging

# The number of IP addresses to return when resolving the hostname.
IP_COUNT = 10

# Extra logging.
def audit_hook(event, args):
    if event == "socket.connect":
        logging.warning(f"{event}: {args[-1]}")

logging.basicConfig(format="%(asctime)s %(levelname)s: %(message)s")
sys.addaudithook(audit_hook)

# Fake DNS resolving to make it easier to reproduce.
def getaddrinfo(*_args, **_kwargs):
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("10.0.0.0", 80))] * IP_COUNT

socket.getaddrinfo = getaddrinfo

# Vulnerable code:
socket.create_connection(("dos.test", 80), timeout=2)
```

This results in Python trying each of the 10 IP addresses with a connect timeout of two seconds each. Instead of socket.create_connection() taking only two seconds to give up, it takes 20 seconds, as shown below:

Output (Python 3.10.6):

```
# time python3 poc.py
2023-06-26 08:40:20,971 WARNING: socket.connect: ('10.0.0.0', 80)
2023-06-26 08:40:22,976 WARNING: socket.connect: ('10.0.0.0', 80)
2023-06-26 08:40:24,980 WARNING: socket.connect: ('10.0.0.0', 80)
2023-06-26 08:40:26,986 WARNING: socket.connect: ('10.0.0.0', 80)
2023-06-26 08:40:28,990 WARNING: socket.connect: ('10.0.0.0', 80)
2023-06-26 08:40:30,993 WARNING: socket.connect: ('10.0.0.0', 80)
2023-06-26 08:40:32,996 WARNING: socket.connect: ('10.0.0.0', 80)
2023-06-26 08:40:35,001 WARNING: socket.connect: ('10.0.0.0', 80)
2023-06-26 08:40:37,004 WARNING: socket.connect: ('10.0.0.0', 80)
2023-06-26 08:40:39,011 WARNING: socket.connect: ('10.0.0.0', 80)
Traceback (most recent call last):
  File "//poc.py", line 18, in <module>
    socket.create_connection(("dos.test", 80), timeout=2)
  File "/usr/lib/python3.10/socket.py", line 845, in create_connection
    raise err
  File "/usr/lib/python3.10/socket.py", line 833, in create_connection
    sock.connect(sa)
TimeoutError: timed out

real    0m20.082s
user    0m0.034s
sys     0m0.011s
```

This was successfully tested against 500 IP records on a real production DNS server; with the two-second timeout from the example above, it takes about 16 minutes (500 × 2 s) to give up instead of two seconds.

This number of records could be set much higher.

If needed, a "real" DNS server can be used through another PoC, which uses bind9: https://github.com/NyanKiyoshi/poc-cpython-gh-106283#usage.

Affected

Python >= 2.6 (including all 3.x releases)

Impact: any component that uses socket.create_connection() with arbitrary hostnames could hang "forever" (urllib, smtplib, etc.).
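To illustrate, assuming the same dos.test resolver setup as in the PoC above, a higher-level call like the following inherits the per-address behavior, since http.client (used by urllib) connects through socket.create_connection():

```python
import urllib.request

# The timeout below is again applied per resolved address, not to the
# request as a whole, so with N un-routable addresses this call stalls
# for roughly N * 2 seconds before raising URLError.
urllib.request.urlopen("http://dos.test/", timeout=2)
```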

Proposed Fix

Go is a good example of how the issue could be resolved: following RFC 8305, it shares the connect timeout equally among all the IP addresses it has to try.

Key points (see the net.sysDialer.dialSerial and net.partialDeadline source code): the time remaining until the overall deadline is divided evenly among the addresses left to try, with a floor of 2 seconds per address so that a long list cannot reduce each attempt to nothing, and the whole call still gives up once the deadline passes.
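A minimal Python sketch of that spreading logic (create_connection_spread and partial_timeout are hypothetical names; the 2-second floor mirrors Go's choice):

```python
import socket
import time

MIN_PER_ADDRESS = 2.0  # seconds; assumption borrowed from Go's per-host minimum

def partial_timeout(deadline, addrs_remaining):
    # Divide the remaining time evenly among the addresses left to try,
    # but never go below the per-address floor (unless less than the
    # floor remains overall, in which case use what is left).
    remaining = deadline - time.monotonic()
    if remaining <= 0:
        raise TimeoutError("timed out")
    return max(remaining / addrs_remaining, min(MIN_PER_ADDRESS, remaining))

def create_connection_spread(address, timeout):
    host, port = address
    infos = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
    deadline = time.monotonic() + timeout
    err = None
    for i, (family, kind, proto, _canonname, sockaddr) in enumerate(infos):
        try:
            per_addr = partial_timeout(deadline, len(infos) - i)
        except TimeoutError as exc:
            raise err or exc  # overall deadline exhausted
        sock = socket.socket(family, kind, proto)
        sock.settimeout(per_addr)
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            err = exc
            sock.close()
    raise err if err else OSError("getaddrinfo returned no addresses")
```

With 10 un-routable addresses and timeout=2, the first attempt consumes the whole two seconds and the call then gives up, instead of multiplying the timeout by the address count.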

For example, when giving Go v1.20.5 a timeout of 10 seconds, the following behavior occurs:

```
$ ./go-example --timeout 10s
17:20:26.445149 Found 256 IPs to try out
17:20:26.445282 Attempting to connect to 10.0.2.21:80...
17:20:28.446960 Attempting to connect to 10.0.2.25:80...
17:20:30.449034 Attempting to connect to 10.0.2.67:80...
17:20:32.450936 Attempting to connect to 10.0.2.141:80...
17:20:34.452858 Attempting to connect to 10.0.2.13:80...
17:20:36.443316 Failed to get URL (in 10.004180651s): dial tcp 10.0.2.230:80: i/o timeout <- it exited within 10s as we requested
```

Reproducible via https://github.com/NyanKiyoshi/poc-cpython-gh-106283#example-non-vulnerable-implementation.

Linked PRs

gpshead commented 1 year ago

(pasting an older reply from the PSRT list):

The main thing I think we need to do from a CPython perspective:

  1. Document that people should always specify a timeout. That's an individual application-level mitigation.
  2. Bugfix: make sure our timeout gets spread across the entire returned list of hosts (with a minimum per-host timeout to avoid making that its own DoS; Golang chose a 2 sec/host minimum).
  3. Consider whether there is a "reasonable limit" on the number of addresses returned from a gethostbyname lookup beyond which we stop trying (see the sketch after this comment). 500 addresses is clearly on the silly side of huge.

Picking a default timeout value when unspecified would make sense only if this were deemed a serious issue. That could break existing code. A slow network DoS usually isn't that level of serious.

Even the bug fix to spread our timeout across the entire list could be disruptive to existing code if it assumes the existing behavior and expects it to be per-host. Which is why back-porting that kind of API change to old releases may be questionable for us.
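On point 3, such a cap would amount to a small guard when consuming getaddrinfo() results; MAX_ADDRS below is an illustrative value, not anything CPython defines:

```python
import socket

MAX_ADDRS = 25  # hypothetical "reasonable limit"; not a CPython constant

def resolve_capped(host, port):
    # Ignore everything past the cap, so a malicious DNS answer cannot
    # multiply the per-address connect timeout without bound.
    return socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)[:MAX_ADDRS]
```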

NyanKiyoshi commented 1 year ago

> Consider if there is a definition of "reasonable limit" for the number of addresses returned from a gethostbyname. 500 addresses is clearly on the silly side of huge.

To add to this point: I have been looking into whether there is a limit on the number of A records a DNS query can return. The maximum number of IPs I was able to return was 4095 A records (over TCP, using the message-compression feature from RFC 1035).

This would be a big problem for timeout=None (the default value), as it is then up to the system to decide when to give up on each IP address (60 s/IP when testing on Ubuntu 22.04; 4095 × 60 s ≈ 245,700 s, so the process would hang for about 68 hours).

> Picking a default timeout value when unspecified would make sense only if this were deemed a serious issue. That could break existing code. A slow network DoS usually isn't that level of serious.

We could also consider adding sys.set_default_connect_timeout() to allow changing the None default to whatever the application deems acceptable (similar in spirit to CVE-2020-10735, which added sys.set_int_max_str_digits()). This would give a safeguard to applications that accept arbitrary domains but may forget to set a non-null timeout.
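As a safeguard that exists today, an application can set a process-wide default with socket.setdefaulttimeout(), which sockets created by create_connection() inherit when no explicit timeout is passed:

```python
import socket

# Sockets created without an explicit timeout (including through
# socket.create_connection()) now inherit this default instead of None,
# so a forgotten timeout= no longer means "block indefinitely".
socket.setdefaulttimeout(10)

# Note: this does not fix the per-address multiplication described in
# this issue; it only bounds each individual connect attempt.
```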

serhiy-storchaka commented 8 months ago

A crazier idea: connect to several addresses concurrently and keep the first connection that succeeds.

We would still need to split the addresses into batches of limited size, because the number of open file descriptors is limited and because we do not want to make the Python application the source of a DDoS attack, but this would improve responsiveness in more common cases: it lets you get a result faster when the first candidate is not available, and succeed on a slow network with a smaller timeout.
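A rough sketch of that batched-concurrent idea (connect_batched and BATCH_SIZE are hypothetical; a real implementation would more likely use non-blocking sockets and selectors than threads):

```python
import concurrent.futures
import socket

BATCH_SIZE = 4  # assumed cap: bounds open fds and outgoing connection load

def _try_connect(addrinfo, timeout):
    family, kind, proto, _canonname, sockaddr = addrinfo
    sock = socket.socket(family, kind, proto)
    sock.settimeout(timeout)
    try:
        sock.connect(sockaddr)
        return sock
    except OSError:
        sock.close()
        return None

def connect_batched(host, port, timeout):
    infos = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
    for i in range(0, len(infos), BATCH_SIZE):
        batch = infos[i:i + BATCH_SIZE]
        with concurrent.futures.ThreadPoolExecutor(len(batch)) as pool:
            futures = [pool.submit(_try_connect, info, timeout) for info in batch]
            winner = None
            # This simple version waits for the whole batch (bounded by
            # the per-attempt timeout) before moving on.
            for fut in concurrent.futures.as_completed(futures):
                sock = fut.result()
                if sock is None:
                    continue
                if winner is None:
                    winner = sock
                else:
                    sock.close()  # a batch can yield several winners; keep one
        if winner is not None:
            return winner
    raise TimeoutError("no address in any batch accepted a connection")
```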

gpshead commented 8 months ago

A "try to create a few connections at once" concept reminds me of the Happy Eyeballs algorithm (https://en.wikipedia.org/wiki/Happy_Eyeballs), which was basically put in place for user-facing, latency-sensitive areas to pick the connection that works without waiting for a timeout; specifically, to pick between IPv6 and IPv4 so that v6 can be preferred when it appears to be working, without consequences.
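For reference, asyncio has exposed an opt-in knob for this since Python 3.8; a minimal usage sketch:

```python
import asyncio

# asyncio implements RFC 8305 staggered attempts when asked to: a new
# connection attempt starts every 0.25 s while earlier ones are pending,
# and the first to succeed wins.
async def fetch():
    reader, writer = await asyncio.open_connection(
        "example.com", 80, happy_eyeballs_delay=0.25
    )
    writer.close()
    await writer.wait_closed()

asyncio.run(fetch())
```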

I'm generally wary of doing that kind of thing automatically in a low-level socket API by default. It is common for services to list multiple addresses in DNS, but that doesn't mean they want the load of multiple short-lived connection attempts from everybody who is trying to use them.

I'm not aware of people doing this for addresses of the same type (though I'm sure someone does). It is generally assumed that all addresses of one protocol are routable, if any are, and that they all go to high-availability frontends, so trying multiple at once would be optimizing for what is, for most services, the least likely situation: the first address a client chose not responding.