pyca / pyopenssl

A Python wrapper around the OpenSSL library
https://pyopenssl.org/
Apache License 2.0

Setting socket timeouts breaks handshake #168

Open viraptor opened 9 years ago

viraptor commented 9 years ago

I'm seeing a weird issue when starting a TLS connection to any host. If I don't set a timeout on the socket, it works fine. If I do, it breaks at the handshake with an OpenSSL.SSL.WantReadError. For example, if I set the timeout to 100, it still breaks after about a second.

For now I use a workaround of setting a timeout on connection, but then removing it before the handshake.

import socket
import OpenSSL

# example target; any TLS host shows the same behavior
hostname, ip, port = "github.com", "192.30.252.128", 443

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(2)

ctx = OpenSSL.SSL.Context(OpenSSL.SSL.TLSv1_METHOD)
ctx.set_options(OpenSSL.SSL.OP_NO_SSLv2 | OpenSSL.SSL.OP_NO_SSLv3)
ctx.set_verify(OpenSSL.SSL.VERIFY_NONE, lambda _a, _b, _c, _d, _e: None)
conn = OpenSSL.SSL.Connection(ctx, s)
conn.set_tlsext_host_name(hostname.encode('utf-8'))
conn.connect((ip, port))

# s.settimeout(None)  # the workaround

try:
    conn.do_handshake()
except OpenSSL.SSL.WantReadError:
    # this happens on every connection
    raise

I'm running on Python 3.4.1, OpenSSL 0.14.

exarkun commented 9 years ago

Can you produce a self-contained example of this behavior? My simple attempts produce this contrary result:

$ python whatwhat.py
Traceback (most recent call last):
  File "whatwhat.py", line 15, in <module>
    conn.connect((ip, port))
  File "/tmp/it/local/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1104, in connect
    return self._socket.connect(addr)
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.timeout: timed out

viraptor commented 9 years ago

import socket
import OpenSSL
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(100)
ctx = OpenSSL.SSL.Context(OpenSSL.SSL.TLSv1_METHOD)
conn = OpenSSL.SSL.Connection(ctx, s)
conn.connect(("192.30.252.128", 443))
try:
    conn.do_handshake()
except OpenSSL.SSL.WantReadError:
    print("badness")

This prints "badness" for me. I used the IP of "github.com". After commenting out the s.settimeout(100) line, it works.

Runtime takes 1 second, not 100.

$ time python3 tst.py
badness
python3 tst.py  1.01s user 0.01s system 78% cpu 1.293 total

exarkun commented 9 years ago

Thanks.

From this, I see that the problem is the mismatch between expectations for do_handshake vs the native Python socket APIs.

Python's socket.socket.connect method respects the Python-level socket timeout. This is easy for it because they're both native Python socket features (timeouts and connection setup).

OpenSSL.SSL.Connection.do_handshake is not a native Python socket operation though. It is a call into OpenSSL's SSL_do_handshake API which operates on the actual (platform-level) socket directly. Python's timeout support puts that socket into non-blocking mode. OpenSSL's SSL_do_handshake encounters this and does the standard OpenSSL-level thing - translate the "EWOULDBLOCK" read error into an OpenSSL WantReadError (that's pyOpenSSL's spelling of the error but that's easier to talk about here). pyOpenSSL raises this exception up to the caller of do_handshake.
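The switch to non-blocking mode described here can be observed from Python. The following is a minimal sketch, assuming a POSIX platform where the `fcntl` module is available and CPython's socket implementation; the helper name `os_level_nonblocking` is made up for illustration.

```python
import fcntl
import os
import socket


def os_level_nonblocking(sock):
    """Return True if the underlying file descriptor has O_NONBLOCK set."""
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFL)
    return bool(flags & os.O_NONBLOCK)


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
before = os_level_nonblocking(s)         # fresh socket, no timeout
s.settimeout(100)
with_timeout = os_level_nonblocking(s)   # after setting a Python-level timeout
s.settimeout(None)
restored = os_level_nonblocking(s)       # after removing the timeout again
s.close()

print(before, with_timeout, restored)
```

Any library that reads from the file descriptor directly, as OpenSSL does, sees the non-blocking descriptor and not the Python-level timeout value.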

The only idea I have for fixing this is to teach pyOpenSSL about Python's socket timeout feature: at every point in the API where there is an OpenSSL operation that operates directly on a platform-level socket, introduce the same kind of wait-and-retry logic that Python's own socket library has (which implements the timeout feature).

This will introduce a lot of complexity into pyOpenSSL. That's not a sufficient reason to say it would be a bad thing to add this functionality to pyOpenSSL but it does hint that it might not be a good idea. The contrary argument might be that any application that wants to use timeout-enabled sockets (or more generally, non-blocking sockets) with pyOpenSSL will need to implement this logic (and indeed they have, that's why pyOpenSSL exposes these error conditions as exceptions in the first place).

The general shape of this solution is probably to make all methods that invoke an OpenSSL API that interacts with a socket (as opposed to methods that interact with the Python-level socket, eg OpenSSL.SSL.Connection.connect) have some code kind of like this:

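    # assumes: from select import select; from time import time
    timeout = self._socket.gettimeout()
    if timeout is not None:
        start = time()
    while True:
        try:
            return <some OpenSSL API>
        except (WantReadError, WantWriteError):
            # No timeout configured, or the deadline has already passed:
            # let the caller see the original exception.
            if timeout is None or start + timeout <= time():
                raise
            remaining = max(0, timeout - (time() - start))
            readable, writable, _ = select([self._socket], [self._socket], [], remaining)
            if not readable and not writable:
                raise <something - the original exception?  a specific timeout exception?>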
This could probably be encapsulated into a helper function. It would probably also be beneficial to inspect the implementation of timeouts in Python's sockets to see if there are any other behaviors worth emulating or non-obvious implementation concerns worth dealing with here (to provide the least surprising behavior).
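One possible shape for such a helper is sketched below. It is not part of pyOpenSSL; the name `retry_with_deadline` and its parameters are hypothetical, and the "want read" / "want write" exception types are passed in so the sketch does not depend on pyOpenSSL being installed. It waits only for the direction the failed call asked for and raises `socket.timeout` once the overall deadline has passed.

```python
import select
import socket
import time


def retry_with_deadline(operation, sock, timeout, want_read, want_write):
    """Call operation() until it succeeds or `timeout` seconds have elapsed.

    want_read / want_write are the exception types that mean "retry once the
    socket is readable / writable" (hypothetical parameters; with pyOpenSSL
    they would be OpenSSL.SSL.WantReadError and OpenSSL.SSL.WantWriteError).
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            return operation()
        except want_read:
            rlist, wlist = [sock], []
        except want_write:
            rlist, wlist = [], [sock]
        if deadline is None:
            remaining = None  # wait indefinitely, like a blocking socket
        else:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise socket.timeout("timed out")
        readable, writable, _ = select.select(rlist, wlist, [], remaining)
        if not readable and not writable:
            raise socket.timeout("timed out")
```

With pyOpenSSL, a call might look like `retry_with_deadline(conn.do_handshake, sock, 5, OpenSSL.SSL.WantReadError, OpenSSL.SSL.WantWriteError)`, with the socket already in non-blocking mode. Using a single deadline for the whole operation is a design choice; resetting the timer on every retry would be an equally plausible reading of the timeout semantics.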

andresriancho commented 9 years ago

While this gets implemented is there any workaround to get timeouts to work?

webratz commented 7 years ago

Has this ever been fixed? Or is there a proper workaround?

exarkun commented 7 years ago

This hasn't been fixed, so far as I am aware. Anyone who is using pyOpenSSL via Twisted gets properly working timeouts - and that's the only way I use pyOpenSSL. So, that's one possible work-around.

webratz commented 7 years ago

I use it directly only to determine information about the used certificate so using twisted is a bit of overkill

moospit commented 7 years ago

Same here. I just need to get certificate information. Using additional libraries would be overkill.

tiran commented 7 years ago

Python implements timeout on top of standard socket IO with select() or poll(). In order to implement timeout on top of OpenSSL, you have to re-implement Python's implementation with select(), WantReadError and WantWriteError like explained in https://github.com/pyca/pyopenssl/issues/168#issuecomment-61813592 .

tiran commented 7 years ago

The timeout magic happens in sock_call_ex and internal_select https://github.com/python/cpython/blob/master/Modules/socketmodule.c#L765
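As a rough illustration of that idea (not a transcription of the C code linked above), a timeout can be layered on a non-blocking plain socket with `select()`. The function name `recv_with_timeout` is made up for this sketch.

```python
import select
import socket
import time


def recv_with_timeout(sock, nbytes, timeout):
    """Receive from a non-blocking socket, waiting at most `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise socket.timeout("timed out")
            # Sleep in select() until data arrives or the deadline passes.
            select.select([sock], [], [], remaining)
```

The same loop applies to pyOpenSSL once `BlockingIOError` is replaced by `WantReadError` / `WantWriteError`.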

brandond commented 7 years ago

@moospit @webratz Here's what works for me:

import socket
from OpenSSL import SSL

def print_chain(context, hostname):
    print('Connecting to {0}'.format(hostname))
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock = SSL.Connection(context=context, socket=sock)
    sock.settimeout(5)   # applies to the TCP connect only
    sock.connect((hostname, 443))
    sock.setblocking(1)  # back to blocking mode before calling into OpenSSL
    sock.do_handshake()
    for cert in sock.get_peer_cert_chain():
        print('   s:{0}'.format(cert.get_subject()))
        print('   i:{0}'.format(cert.get_issuer()))
    sock.shutdown()
    sock.close()

You can put a timeout on the connect and it will work as desired, you just have to put the socket back into blocking mode before calling into OpenSSL. Of course this just gets you a timeout on the TCP connection; if things stall during the SSL handshake you're still going to be left hanging but it's better than nothing.
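The reason the handshake has no timeout in this workaround is that putting a socket back into blocking mode also discards its timeout. A small sketch with a plain socket, no network access needed:

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(5)
timeout_before = s.gettimeout()  # the value just set

s.setblocking(True)              # equivalent to s.settimeout(None)
timeout_after = s.gettimeout()   # the earlier timeout is gone

s.close()
print(timeout_before, timeout_after)
```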

webratz commented 7 years ago

Yeah, in my case I sometimes connect to weird servers that happily open the TCP connection but then do not respond properly on the SSL layer. My workaround was using the timeout-decorator from pypi

hmahadik commented 5 years ago

I was able to work around this by doing a select before calling do_handshake:

readable, writable, errored = select.select([self._sock], [], [], 10)
if self._sock in readable:
    module_logger.debug("socket in readable")
self._sock.do_handshake()

setblocking(1) didn't work for me so I gave select a shot and it does work.

Achelics commented 5 years ago

This hasn't been fixed, as far as I am aware. setblocking(1) didn't work for me! When I don't set a socket timeout, do_handshake blocks; when I do set one, do_handshake raises OpenSSL.SSL.WantReadError.

earonesty commented 4 years ago

Note: In order to conform to the documentation on socket timeouts, they must be reset each time a read is made. So there's no need for a "very complex" solution, just a moderately complex one:

    timeout = self._socket.gettimeout()
    while True:
        try:
            return <some OpenSSL API>
        except (WantReadError, WantWriteError):
            readable, writable, _ = select([self._socket], [self._socket], [], timeout)
            if not readable and not writable:
                raise <timeout exception>

vincentrussell commented 4 years ago

I found this code here which worked great for me: https://gemfury.com/hemamaps/python:urllib3/-/content/contrib/pyopenssl.py

import select
import ssl
from socket import timeout

import OpenSSL
import six

cnx = OpenSSL.SSL.Connection(ctx, sock)
if isinstance(server_hostname, six.text_type):  # Platform-specific: Python 3
    server_hostname = server_hostname.encode('utf-8')
cnx.set_tlsext_host_name(server_hostname)
cnx.set_connect_state()
while True:
    try:
        cnx.do_handshake()
    except OpenSSL.SSL.WantReadError:
        # wait until the socket is readable, or give up after its timeout
        rd, _, _ = select.select([sock], [], [], sock.gettimeout())
        if not rd:
            raise timeout('select timed out')
        continue
    except OpenSSL.SSL.Error as e:
        raise ssl.SSLError('bad handshake: %r' % e)
    break

milahu commented 4 months ago

> My workaround was using the timeout-decorator from pypi

timeout-decorator fails with multithreading

yepp! [timeout-decorator](https://github.com/pnpnpn/timeout-decorator) works : )

used in https://github.com/danilobellini/aia/pull/3

the ssl server is stopped with `https_server_process.stop()`

```py
# @timeout_decorator.timeout(timeout, timeout_exception=TimeoutError, use_signals=False)
@timeout_decorator.timeout(timeout, timeout_exception=TimeoutError)
def do_handshake():
    conn.do_handshake()

#conn.do_handshake()
do_handshake()

cert_chain = conn.get_peer_cert_chain()
assert cert_chain != None
```

do_handshake.py

```py
import socket

import timeout_decorator
import OpenSSL # pyopenssl
import certifi

timeout = 5
host, port = "127.0.0.1", 4430

ssl_context = OpenSSL.SSL.Context(method=OpenSSL.SSL.TLS_CLIENT_METHOD)
ssl_context.load_verify_locations(cafile=certifi.where())
conn = OpenSSL.SSL.Connection(
    ssl_context,
    socket=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
)
conn.settimeout(timeout)
conn.connect((host, port))
conn.setblocking(1)
conn.set_tlsext_host_name(host.encode())

@timeout_decorator.timeout(timeout, timeout_exception=TimeoutError)
def do_handshake():
    conn.do_handshake()

#conn.do_handshake()
do_handshake()

cert_chain = conn.get_peer_cert_chain()
assert cert_chain != None
```

edit: nope. [timeout-decorator](https://github.com/pnpnpn/timeout-decorator) is limited to the main thread

```
  File "aia.py", line 212, in get_host_cert_chain
    do_handshake()
  File "/lib/python3.11/site-packages/timeout_decorator/timeout_decorator.py", line 75, in new_function
    old = signal.signal(signal.SIGALRM, handler)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/signal.py", line 58, in signal
    handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: signal only works in main thread of the main interpreter
```

[timeout-decorator#multithreading](https://github.com/pnpnpn/timeout-decorator#multithreading)

Multithreading
--------------

By default, timeout-decorator uses signals to limit the execution time of the given function. This approach does not work if your function is executed not in a main thread (for example if it's a worker thread of the web application). There is an alternative timeout strategy for this case - by using multiprocessing. To use it, just pass ``use_signals=False`` to the timeout decorator function:

```py
import time
import timeout_decorator

@timeout_decorator.timeout(5, use_signals=False)
def mytest():
    print("Start")
    for i in range(1,10):
        time.sleep(1)
        print("{} seconds have passed".format(i))

if __name__ == '__main__':
    mytest()
```

Warning: Make sure that in case of multiprocessing strategy for timeout, your function does not return objects which cannot be pickled, otherwise it will fail at marshalling it between master and child processes.
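The main-thread restriction comes from Python's `signal` module itself, not from timeout-decorator. A minimal reproduction without any third-party package, assuming a POSIX platform where `SIGALRM` exists (the names `install_handler` and `outcome` are made up for this sketch):

```python
import signal
import threading

outcome = {}


def install_handler():
    """Try to register a SIGALRM handler from the current thread."""
    try:
        signal.signal(signal.SIGALRM, lambda signum, frame: None)
        outcome["error"] = None
    except ValueError as exc:
        outcome["error"] = str(exc)


# Run the registration in a worker thread instead of the main thread.
worker = threading.Thread(target=install_handler)
worker.start()
worker.join()

print(outcome["error"])
```

In a worker thread the registration is rejected with a `ValueError`, which is the error at the bottom of the traceback above.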
`timeout_decorator` with `use_signals=False` does not work with `conn.do_handshake`, because after `do_handshake`, `cert_chain` is `None` (presumably because the handshake then runs in a child process, so the `conn` object in the parent never completes it)

similar issue: read with timeout https://github.com/milahu/gnumake-tokenpool/issues/10

milahu commented 4 months ago

https://github.com/pyca/pyopenssl/issues/168#issuecomment-61813592

> introduce the same kind of wait-and-retry logic that Python's own socket library has (which implements the timeout feature).

yepp! this works, see test

removed select for a writable sock, because sock is always writable

-  select.select([sock], [sock], [], remain)
+  select.select([sock], [], [], remain)

#!/usr/bin/env python3

import sys
import time
import select
import socket
from urllib.parse import urlsplit

import OpenSSL # pyopenssl
import certifi

cafile = certifi.where()

def get_cert_chain(hostname, port, timeout=5):

    # https://github.com/pyca/pyopenssl/issues/168#issuecomment-61813592
    # exarkun commented on Nov 5, 2014

    ssl_context = OpenSSL.SSL.Context(method=OpenSSL.SSL.TLS_CLIENT_METHOD)
    ssl_context.load_verify_locations(cafile=cafile)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    conn = OpenSSL.SSL.Connection(context=ssl_context, socket=sock)
    # note: conn.setblocking() below overwrites the socket timeout,
    # so gettimeout() cannot be used to recover this value later
    conn.settimeout(timeout) # limits the TCP connect
    conn.connect((hostname, port))
    conn.setblocking(1)

    #conn.do_handshake()

    def do_handshake():
        conn.setblocking(0) # unblock conn.do_handshake
        # uses the timeout argument of get_cert_chain; the socket's own
        # timeout was overwritten by the setblocking() calls
        #print("timeout", timeout)
        if timeout is not None:
            start = time.time()
        last_remain = timeout
        while True:
            try:
                #return <some OpenSSL API>
                #print("conn.do_handshake ...")
                res = conn.do_handshake()
                print("conn.do_handshake ok")
                conn.setblocking(1)
                return res
            except (OpenSSL.SSL.WantReadError, OpenSSL.SSL.WantWriteError) as exc:
                #print("exc", exc)
                remain = timeout - (time.time() - start)
                t_step = last_remain - remain # 0.0004
                last_remain = remain
                #print("remain", remain)
                #print("t_step", t_step)
                #if timeout is None or start + timeout > time.time():
                if remain < 0:
                    #raise
                    conn.setblocking(1)
                    raise TimeoutError
                # TODO? handle timeout from select
                readable, writable, errored = select.select(
                    # no. dont select writable sock
                    #[sock], [sock], [], remain
                    [sock], [], [], remain
                )
                print("select", (readable, writable, errored))
                #if <select timed out>:
                #    raise <something - the original exception?  a specific timeout exception?>
                # no. this was only needed with select writable sock
                # because the sock is always writable
                #time.sleep(0.5) # reduce cpu load

    do_handshake()

    cert_chain = conn.get_peer_cert_chain()
    conn.shutdown()
    conn.close()
    return cert_chain

hostname = sys.argv[1]
port = int(sys.argv[2])
try:
    # do not rebind the function name to its result
    cert_chain = get_cert_chain(hostname, port)
except TimeoutError:
    print("timeout")