python / cpython

The Python programming language
https://www.python.org

Inefficient ssl.SSLWantReadError exception slows down very common use-case #123954

Open tarasko opened 2 months ago

tarasko commented 2 months ago

Bug report

Bug description:

Event loops such as uvloop and asyncio use non-blocking SSL. They typically:

  1. read data from the socket when epoll returns that it is ready
  2. push data to the incoming MemoryBIO
  3. read from SSLObject until SSLWantReadError is thrown
  4. pass read data to the application protocol

When peers exchange relatively small messages, SSLObject.read is typically called twice: the first call returns data and the second raises SSLWantReadError.

perf shows that the second call is almost as expensive as the first because of the time spent constructing the new exception object.

Is it possible to optimize exception object creation for the second call?
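A rough way to see the cost of the exception path without a server or perf is a microbenchmark over in-memory BIOs. This is only a sketch under stated assumptions: it uses do_handshake() on a never-fed MemoryBIO as a stand-in for the second SSLObject.read call, so the timing also includes the OpenSSL call itself, not just exception construction. The helper names raise_want_read and baseline are hypothetical.

```python
import ssl
import timeit

# Client-side SSLObject over in-memory BIOs; no socket or server is involved.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
incoming = ssl.MemoryBIO()
outgoing = ssl.MemoryBIO()
sslobj = ctx.wrap_bio(incoming, outgoing, server_hostname="localhost")


def raise_want_read():
    # Nothing is ever written to `incoming`, so the handshake cannot progress
    # and every call ends in a freshly constructed SSLWantReadError.
    try:
        sslobj.do_handshake()
    except ssl.SSLWantReadError as exc:
        return exc


def baseline():
    # Same try/except shape, but no exception is raised.
    try:
        return None
    except ssl.SSLWantReadError as exc:
        return exc


exc = raise_want_read()
print(type(exc).__name__, exc.errno == ssl.SSL_ERROR_WANT_READ)

n = 20_000
t_raise = timeit.timeit(raise_want_read, number=n)
t_base = timeit.timeit(baseline, number=n)
print(f"want-read path:        {t_raise / n * 1e6:.2f} us/call")
print(f"no-exception baseline: {t_base / n * 1e6:.2f} us/call")
```

The absolute numbers depend on the machine and the OpenSSL build; the perf call graph in this report remains the more precise breakdown.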

I tried to avoid the second call by checking MemoryBIO.pending and SSLObject.pending(), but they cannot always reliably tell whether we have to wait for more data.

For example, the incoming MemoryBIO.pending can be > 0 while SSLObject.pending() == 0. We call SSLObject.read and it raises because the incoming MemoryBIO does not yet hold the full TLS record (SSL frame).
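The partial-record case can be illustrated without a network. The sketch below is an assumption-laden stand-in, not the exact scenario above: it feeds only the first 3 bytes of a 5-byte TLS record header into the incoming BIO and calls do_handshake() rather than read(), since no handshake has completed. The point is the same: a non-empty MemoryBIO does not guarantee that the next call will succeed.

```python
import ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
incoming = ssl.MemoryBIO()
outgoing = ssl.MemoryBIO()
sslobj = ctx.wrap_bio(incoming, outgoing, server_hostname="localhost")

# First 3 bytes of a 5-byte TLS record header:
# content type 0x16 (handshake), version 0x0303. The length field is missing.
incoming.write(b"\x16\x03\x03")
pending_before = incoming.pending  # the BIO is not empty at this point

try:
    sslobj.do_handshake()
    got_want_read = False
except ssl.SSLWantReadError:
    # OpenSSL needs the rest of the record before it can make progress.
    got_want_read = True

print("incoming.pending before call:", pending_before)
print("SSLWantReadError raised:", got_want_read)
```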

Example echo client that replicates the internal logic of asyncio/uvloop:

import socket
import ssl
import select
from typing import Optional

ssl_context = ssl.create_default_context()
ssl_context.check_hostname = False
ssl_context.verify_mode = ssl.CERT_NONE

ep = select.epoll(2)

incoming = ssl.MemoryBIO()
outgoing = ssl.MemoryBIO()

sock: Optional[socket.socket] = None
ssl_sock: Optional[ssl.SSLObject] = None

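def wait_data():
    # Block until the socket is readable, then drain it into the incoming BIO.
    ep.poll()

    try:
        while True:
            chunk = sock.recv(1024)
            if not chunk:
                # Peer closed the connection; without this check the loop never ends.
                raise ConnectionError("connection closed by peer")
            incoming.write(chunk)
    except BlockingIOError:
        pass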

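def wait_data_until_ssl_read_succeed():
    # Keep reading decrypted data until SSLObject.read raises SSLWantReadError.
    data = bytearray()
    while True:
        try:
            wait_data()
            while True:
                # The last read() of each round raises SSLWantReadError;
                # this is the expensive second call.
                data += ssl_sock.read()
        except ssl.SSLWantReadError as ex:
            # print(ex)
            if data:
                return data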

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    ep.register(sock.fileno(), select.EPOLLIN)

    ssl_sock = ssl_context.wrap_bio(incoming, outgoing, server_hostname='localhost')

    sock.connect(('127.0.0.1', 25000))
    sock.setblocking(False)

    msg = b"a" * 256

    # do handshake
    while True:
        try:
            ssl_sock.do_handshake()
            break
        except ssl.SSLWantReadError as ex:
            if outgoing.pending > 0:
                chunk = outgoing.read(outgoing.pending)
                sock.send(chunk)
            wait_data()

    # send message and wait for reply
    while True:
        ssl_sock.write(msg)
        chunk = outgoing.read(outgoing.pending)
        sock.send(chunk)

        data = wait_data_until_ssl_read_succeed()
        # print(data)

Perf output:

   17.41%     0.25%            43  python   _ssl.cpython-314-x86_64-linux-gnu.so     [.] _ssl__SSLSocket_read
            |          
             --17.16%--_ssl__SSLSocket_read
                       |          
                       |--8.09%--SSL_read_ex
                       |          |          
                       |           --7.74%--0x7b1fbef883f9
                       |                     |          
                       |                     |--4.83%--0x7b1fbefadc22
                       |                     |          |          
                       |                     |          |--0.93%--0x7b1fbefa6919
                       |                     |          |          |          
                       |                     |          |           --0.90%--EVP_DecryptUpdate
                       |                     |          |                     |          
                       |                     |          |                      --0.90%--0x7b1fbec90c8b
                       |                     |          |                                |          
                       |                     |          |                                 --0.87%--0x7b1fbec90b45
                       |                     |          |          
                       |                     |          |--0.79%--0x7b1fbefa64a5
                       |                     |          |          |          
                       |                     |          |           --0.61%--EVP_CIPHER_CTX_get_iv_length
                       |                     |          |          
                       |                     |          |--0.72%--0x7b1fbefa6721
                       |                     |          |          |          
                       |                     |          |           --0.60%--EVP_CipherInit_ex
                       |                     |          |          
                       |                     |          |--0.57%--0x7b1fbefa6750
                       |                     |          |          
                       |                     |           --0.52%--0x7b1fbefa68f0
                       |                     |          
                       |                     |--1.13%--0x7b1fbefadf94
                       |                     |          |          
                       |                     |           --0.52%--0x7b1fbefac3c7
                       |                     |          
                       |                      --0.68%--0x7b1fbefad750
                       |          
                       |--6.21%--PySSL_SetError.constprop.0
                       |          |          
                       |           --5.17%--fill_and_set_sslerror
                       |                     |          
                       |                     |--2.83%--PyUnicode_FromFormat
                       |                     |          |          
                       |                     |           --2.54%--unicode_from_format
                       |                     |                     |          
                       |                     |                      --1.23%--__sprintf_chk
                       |                     |                                __vsprintf_internal
                       |                     |                                |          
                       |                     |                                 --0.99%--__vfprintf_internal
                       |                     |          
                       |                      --1.01%--PyObject_SetAttr
                       |                                |          
                       |                                 --0.98%--PyObject_GenericSetAttr
                       |                                           |          
                       |                                            --0.52%--_PyObjectDict_SetItem
                       |          
                        --0.60%--SSL_get_error

To reproduce, you need a TLS echo server listening on localhost port 25000. Once it is running, run the echo client under perf, let it run for about 15 seconds, then press Ctrl-C and generate the report:

$ perf record -F 999 -g --call-graph lbr --user-callchains -- python echo_client.py
$ perf report -G -n --stdio
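The steps above need a TLS echo server on port 25000, which the report does not include. A minimal sketch of one, using asyncio streams, might look like the following. The function names echo and serve and the file names cert.pem and key.pem are hypothetical; any certificate works, including a self-signed one, because the client disables verification.

```python
import asyncio
import ssl


async def echo(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    # Send every received chunk straight back until the peer disconnects.
    try:
        while data := await reader.read(4096):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def serve(certfile: str, keyfile: str, port: int = 25000) -> None:
    # Server-side TLS context using the given certificate and private key.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile, keyfile)
    server = await asyncio.start_server(echo, "127.0.0.1", port, ssl=ctx)
    async with server:
        await server.serve_forever()


# Hypothetical file names; create a certificate/key pair first, then run:
# asyncio.run(serve("cert.pem", "key.pem"))
```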

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

NoneTypeCoder commented 1 month ago

How did you get this perf result? I would like to learn how to run perf on my own code. Thanks for your answer.

tarasko commented 1 month ago

> How did you get this perf result? I would like to learn how to run perf on my own code. Thanks for your answer.

I have updated the description with instructions on how to run perf.