wbarnha / kafka-python-ng

Fork for Python client for Apache Kafka
https://wbarnha.github.io/kafka-python-ng/
Apache License 2.0

Handle OSError to properly recycle SSL connection, fix infinite loop #155

Closed wbarnha closed 3 months ago

wbarnha commented 3 months ago

Here's a stack trace that was flooding our logs:

[07/15/2020 08:51:14.799: ERROR/kafka.producer.sender] Uncaught error in kafka producer I/O thread
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/kafka/producer/sender.py", line 60, in run
    self.run_once()
  File "/usr/local/lib/python3.6/site-packages/kafka/producer/sender.py", line 160, in run_once
    self._client.poll(timeout_ms=poll_timeout_ms)
  File "/usr/local/lib/python3.6/site-packages/kafka/client_async.py", line 580, in poll
    self._maybe_connect(node_id)
  File "/usr/local/lib/python3.6/site-packages/kafka/client_async.py", line 390, in _maybe_connect
    conn.connect()
  File "/usr/local/lib/python3.6/site-packages/kafka/conn.py", line 426, in connect
    if self._try_handshake():
  File "/usr/local/lib/python3.6/site-packages/kafka/conn.py", line 505, in _try_handshake
    self._sock.do_handshake()
  File "/usr/local/lib/python3.6/ssl.py", line 1077, in do_handshake
    self._sslobj.do_handshake()
  File "/usr/local/lib/python3.6/ssl.py", line 689, in do_handshake
    self._sslobj.do_handshake()
OSError: [Errno 0] Error

The problem is that Python 3.6 raises OSError during the SSL handshake, which the client does not expect. The exception propagates to the caller, so the code that recycles the connection never runs. As a result, the producer is guaranteed to hit the same exception on the next call to poll().
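To illustrate the idea, here is a minimal, self-contained sketch of a handshake step that treats a bare OSError like any other handshake failure and closes the connection instead of letting the exception escape. FakeSocket, Connection, try_handshake and close are hypothetical names made up for this example; this is not the actual code in kafka/conn.py.

```python
import ssl


class FakeSocket:
    """Hypothetical stand-in for an SSL socket whose handshake fails."""

    def do_handshake(self):
        # Mimics the error from the trace: OSError: [Errno 0] Error
        raise OSError(0, "Error")


class Connection:
    """Hypothetical, simplified connection (not kafka.conn.BrokerConnection)."""

    def __init__(self, sock):
        self._sock = sock
        self.closed = False
        self.last_error = None

    def close(self, error=None):
        # Drop the socket so the next connect attempt starts from scratch.
        self._sock = None
        self.closed = True
        self.last_error = error

    def try_handshake(self):
        """Return True when the handshake is done, False otherwise."""
        try:
            self._sock.do_handshake()
            return True
        except (ssl.SSLWantReadError, ssl.SSLWantWriteError):
            # Non-blocking handshake still in progress; retry on a later poll.
            return False
        except OSError as exc:
            # Catches ssl.SSLError subclasses as well as a bare OSError:
            # close the connection rather than propagating the exception.
            self.close(error=exc)
            return False


conn = Connection(FakeSocket())
result = conn.try_handshake()
```

Because the ssl exception classes derive from OSError, the two "want" cases have to be listed before the generic OSError handler; otherwise an in-progress handshake would also tear the connection down.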

Raising OSError here doesn't seem to be documented, even in the latest Python docs (see the 3.8 docs), but there are signs of it in the 3.8 source code.

