Here's a stack trace we had our logs flooded with.
[07/15/2020 08:51:14.799: ERROR/kafka.producer.sender] Uncaught error in kafka producer I/O thread
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/kafka/producer/sender.py", line 60, in run
self.run_once()
File "/usr/local/lib/python3.6/site-packages/kafka/producer/sender.py", line 160, in run_once
self._client.poll(timeout_ms=poll_timeout_ms)
File "/usr/local/lib/python3.6/site-packages/kafka/client_async.py", line 580, in poll
self._maybe_connect(node_id)
File "/usr/local/lib/python3.6/site-packages/kafka/client_async.py", line 390, in _maybe_connect
conn.connect()
File "/usr/local/lib/python3.6/site-packages/kafka/conn.py", line 426, in connect
if self._try_handshake():
File "/usr/local/lib/python3.6/site-packages/kafka/conn.py", line 505, in _try_handshake
self._sock.do_handshake()
File "/usr/local/lib/python3.6/ssl.py", line 1077, in do_handshake
self._sslobj.do_handshake()
File "/usr/local/lib/python3.6/ssl.py", line 689, in do_handshake
self._sslobj.do_handshake()
OSError: [Errno 0] Error
The problem is Python 3.6 is returning OSError, which is not expected. Such exception is propagated to the caller and code making recycling of such connection is not executed. Therefore, Producer is guaranteed to get the same exception on a next call to poll().
Throwing of OSError doesn't seem to be documented even in latest Python docs. See 3.8 docs, but there are signs of it in 3.8 source code.
Here's a stack trace we had our logs flooded with.
The problem is Python 3.6 is returning OSError, which is not expected. Such exception is propagated to the caller and code making recycling of such connection is not executed. Therefore, Producer is guaranteed to get the same exception on a next call to
poll()
.Throwing of OSError doesn't seem to be documented even in latest Python docs. See 3.8 docs, but there are signs of it in 3.8 source code.
This change is