Closed calee88 closed 7 months ago
Do you see any related log messages on the Logstash server?
Do you know if this happened already before 2.7.0? If possible either downgrade to 2.6.0 or set constants.SOCKET_CLOSE_WAIT_TIMEOUT
to 0
, whatever is easier for you.
In 2.7.0 a new feature has been added to wait for the send buffer to be empty when shutting down. This might be related, though this is just a guess.
I tried this solution by the following lines:
from logstash_async.constants import constants constants.SOCKET_CLOSE_WAIT_TIMEOUT=0
but still getting this error:
File \"/python3.11/lib/python3.11/site-packages/logstash_async/worker.py\", line 241, in _flush_queued_events self._send_events(events) File \"/python3.11/lib/python3.11/site-packages/logstash_async/worker.py\", line 304, in _send_events self._transport.send(events, use_logging=use_logging) File \"/python3.11/lib/python3.11/site-packages/logstash_async/transport.py\", line 103, in send self._close() File \"/python3.11/lib/python3.11/site-packages/logstash_async/transport.py\", line 137, in _close self._sock.shutdown(socket.SHUT_WR)
function: _safe_log_impl name: worker.py original: An error occurred while sending events: [Errno 107] Transport endpoint is not connected type: OSError
Using Debian trixie-slim with sqlite3 installed and python 3.11.6
Unfortunately, constants.SOCKET_CLOSE_WAIT_TIMEOUT=0
doesn't exactly disable the feature completely. This is fixed in master.
Anyway, I just reread the OP and noticed the second exception "BrokenPipeError: [Errno 32] Broken pipe". I think this is main problem and the "OSError: [Errno 107] Transport endpoint is not connected" is only a follow up error when trying to close the connection. I will probably wrap the connection shutdown code in a try-except block.
Still, this won't solve the broken pipe errors. I'm pretty sure this is not caused by the code but by the network connection on your side.
I have the same issue in a docker environment. Interestingly the errors do not stop even when the logstash container gets back up. Is there any workaround until a fix is delivered? Sample logs (this repeats every 30s):
2023-12-23 10:36:22 An error occurred while sending events: [Errno 107] Transport endpoint is not connected 2023-12-23 10:36:22 Traceback (most recent call last): 2023-12-23 10:36:22 File "/usr/local/lib/python3.12/site-packages/logstash_async/worker.py", line 138, in _fetch_events 2023-12-23 10:36:22 self._fetch_event() 2023-12-23 10:36:22 File "/usr/local/lib/python3.12/site-packages/logstash_async/worker.py", line 160, in _fetch_event 2023-12-23 10:36:22 self._event = self._queue.get(block=False) 2023-12-23 10:36:22 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 2023-12-23 10:36:22 File "/usr/local/lib/python3.12/queue.py", line 168, in get 2023-12-23 10:36:22 raise Empty 2023-12-23 10:36:22 _queue.Empty 2023-12-23 10:36:22 2023-12-23 10:36:22 During handling of the above exception, another exception occurred: 2023-12-23 10:36:22 2023-12-23 10:36:22 Traceback (most recent call last): 2023-12-23 10:36:22 File "/usr/local/lib/python3.12/site-packages/logstash_async/transport.py", line 101, in send 2023-12-23 10:36:22 self._send(events) 2023-12-23 10:36:22 File "/usr/local/lib/python3.12/site-packages/logstash_async/transport.py", line 118, in _send 2023-12-23 10:36:22 self._send_via_socket(event) 2023-12-23 10:36:22 File "/usr/local/lib/python3.12/site-packages/logstash_async/transport.py", line 226, in _send_via_socket 2023-12-23 10:36:22 self._sock.sendall(data_to_send) 2023-12-23 10:36:22 BrokenPipeError: [Errno 32] Broken pipe 2023-12-23 10:36:22 2023-12-23 10:36:22 During handling of the above exception, another exception occurred: 2023-12-23 10:36:22 2023-12-23 10:36:22 Traceback (most recent call last): 2023-12-23 10:36:22 File "/usr/local/lib/python3.12/site-packages/logstash_async/worker.py", line 241, in _flush_queued_events 2023-12-23 10:36:22 self._send_events(events) 2023-12-23 10:36:22 File "/usr/local/lib/python3.12/site-packages/logstash_async/worker.py", line 304, in _send_events 2023-12-23 10:36:22 self._transport.send(events, use_logging=use_logging) 2023-12-23 10:36:22 File "/usr/local/lib/python3.12/site-packages/logstash_async/transport.py", line 103, in send 2023-12-23 10:36:22 self._close() 2023-12-23 10:36:22 File "/usr/local/lib/python3.12/site-packages/logstash_async/transport.py", line 137, in _close 2023-12-23 10:36:22 self._sock.shutdown(socket.SHUT_WR) 2023-12-23 10:36:22 OSError: [Errno 107] Transport endpoint is not connected
I have no workaround yet except restarting the application.
when the logstash container gets back up
This sounds like you know when it happens? Is it just an unavailable Logstash server?
I hope I will find time to fix it next week.
This sounds like you know when it happens? Is it just an unavailable Logstash server?
Yes, I control both containers, so I could reset the ones dependent on Logstash. It's a matter of resilience and flexibility I guess ;)
I hope I will find time to fix it next week.
Please do :)
We think this happens because socket not properly closing with _close(), because socket already closed and sending shutdown
in closed socket raises OSError, next code, self._sock.close()
and self._sock = None
not reacheble, and next time when you try to send smth, socket not recreated in _create_socket because it's not empty
We fixed with try...finally for us in out transport:
try:
super()._close(force)
finally:
self._sock = None
ways for reproduce:
@izverev thanks for the investigation and steps to reproduce.
I added basically your change in #92. The errors are still thrown when trying to close a broken or not connected socket connection but it will recover the next time as the socket instance is dropped instead of re-used.
@izverev @x0zzz @OliverNickel @calee88 it would be great if you could test it.
I can confirm that it works - more or less. It still keeps spamming the error message
An error occurred while sending events: [Errno 107] Transport endpoint is not connected
File "/python3.12/lib/python3.12/site-packages/logstash_async/worker.py", line 241, in _flush_queued_events
self._send_events(events)
File "/python3.12/lib/python3.12/site-packages/logstash_async/worker.py", line 304, in _send_events
self._transport.send(events, use_logging=use_logging)
File "/python3.12/lib/python3.12/site-packages/logstash_async/transport.py", line 103, in send
self._close()
File "/python3.12/lib/python3.12/site-packages/logstash_async/transport.py", line 138, in _close
self._sock.shutdown(socket.SHUT_WR)
Maybe because we are using UDP?
logstash_async.transport.UdpTransport
I can confirm that it works - more or less. It still keeps spamming the error message
An error occurred while sending events: [Errno 107] Transport endpoint is not connected
I did not suppress the error only the handling of it so that the Transport
instance will recover from the error by creating a new socket connection.
My idea was that it is generally better to log these errors than hiding them away. Do this happen so often that it bothers you or did you just notice while testing?
Maybe because we are using UDP?
I think UDP/TCP does not matter here, the code is the same for both protocols.
I'm getting "OSError: [Errno 107] Transport endpoint is not connected" every 30 seconds. It is interesting because, the messages are logged in logstash index. Is it possible that the endpoint is not connected and still logging the message? And other messages seem to be logged as well. Here is the stack trace:
function name: _safe_log_impl path: logstash_async/worker.py logstash_async_version: 2.7.2