python-websockets / websockets

Library for building WebSocket servers and clients in Python
https://websockets.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV) #1389

Closed WrongAnswertoAC closed 9 months ago

WrongAnswertoAC commented 11 months ago

Once in a while, I am encountering the following error:

"Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)"

This error occurs sporadically during the execution of my code that involves the use of your library. It seems to be related to a segmentation fault (SIGSEGV), but I am not able to pinpoint the exact cause.

I have tried to reproduce the error consistently, but it happens unpredictably and does not seem to be related to specific inputs or conditions. Unfortunately, I have not been able to find any related discussions or known issues in your library's GitHub repository.

from websockets.sync.client import connect

ws = connect(
    url,
    additional_headers=headers,
)

WrongAnswertoAC commented 11 months ago

This should be fixed, even though it happens only sporadically. I have log statements wrapped around the call: whenever the crash happens, the log statement before the call appears in the logs, and then the process exits altogether.

WrongAnswertoAC commented 11 months ago

Noting that it also sometimes fails with free(): corrupted unsorted chunks. Let me know if I should open a separate issue for this.

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

WrongAnswertoAC commented 11 months ago

Noting that sometimes, after a successful connection without any exception, calling ws.send(message) raises an exception:

Traceback (most recent call last):
  File "/usr/local/python/python-3.10/std/lib64/python3.10/site-packages/websockets/sync/connection.py", line 666, in send_context
    self.send_data()
  File "/usr/local/python/python-3.10/std/lib64/python3.10/site-packages/websockets/sync/connection.py", line 728, in send_data
    self.socket.sendall(data)
  File "/opt/python/python-3.10/lib64/python3.10/ssl.py", line 1236, in sendall
    v = self.send(byte_view[count:])
  File "/opt/python/python-3.10/lib64/python3.10/ssl.py", line 1205, in send
    return self._sslobj.write(data)
ssl.SSLError: [SSL] internal error (_ssl.c:2396)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/path_to_file/client.py", line 259, in _get_data
    ws.send(message)
  File "/usr/local/python/python-3.10/std/lib64/python3.10/site-packages/websockets/sync/connection.py", line 281, in send
    with self.send_context():
  File "/opt/python/python-3.10/lib64/python3.10/contextlib.py", line 142, in __exit__
    next(self.gen)
  File "/usr/local/python/python-3.10/std/lib64/python3.10/site-packages/websockets/sync/connection.py", line 711, in send_context
    raise self.protocol.close_exc from original_exc
websockets.exceptions.ConnectionClosedError: no close frame received or sent

Process finished with exit code 1

It is requested to find the root cause for the three errors reported in this issue and make the library more stable.

aaugustin commented 11 months ago

It is requested to...

You misunderstand the relationship between an open-source maintainer who gives away software for free without any warranty (me) and users of that software (you). In this relationship, users aren't entitled to request anything from maintainers.

If you want to request work, you need a commercial support contract. Please reach out to me at aymeric.augustin@fractalideas.com and we'll arrange that.

... find the root cause for the three errors reported in this issue

Right now only you can find the root cause, since you are the only person (that we know of) who sees these issues.

Here's a good explanation of what you can do to debug the two segfaults: https://stackoverflow.com/questions/49414841/process-finished-with-exit-code-139-interrupted-by-signal-11-sigsegv
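One low-effort technique for crashes with this exit code, sketched here as a suggestion and not necessarily what the linked answer describes, is Python's standard `faulthandler` module. Once enabled, the interpreter dumps the Python traceback of every thread when it receives a fatal signal such as SIGSEGV or SIGABRT, which at least shows which Python call was running when the process died. The helper name `enable_crash_tracebacks` and the log path are hypothetical choices for illustration.

```python
import faulthandler


def enable_crash_tracebacks(log_path):
    """Dump Python tracebacks to log_path if the process gets a fatal signal.

    Hypothetical helper name. The returned file object must stay open for
    the life of the process, because faulthandler writes to its file
    descriptor at crash time.
    """
    log_file = open(log_path, "w")
    # all_threads=True dumps every thread, not only the one that crashed.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling `enable_crash_tracebacks("crash_traceback.log")` at the very start of the program would be enough. Running the script with `python -X faulthandler` enables the same handler without code changes and writes to stderr instead.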

I understand that you don't have a minimal reproducible example: the issue is sporadic. I don't know if you can produce the issue while running under gdb.

If neither of these options works, you could start isolating the segfaults by disabling the C extension. Probably the easiest way to do this is to delete speedups.so from your install:

rm $VIRTUAL_ENV/lib/python*/site-packages/websockets/speedups.*.so

If you are still seeing segfaults after this, then websockets isn't causing them (because only pure Python code is running).
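To confirm that the deletion took effect in the environment that actually runs the code, a small check along these lines may help. It is a sketch that assumes the compiled extension is importable as `websockets.speedups`, which is what the file name in the `rm` command suggests. The function name `speedups_installed` is hypothetical.

```python
import importlib.util


def speedups_installed():
    """Report whether the compiled websockets extension can be found.

    Returns True or False, or None when websockets itself is not installed.
    Assumes the extension module is named websockets.speedups.
    """
    if importlib.util.find_spec("websockets") is None:
        return None
    # find_spec only locates the module; it does not load the extension.
    return importlib.util.find_spec("websockets.speedups") is not None


if __name__ == "__main__":
    print("C extension present:", speedups_installed())
```

If this prints `False` and the segfaults continue, the crash is coming from somewhere other than the websockets C extension.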

If this removes the segfaults, then potential causes include:

aaugustin commented 11 months ago

ssl.SSLError: [SSL] internal error (_ssl.c:2396)

This is almost certainly an issue with the ssl module; you could produce it with Python itself without websockets.
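A minimal sketch of what "Python itself without websockets" could look like: open a TLS connection with only the standard `socket` and `ssl` modules, write some bytes, and repeat. The function name `tls_write_loop`, the payload, and the default iteration count are assumptions for illustration; the host would be whatever TLS endpoint the failing client talks to.

```python
import socket
import ssl


def tls_write_loop(host, port=443, iterations=400, payload=b"ping", timeout=5.0):
    """Open a TLS connection `iterations` times, write once, collect errors.

    Returns a list of (iteration, repr(exception)) for every failed attempt.
    """
    context = ssl.create_default_context()
    errors = []
    for i in range(iterations):
        try:
            with socket.create_connection((host, port), timeout=timeout) as raw:
                with context.wrap_socket(raw, server_hostname=host) as tls:
                    # Same call that fails in the traceback: SSLSocket.sendall
                    tls.sendall(payload)
        except OSError as exc:  # ssl.SSLError is a subclass of OSError
            errors.append((i, repr(exc)))
    return errors
```

If `ssl.SSLError: [SSL] internal error` shows up in the returned list, the problem is reproducible without websockets and belongs with the Python or OpenSSL installation.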

aaugustin commented 11 months ago

Also, it looks like you're using Python 3.10. If so, could you confirm that the issues still happen in Python 3.10.12? Maybe they were fixed.

WrongAnswertoAC commented 11 months ago

It also sporadically fails with malloc(): unsorted double-linked list corrupted.

Noting it here so that all similar issues can be looked at in one place.

WrongAnswertoAC commented 11 months ago

I am able to reproduce all of the above, but it is not a minimal reproducer where it breaks with specific inputs.

I am running a loop that creates a WebSocket connection 400 times. Of course, I eventually close the connection in each iteration. I assume it should be reproducible like this with any sort of URL...
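The loop described above was not posted in the thread, so the following is only a guess at its shape. The function name `run_connection_loop` is hypothetical, and the connection factory is passed in as a parameter so the loop itself does not depend on any particular URL or headers.

```python
def run_connection_loop(open_connection, message="ping", iterations=400):
    """Open, use, and close a connection `iterations` times.

    `open_connection` is any zero-argument callable returning an object with
    send() and close(). Returns (iteration, repr(exception)) per failure.
    A segfault kills the process, so only Python exceptions end up here.
    """
    failures = []
    for i in range(iterations):
        try:
            ws = open_connection()
            try:
                ws.send(message)
            finally:
                ws.close()  # always close, even if send() raised
        except Exception as exc:
            failures.append((i, repr(exc)))
    return failures


# Possible wiring with the sync client from the report; url and headers are
# the reporter's values and are not shown in the thread:
#
#   from websockets.sync.client import connect
#   failures = run_connection_loop(
#       lambda: connect(url, additional_headers=headers)
#   )
```

Posting a script like this, together with the Python version and the output of `pip freeze`, would give a maintainer something concrete to run.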

aaugustin commented 11 months ago

It isn't as easy to reproduce as you think. I created tens of thousands of connections in the same process many times (to check that websockets scaled) and never saw this.

aaugustin commented 9 months ago

Closing because I don't have enough info to reproduce. All I have is:

I am running a loop that creates a WebSocket connection 400 times.

I've done this hundreds of times without ever seeing a segfault.