twisted / twisted

Event-driven networking engine written in Python.
https://twisted.org
Other
5.55k stars 1.17k forks source link

Twisted server 23.10.0 crashes on Python 3.9 with core file generated #12036

Open fmavb opened 10 months ago

fmavb commented 10 months ago

Twisted server crashes unexpectedly when I try to refresh a page. I am running Python 3.9.16 ( PYTHON VERSION: 3.9.16 (main, Sep 22 2023, 17:57:55) on Red Hat Enterprise Linux 8.8 and twisted == 23.10.0. I am using the select reactor and the web server. To reproduce, sometimes loading a page is enough to crash it, otherwise hitting refresh repeatedly in the browser is sufficient to crash it. This is the stack trace I am seeing in the core dump:

Traceback (most recent call first):
  File "/local/apps/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 749, in flush
    else:
  File "/local/apps/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 742, in _scheduledFlush
    @param wrappedFactory: A factory which will create the
  File "/local/apps/venv/lib/python3.9/site-packages/twisted/internet/base.py", line 1090, in runUntilCurrent
    self._initThreadPool()
  File "/local/apps/venv/lib/python3.9/site-packages/twisted/internet/base.py", line 705, in mainLoop

  File "/local/apps/venv/lib/python3.9/site-packages/twisted/internet/base.py", line 695, in run
    self, name: str, timeout: Sequence[int] = (1, 3, 11, 45)

Output of pip freeze: apipkg==1.5 appdynamics==22.2.0.4918 appdynamics-bindeps-linux-x64==22.2.0 appdynamics-proxysupport-linux-x64==11.43.1 asgiref==3.7.2 asttokens==2.4.1 atomicwrites==1.4.0 attrs==21.4.0 Automat==20.2.0 backcall==0.2.0 bcrypt==3.2.0 behave==1.2.5 callee==0.3.1 certifi==2023.7.22 cffi==1.14.5 chardet==3.0.4 charset-normalizer==2.0.4 colorama==0.4.4 constantly==15.1.0 cov-core==1.15.0 coverage==5.5 cryptography==41.0.5 cx-Oracle==8.2.1 cycler==0.10.0 decorator==4.4.2 defusedxml==0.7.1 Django==4.2.6 djangorestframework==3.14.0 docopt==0.6.2 et-xmlfile==1.0.1 execnet==1.8.0 executing==2.0.1 gevent==23.9.1 greenlet==2.0.2 gssapi==1.8.3 gunicorn==20.1.0 hdfs==2.6.0 hyperlink==19.0.0 idna==2.10 importlib-metadata==3.7.2 importlib-resources==5.1.2 incremental==22.10.0 iniconfig==1.1.1 ipython==8.10.0 ipython-genutils==0.2.0 jdcal==1.4.1 jedi==0.19.1 Jinja2==3.0.3 kiwisolver==1.3.1 krb5==0.5.0 linecache2==1.0.0 MarkupSafe==2.0.1 matplotlib-inline==0.1.6 mock==4.0.3 nose==1.3.7 numpy==1.22.2 openpyxl==2.6.0 packaging==21.0 pandas==1.2.4 paramiko==2.12.0 parse==1.19.0 parse-type==0.5.2 parso==0.8.3 pem==21.2.0 pexpect==4.8.0 pickleshare==0.7.5 pluggy==0.13.1 prompt-toolkit==3.0.39 psutil==5.7.2 ptyprocess==0.7.0 pure-eval==0.2.2 py==1.10.0 pyasn1==0.4.6 pyasn1-modules==0.2.8 pycparser==2.20 PyExcelerate==0.10.0 pyface==7.3.0 Pygments==2.16.1 PyHamcrest==2.0.2 PyJWT==2.4.0 pymongo==3.5.1 PyNaCl==1.4.0 pyOpenSSL==23.2.0 pyparsing==2.4.7 pyspnego==0.10.2 pytest==6.2.4 pytest-cov==2.12.1 pytest-forked==1.3.0 pytest-xdist==2.3.0 python-dateutil==2.8.1 pytz==2021.1 PyYAML==5.4.1 pyzmq==22.0.3 requests==2.31.0 requests-kerberos==0.14.0 selenium==2.53.5 service-identity==18.1.0 simplegeneric==0.8.1 simplejson==3.8.0 six==1.15.0 SQLAlchemy==1.4.15 sqlparse==0.4.4 stack-data==0.6.3 toml==0.10.2 traceback2==1.4.0 traitlets==5.1.1 traits==6.2.0 traitsui==7.1.1 Twisted==23.10.0 typing_extensions==4.2.0 Unipath==1.1 unittest2==1.1.0 urllib3==1.26.18 wcwidth==0.2.9 xlrd==2.0.1 XlsxWriter==1.4.3 xlwt==1.3.0 zipp==3.4.1 zmq==0.0.0 zope.event==4.5.0 zope.interface==5.0.2

I believe #11989 could be a potential culprit of this bug.

Thanks, A.

adiroiban commented 10 months ago

Thanks for the report.


Can you also try with 23.8.0 ? Is your app working with an older version?

https://github.com/twisted/twisted/issues/11989 is indeed a supect :) but is best to double check.

If you app works with older Twisted, then most probably this is the issue.


Can you provide an example on how to reproduce this?

graingert commented 10 months ago

Can you put your traceback in backticks

Traceback ...
like this
```
glyph commented 10 months ago

@fmavb Your traceback is missing the bottom, which is the part that includes the error message and exception type, which would be extremely helpful in debugging this.

glyph commented 10 months ago

@fmavb I'm also noticing that the source code in your traceback is garbled. The lines appear to be reporting an exception that is legitimately occuring in an install of 23.10.0, but the actual lines of source code are from 22.10.0. This suggests that something in your installation environment is badly broken and it would be good to confirm with a fresh installation that you get a valid crash report.

glyph commented 10 months ago

@itamarst in the meanwhile, #11989 does seem likely to be implicated. Do you have any immediate thoughts?

itamarst commented 10 months ago

Just because that traceback is there doesn't mean that's the issue. It could be, I suppose, but I am not sure how it would break, given it's just doing slightly larger writes.

One thing that would be useful is setting the PYTHONFAULTHANDLER environment variable to 1, e.g. export PYTHONFAULTHANDLER=1, and then rerunning the reproducer. This will give nice stacktraces from all threads at segfault time.

glyph commented 10 months ago

There's also a lot of unrelated libraries loaded here, and given that it's a core dump, anything is possible. If at all possible, please minify your dependencies so we can see if Twisted is actually the thing crashing here.

fmavb commented 10 months ago

Hi,

Thank you very much for your prompt responses.

Firstly, the way we reproduce this bug is by refreshing a somewhat large page multiple times without letting it load completely. So, hitting refresh 3-4 times in a quick succession usually is enough to crash the server.

Secondly, we have reverted back to 22.10.0 and tried to reproduce the bug without any success. We tried upgrading to 23.8.0 as well and we were not able to reproduce it either. So it seems like the issue lies with 23.10.0. Please find the different stack traces below after we ran export PYTHONFAULTHANDLER=1

Exception object: <class 'ValueError'>
Traceback (most recent call last):
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 377, in _doRender
    self.doRender(request)
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 428, in doRender
    return self.processConsoleRequest(request, userInfo_is_authorized.get('userInfo'))
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 1216, in processConsoleRequest
    self.renderConfig(request, userInfo)
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 3670, in renderConfig
    request.write(bytes('<tr><td title="%s">%s</td><td class="wordbreak">%s</td>' % (overrideTitle, keyHtml, displayValue), 'utf-8'))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/server.py", line 267, in write
    http.Request.write(self, data)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/http.py", line 1247, in write
    self.channel.writeSequence(toChunk(data))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/http.py", line 2668, in writeSequence
    self.transport.writeSequence(iovec)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 797, in writeSequence
    self._aggregator.write(b"".join(sequence))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 727, in write
    self.flush()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 748, in flush
    self._write(b"".join(self._buffer))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 467, in write
    self._write(bytes)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 544, in _write
    self._flushSendBIO()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 263, in _flushSendBIO
    bytes = self._tlsConnection.bio_read(2**15)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/OpenSSL/SSL.py", line 2136, in bio_read
    self._handle_bio_errors(self._from_ssl, result)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/OpenSSL/SSL.py", line 2111, in _handle_bio_errors
    raise ValueError("unknown bio failure")
Exception object: <class 'AttributeError'>
Traceback (most recent call last):
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 377, in _doRender
    self.doRender(request)
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 428, in doRender
    return self.processConsoleRequest(request, userInfo_is_authorized.get('userInfo'))
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 1216, in processConsoleRequest
    self.renderConfig(request, userInfo)
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 3670, in renderConfig
    request.write(bytes('<tr><td title="%s">%s</td><td class="wordbreak">%s</td>' % (overrideTitle, keyHtml, displayValue), 'utf-8'))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/server.py", line 267, in write
    http.Request.write(self, data)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/http.py", line 1247, in write
    self.channel.writeSequence(toChunk(data))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/http.py", line 2668, in writeSequence
    self.transport.writeSequence(iovec)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 797, in writeSequence
    self._aggregator.write(b"".join(sequence))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 727, in write
    self.flush()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 748, in flush
    self._write(b"".join(self._buffer))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 467, in write
    self._write(bytes)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 544, in _write
    self._flushSendBIO()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 263, in _flushSendBIO
    bytes = self._tlsConnection.bio_read(2**15)
Exception object: <class 'RuntimeError'>
Traceback (most recent call last):
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 377, in _doRender
    self.doRender(request)
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 422, in doRender
    request.finish()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/server.py", line 277, in finish
    return http.Request.finish(self)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/http.py", line 1149, in finish
    raise RuntimeError(
2023-11-09 10:22:42,402 ERROR]<PoolThread-console_pool-1> Console Request Failed:  Error: 'NoneType' object has no attribute 'bio_read'
Exception object: <class 'AttributeError'>
Traceback (most recent call last):
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 377, in _doRender
    self.doRender(request)
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 428, in doRender
    return self.processConsoleRequest(request, userInfo_is_authorized.get('userInfo'))
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 1216, in processConsoleRequest
    self.renderConfig(request, userInfo)
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 3673, in renderConfig
    request.write(bytes('<td>%s</td></tr>' % link, 'utf-8'))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/server.py", line 267, in write
    http.Request.write(self, data)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/http.py", line 1247, in write
    self.channel.writeSequence(toChunk(data))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/http.py", line 2668, in writeSequence
    self.transport.writeSequence(iovec)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 797, in writeSequence
    self._aggregator.write(b"".join(sequence))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 727, in write
    self.flush()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 748, in flush
    self._write(b"".join(self._buffer))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 467, in write
    self._write(bytes)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 536, in _write
    self._tlsShutdownFinished(Failure())
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 385, in _tlsShutdownFinished
    self._flushSendBIO()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 263, in _flushSendBIO
    bytes = self._tlsConnection.bio_read(2**15)
free(): corrupted unsorted chunks
Fatal Python error: Aborted
2023-11-09 10:25:17,295 CRITICAL]<MainThread> Unhandled Error
Traceback (most recent call last):
  File "/local/apps/plato_sit/plato/ws/server/server.py", line 446, in <module>
    startServer()
  File "/local/apps/plato_sit/plato/ws/server/server.py", line 352, in startServer
    reactor.run(installSignalHandlers = installSignalHandlers)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/internet/base.py", line 695, in run
    self.mainLoop()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/internet/base.py", line 705, in mainLoop
    self.runUntilCurrent()
--- <exception caught here> ---
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/internet/base.py", line 1090, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 742, in _scheduledFlush
    self.flush()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 748, in flush
    self._write(b"".join(self._buffer))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 467, in write
    self._write(bytes)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 544, in _write
    self._flushSendBIO()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 263, in _flushSendBIO
    bytes = self._tlsConnection.bio_read(2**15)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/OpenSSL/SSL.py", line 2156, in bio_read
    self._handle_bio_errors(self._from_ssl, result)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/OpenSSL/SSL.py", line 2134, in _handle_bio_errors
    _raise_current_error()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/OpenSSL/_util.py", line 57, in exception_from_error_queue
    raise exception_type(errors)
OpenSSL.SSL.Error: []
malloc(): unsorted double linked list corrupted
Fatal Python error: Segmentation fault
(gdb) py-bt
Traceback (most recent call first):
  <built-in method SSL_write of _cffi_backend.Lib object at remote 0x7fe2e7c69f90>
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/OpenSSL/SSL.py", line 2026, in send
    result = _lib.SSL_write(self._ssl, data, len(data))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 526, in _write
    sent = self._tlsConnection.send(toSend)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 467, in write
    self._write(bytes)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 748, in flush
    self._write(b"".join(self._buffer))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 727, in write
    self.flush()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/protocols/tls.py", line 797, in writeSequence
    self._aggregator.write(b"".join(sequence))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/http.py", line 2668, in writeSequence
    self.transport.writeSequence(iovec)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/http.py", line 1247, in write
    self.channel.writeSequence(toChunk(data))
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/web/server.py", line 267, in write
    http.Request.write(self, data)
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 3670, in renderConfig
    request.write(bytes('<tr><td title="%s">%s</td><td class="wordbreak">%s</td>' % (overrideTitle, keyHtml, displayValue), 'utf-8'))
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 1014, in processConsoleRequest
    request.redirect('/console?tab=67')
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 428, in doRender
    return self.processConsoleRequest(request, userInfo_is_authorized.get('userInfo'))
  File "/local/apps/plato_sit/plato/ws/server/console.py", line 377, in _doRender
    self.doRender(request)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/python/threadpool.py", line 285, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/python/threadpool.py", line 269, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/_threads/_team.py", line 192, in doWork
    task()
  File "/local/apps/plato_sit/venv/lib/python3.9/site-packages/twisted/_threads/_threadworker.py", line 49, in work
    task()
  File "/usr/lib64/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/usr/lib64/python3.9/threading.py", line 937, in _bootstrap
    self._bootstrap_inner()

Furthermore, we tried running our app with Python 3.11 and had the same bug, please find the trace below:

2023-11-09 10:48:34,862 CRITICAL]<MainThread> Unhandled Error
Traceback (most recent call last):
  File "/local/apps/plato_sit/plato/ws/server/server.py", line 446, in <module>
    startServer()
  File "/local/apps/plato_sit/plato/ws/server/server.py", line 352, in startServer
    reactor.run(installSignalHandlers = installSignalHandlers)
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 695, in run
    self.mainLoop()
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 705, in mainLoop
    self.runUntilCurrent()
--- <exception caught here> ---
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 1090, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 742, in _scheduledFlush
    self.flush()
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 748, in flush
    self._write(b"".join(self._buffer))
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 467, in write
    self._write(bytes)
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 544, in _write
    self._flushSendBIO()
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 263, in _flushSendBIO
    bytes = self._tlsConnection.bio_read(2**15)
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/OpenSSL/SSL.py", line 2136, in bio_read
    self._handle_bio_errors(self._from_ssl, result)
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/OpenSSL/SSL.py", line 2114, in _handle_bio_errors
    _raise_current_error()
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/OpenSSL/_util.py", line 57, in exception_from_error_queue
    raise exception_type(errors)
OpenSSL.SSL.Error: []

2023-11-09 10:48:34,868 CRITICAL]<MainThread> Unhandled Error
Traceback (most recent call last):
  File "/local/apps/plato_sit/plato/ws/server/server.py", line 446, in <module>
    startServer()
  File "/local/apps/plato_sit/plato/ws/server/server.py", line 352, in startServer
    reactor.run(installSignalHandlers = installSignalHandlers)
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 695, in run
    self.mainLoop()
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 705, in mainLoop
    self.runUntilCurrent()
--- <exception caught here> ---
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 1090, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 742, in _scheduledFlush
    self.flush()
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 748, in flush
    self._write(b"".join(self._buffer))
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 467, in write
    self._write(bytes)
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 544, in _write
    self._flushSendBIO()
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 263, in _flushSendBIO
    bytes = self._tlsConnection.bio_read(2**15)
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/OpenSSL/SSL.py", line 2136, in bio_read
    self._handle_bio_errors(self._from_ssl, result)
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/OpenSSL/SSL.py", line 2114, in _handle_bio_errors
    _raise_current_error()
  File "/local/apps/plato_sit/venv/lib/python3.11/site-packages/OpenSSL/_util.py", line 57, in exception_from_error_queue
    raise exception_type(errors)
OpenSSL.SSL.Error: []

Please let me know if you need further information.

Kind regards, A.

adiroiban commented 10 months ago

Thanks for the followup.

Firstly, the way we reproduce this bug is by refreshing a somewhat large page multiple times without letting it load completely. So, hitting refresh 3-4 times in a quick succession usually is enough to crash the server.

Can you isolate this page in a script that so that we can execute and try to reproduce?

The TLS buffering might be a red herring that has just uncovered a bug that was present before in Python pyopenssl or cryptography

glyph commented 10 months ago

A minimal reproducer would be a good next step. It's possible that one of the many other libraries here is corrupting memory, although the most likely cause does seem like OpenSSL or something from PyCA.

Do you know how to use ASAN, or GWP-ASAN, or Electric Fence? The pyOpenSSL / cryptography developers are going to need some more memory debugging information to make this report actionable for them.

fmavb commented 10 months ago

Hi.

I was able to create a reproducer, I've uploaded it here: https://github.com/fmavb/twisted-bug

Please let me know if you need further information.

Kind regards, A.

glyph commented 10 months ago

Hi @fmavb,

Thanks for reducing the example.

  1. Can you make it self-executing? Like, can you put some twisted.web.client.Agent calls in a loop that cause it to crash for you?
  2. Just FYI, options.lowerMaximumSecurityTo = ssl.TLSVersion.TLSv1_3 has no effect. lowerMaximumSecurityTo is an argument to CertificateOptions's constructor, not an attribute of it: https://docs.twisted.org/en/twisted-23.8.0/api/twisted.internet.ssl.CertificateOptions.html#__init__. If you want to parameterize this you'll need pem to add support for it.
adiroiban commented 9 months ago

Many thanks for the repo. I was able to reproduce this with Ubuntu 23.10 and Python 3.11.

malloc(): unsorted double linked list corrupted
Segmentation fault (core dumped)

I also got the segfault with:

-        deferToThreadPool(reactor, reactor.getThreadPool(), _doRender)
+        reactor.callInThread(_doRender)

I got the segfault with both select reactor and epool reactor


Also, I placed a pdb on the exception block.

It look like

$ python server.py 
Unhandled Error
Traceback (most recent call last):
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/log.py", line 96, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/log.py", line 80, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
--- <exception caught here> ---
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/selectreactor.py", line 148, in _doReadOrWrite
    why = getattr(selectable, method)()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/tcp.py", line 248, in doRead
    return self._dataReceived(data)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/tcp.py", line 253, in _dataReceived
    rval = self.protocol.dataReceived(data)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 339, in dataReceived
    self._flushReceiveBIO()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 310, in _flushReceiveBIO
    self._flushSendBIO()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 263, in _flushSendBIO
    bytes = self._tlsConnection.bio_read(2**15)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/OpenSSL/SSL.py", line 2156, in bio_read
    self._handle_bio_errors(self._from_ssl, result)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/OpenSSL/SSL.py", line 2134, in _handle_bio_errors
    _raise_current_error()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/OpenSSL/_util.py", line 57, in exception_from_error_queue
    raise exception_type(errors)
OpenSSL.SSL.Error: []

> /home/adi/chevah/twisted-bug/server.py(21)_doRender()
-> return ''
(Pdb) list
 16             def _doRender():
 17                 try:
 18                     self.doRender(request)
 19                 except Exception as error:
 20                     import pdb; import sys; sys.stdout = sys.__stdout__; pdb.set_trace()
 21  ->                 return ''
 22                 finally:
 23                     if not request._disconnected:
 24                         request.finish()
 25  
 26             reactor.callInThread(_doRender)
(Pdb) error
AttributeError("'NoneType' object has no attribute 'bio_read'")

The repo has twisted-iocpsupport inside. Do you run this on Windows?


It looks like a threading issue. I tried to "simulate" the threads with deferLater, but the code works. No exceptions, no sefault.

-        deferToThreadPool(reactor, reactor.getThreadPool(), _doRender)
+        deferLater(reactor, 0.000001, _doRender)

or

-        deferToThreadPool(reactor, reactor.getThreadPool(), _doRender)
+        #reactor.callInThread(_doRender)
+        deferLater(reactor, random.randint(1, 100)/100000, _doRender)

It looks like it's related to the way the connections are closed.

Sometimes I can see errors like:

$ python server.py 
Unhandled Error
Traceback (most recent call last):
  File "/home/adi/chevah/twisted-bug/server.py", line 42, in <module>
    reactor.run()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 695, in run
    self.mainLoop()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 705, in mainLoop
    self.runUntilCurrent()
--- <exception caught here> ---
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 1090, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 742, in _scheduledFlush
    self.flush()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 748, in flush
    self._write(b"".join(self._buffer))
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 467, in write
    self._write(bytes)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 536, in _write
    self._tlsShutdownFinished(Failure())
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 392, in _tlsShutdownFinished
    self.transport.loseConnection()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/_newtls.py", line 204, in loseConnection
    FileDescriptor.loseConnection(self)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/abstract.py", line 412, in loseConnection
    self.stopReading()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/abstract.py", line 426, in stopReading
    self.reactor.removeReader(self)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/epollreactor.py", line 178, in removeReader
    self._remove(
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/epollreactor.py", line 169, in _remove
    primary.remove(fd)
builtins.KeyError: 11
adiroiban commented 9 months ago

To reproduce, I am usung this client code ... it usually fails after 4 connections with epoll as the epoll is so much faster than select

It's much harder to reproduce with the select reactor, which fails maybe after 40 connections.

import requests

url = 'https://127.0.0.1:8080'
try:
    for count in range(100):
        with requests.get(url, stream=True, verify=False) as r:
            i = r.iter_content(chunk_size=100)
except Exception:
    print(f'Failing after {count}')
adiroiban commented 9 months ago

with 23.8.0 there is no segfault but I get this error

Unhandled Error
Traceback (most recent call last):
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/log.py", line 80, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/posixbase.py", line 482, in _doReadOrWrite
    why = selectable.doRead()
--- <exception caught here> ---
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/tcp.py", line 1407, in doRead
    transport = self.transport(
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/tcp.py", line 810, in __init__
    self.startReading()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/abstract.py", line 438, in startReading
    self.reactor.addReader(self)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/epollreactor.py", line 117, in addReader
    self._add(
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/epollreactor.py", line 103, in _add
    self._poller.modify(fd, flags)
builtins.FileNotFoundError: [Errno 2] No such file or directory

or this one

Unhandled Error
Traceback (most recent call last):
  File "/home/adi/chevah/twisted-bug/server.py", line 41, in <module>
    reactor.run()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 693, in run
    self.mainLoop()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 703, in mainLoop
    self.runUntilCurrent()
--- <exception caught here> ---
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 1088, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/web/http.py", line 2581, in forceAbortClient
    self.transport.abortConnection()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 430, in abortConnection
    self.transport.abortConnection()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/tcp.py", line 199, in abortConnection
    self.stopWriting()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/abstract.py", line 434, in stopWriting
    self.reactor.removeWriter(self)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/epollreactor.py", line 189, in removeWriter
    self._remove(
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/epollreactor.py", line 166, in _remove
    del selectables[fd]
builtins.KeyError: 10

So I guess the error was always there, it's just that with bigger writes, we get a segfault.

adiroiban commented 9 months ago

with faulthandler enabled ...

diff --git a/server.py b/server.py
index 5de7184..c400a1b 100644
--- a/server.py
+++ b/server.py
@@ -1,8 +1,12 @@
+import faulthandler
+faulthandler.enable()
 import sys
$ python server.py 
malloc(): unsorted double linked list corrupted
Fatal Python error: Segmentation fault

Current thread 0x00007f9e47cff6c0 (most recent call first):
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/OpenSSL/SSL.py", line 2026 in send
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.pyFatal Python error: "Aborted, line 

526 in _write
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 467 in write
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 748 in flush
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 727 in write
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 797 in writeSequence
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/web/http.py", line 2668 in writeSequence
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/web/http.py", line 1247 in write
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/web/server.py", line 267 in write
  File "/home/adi/chevah/twisted-bug/server.py", line 36 in doRender
  File "/home/adi/chevah/twisted-bug/server.py", line 22 in _doRender
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/context.py", line 82 in callWithContext
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/context.py", line 117 in callWithContext
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/threadpool.py", line 285 in <lambda>
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/threadpool.py", line 269 in inContext
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/_threads/_team.py", line 192 in doWork
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/_threads/_threadworker.py", line 49 in work
  File "/usr/lib/python3.11/threading.py", line 982 in run
  File "/usr/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "
Extension modules: /usr/lib/python3.11/threading.pyzope.interface._zope_interface_coptimizations", line 1002 in _bootstrap

Thread 0x00007f9e4a89d040 (most recent call first):
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/OpenSSL/SSL.py", line 2153 in bio_read
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 263 in _flushSendBIO
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 544 in _write
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 467 in write
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 748 in flush
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 742 in _scheduledFlush
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 1090 in runUntilCurrent
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 705 in mainLoop
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 695 in run
  File "/home/adi/chevah/twisted-bug/server.py", line 45 in <module>
, _cffi_backend
Extension modules:  (total: zope.interface._zope_interface_coptimizations2)
Aborted (core dumped)

zope is red-hering. I removed the zope optimization $ rm venv/lib/python3.11/site-packages/zope/interface/_zope_interface_coptimizations.c* and got this error

$ python server.py 
malloc(): unsorted double linked list corrupted
Fatal Python error: Segmentation fault

Current thread 0x00007f490b8146c0 (most recent call first):
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/OpenSSL/SSL.py", line 2026 in send
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 526 in _write
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 467 in write
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py"Fatal Python error: , line Aborted748

 in flush
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 727 in write
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 797 in writeSequence
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/web/http.py", line 2668 in writeSequence
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/web/http.py", line 1247 in write
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/web/server.py", line 267 in write
  File "/home/adi/chevah/twisted-bug/server.py", line 36 in doRender
  File "/home/adi/chevah/twisted-bug/server.py", line 22 in _doRender
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/context.py", line 82 in callWithContext
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/context.py", line 117 in callWithContext
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/threadpool.py", line 285 in <lambda>
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/python/threadpool.py", line 269 in inContext
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/_threads/_team.py", line 192 in doWork
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/_threads/_threadworker.py", line 49 in work
  File "/usr/lib/python3.11/threading.py", line 982 in run
  File "/usr/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/usr/lib/python3.11/threading.py", line 1002 in _bootstrap

Thread 0x00007f490e3ea040 (most recent call first):
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/OpenSSL/SSL.py", line 2153 in bio_read
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 263 in _flushSendBIO
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 544 in _write
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 467 in write
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 748 in flush
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 742 in _scheduledFlush
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 1090 in runUntilCurrent
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 705 in mainLoop
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 695 in run
  File "/home/adi/chevah/twisted-bug/server.py", line 45 in <module>

Extension modules: _cffi_backend (total: 1)
Aborted (core dumped)
itamarst commented 9 months ago

@fmavb calling Twisted APIs from the non-reactor thread is unsupported and you should never do it. So fix that and the problem will go away on your side. It's still worth fixing the segfault since that shouldn't happen either, but what you're doing in the example is fundamentally unsupported.

glyph commented 9 months ago

Given that the reproducer illegally writes to the reactor from a thread, I am inclined to close. The real issue here, if there is one, would be that we might want to have a TWISTED_CHECK_THREAD_SAFETY env var or something that would add the potentially expensive instrumentation around all the reactor APIs to raise exceptions if you do this. @fmavb is the thread present in your application or was that just supposed to be for demonstration purposes?

glyph commented 9 months ago

It's still worth fixing the segfault since that shouldn't happen either

If this is a thread-safety issue inside OpenSSL, it's not clear that there's much we could do without paranoid main-thread checks on every interaction with its API. I wouldn't want to add that without pretty good evidence that its overhead was low.

fmavb commented 9 months ago

Hi.

Thank you very much for your responses.

Could you please elaborate on what you mean by saying: "calling Twisted APIs from the non-reactor thread is unsupported and you should never do it"? Could you please highlight the lines of code that are causing this issue?

Thank you!

Kind regards, A.

adiroiban commented 9 months ago

Could you please elaborate on what you mean by saying: "calling Twisted APIs from the non-reactor thread is unsupported and you should never do it"? Could you please highlight the lines of code that are causing this issue?

The main reason why Twisted exists it to make it easy to implement concurrent IO processing without threads.

request.write() and request.finish() should not be called from arbitrary threads.


You can use reactor.callInThread(your_thread_handling) but the your_thread_handling should not use any of the Twisted API. It is designed for things like using sqlite API and then once the result is ready, you can use reactor.callFromThread to set the value into some shared memory.

Or just use threads.deferToThread(your_thread_handling) to "spawn" and retrieve the result in a thread safe way. You your_thread_handling can return a value, but should not call any Twisted API.

More info https://docs.twisted.org/en/latest/core/howto/threading.html

adiroiban commented 9 months ago

I would be nice to investigate the segfault error.

Note that even if threads are not used,

-        deferToThreadPool(reactor, reactor.getThreadPool(), _doRender)
+        deferLater(reactor, 0.000001, _doRender)

the error is

$ python server.py 
Unhandled Error
Traceback (most recent call last):
  File "/home/adi/chevah/twisted-bug/server.py", line 42, in <module>
    reactor.run()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 695, in run
    self.mainLoop()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 705, in mainLoop
    self.runUntilCurrent()
--- <exception caught here> ---
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/base.py", line 1090, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 742, in _scheduledFlush
    self.flush()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 748, in flush
    self._write(b"".join(self._buffer))
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 467, in write
    self._write(bytes)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 536, in _write
    self._tlsShutdownFinished(Failure())
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/protocols/tls.py", line 392, in _tlsShutdownFinished
    self.transport.loseConnection()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/_newtls.py", line 204, in loseConnection
    FileDescriptor.loseConnection(self)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/abstract.py", line 412, in loseConnection
    self.stopReading()
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/abstract.py", line 426, in stopReading
    self.reactor.removeReader(self)
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/epollreactor.py", line 178, in removeReader
    self._remove(
  File "/home/adi/chevah/twisted-bug/venv/lib/python3.11/site-packages/twisted/internet/epollreactor.py", line 169, in _remove
    primary.remove(fd)
builtins.KeyError: 11
glyph commented 9 months ago

Note that even if threads are not used,

Okay now we're cooking

exarkun commented 8 months ago

Is the most recent version of the reproducer, without threads, available as a single complete unit somewhere?

adiroiban commented 8 months ago

For reference here is the server code

"""
This is some code to help reproduce the bug reported at
https://github.com/twisted/twisted/issues/12036

Don't use this code in production.

pip install pem
pip install -e .[tls]
"""
# Set to `True` to force using the select reactor,
# otherwise, the default reactor is used.
USE_SELECT = True
# Set to `True`` to force the invalid usage of threads
USE_THREADS = False

import os
import random
import sys
import faulthandler
faulthandler.enable()

if USE_SELECT:
    from twisted.internet import selectreactor
    if not 'twisted.internet.reactor' in sys.modules:
        selectreactor.install()

from twisted.web import server, resource
from twisted.internet import reactor, task, ssl
from twisted.internet.threads import deferToThreadPool
from twisted.web.server import NOT_DONE_YET

from pem.twisted import certificateOptionsFromFiles

key = """
-----BEGIN PRIVATE KEY-----
MIIEvgIBADANBgkqhkiG9w0BAQEFAASCBKgwggSkAgEAAoIBAQDFJH6YYZY9X+cJ
N0AS72ogJ6kCGf0V9BslABUfp9sU3Gohc+b6STM9d0xrwUlvD4+yDu5ZmCojdr9b
CqJZYVdwq66Z/LAthIUIlOZODb4FgBOcwWnrwn6mueGPe5kEmS4AoPDD4IQFQcWv
bMlQB2xkn/gHNqJIyG5FnqD9MujBvo0cCaXXV5ccTojhtPWqdmR8iSsBORe1Gz7W
rZVbuard/6SUbMdEmVdlMpHxIzl6SYhIv/UkEfzMU6p1z80VQCub7RWgU0faTf3V
7bwelBDMM0l/uz97iaMp80OEbK69bTj7AVal96UqqGQvY4J2wU2w4C+kJW174CFz
6QUC8TAxAgMBAAECggEACOjm6iL3m+NT89bB/VPB3N1WANY6WeM7/9PYCzkg8w7U
IsS+zWnWAu2Q8AoetBeEs8mDdwakGZFiFsQCsDNF9P25XinCr51JOD5R204Ohr1H
WvgN013UBLEqyHs6SvApBAEXaI7i1exR+iVFQunsTu7rki3jWefpve6W/DfbDPh9
TWvWdFRTEQD9jULaCYQ33MwD1m2c6JblP6oOi4rfZIA7KcsSCfwoXkNI+CWMRmAR
APSjLb5EgsT/VOlPQCdF3OCEyWfSdK6SolatkdCYTWey+nE9S3D564hacOgtBzcc
72l4bUDoVTZoyDSLGZ14zcU0dw3sxp+EGQKzRBH7jQKBgQDokY38/QFUEehPneTv
UUYRfU0tiiiyZv3Gm2XfB2L4Ghl3FNzHhkBjo2jxWAMyZL1iwAPXgULNFfSOO6em
W+8jX9RyndPbXgdkPdOV5hspTneK/nQi0DRddjZ9ZECbOdKRIgyikONs3QIkVWYx
iNsrNHZc9coOYGjCfMhjM3YDpQKBgQDZATn51pbR8IQCM4k2YQ+Ixu95GgafH8MX
e1liMfX5SCfsOguCIFtp6hRtYwLKyK5kCS/rylANzorkmkUQC9kzuaO9XtMv2qXy
4Jj6yF1owyxHklJgvUIngEXxAjxJnVV0e+ZkOuC6izIhZ1cRxzGH3qI6M9KGAQsy
z7uMnmrknQKBgCdwl62xkkoZh1YJJsvW8pUAtNfmOCSZBqxpqgr6i7QbaPoi1ZYY
lE7RZXiYPmKLZvqaKYygZR22fXSpdTLnIFTVaYfOcCEhfwtm2Lzw+uY0Yl117WgB
+v9mjrMRD0vDsIvgRFhFTewmRri2qs/3r4NSMEhvfkaV3u4sjycTkQ3pAoGBAJcF
BHz7u0BlmukRRpeCmtX8ZGIqG5uX9iScYks1K9KNRcNfbezkadne3tJ1ilbnrH3P
6vq0dFv3dq+7MP8Gz+Twd6Rtnpl6F+1cIvq8gE1NLr6sdQoADE8DNSmR/bTZesQz
CdaNJyKLhpq6mVy0/KJq7QmGUSeTCGEXMpWdFhT9AoGBANXP5QHlKVDysP+RZ21a
Np9ny/c8Pmjf2k684jEaHVeAPCsfjqkfot3tbc1i+0FaHn77fTarxgRmksv0iTzr
CngShr3B/GrL2tjNrMs9CzAdSpvaXtayo+bKVExzS9Q8bCfigt5+gO6Adrhqf8Lr
mFhvv+rjGT5pRFQiSYJbKRqs
-----END PRIVATE KEY-----
"""

cert = """
-----BEGIN CERTIFICATE-----
MIIDazCCAlOgAwIBAgIUdVCC5OktcYITyzWJLTZ9iOsloEQwDQYJKoZIhvcNAQEL
BQAwRTELMAkGA1UEBhMCQVUxEzARBgNVBAgMClNvbWUtU3RhdGUxITAfBgNVBAoM
GEludGVybmV0IFdpZGdpdHMgUHR5IEx0ZDAeFw0yMzExMTcxMjAwNDhaFw0yNDEx
MTYxMjAwNDhaMEUxCzAJBgNVBAYTAkFVMRMwEQYDVQQIDApTb21lLVN0YXRlMSEw
HwYDVQQKDBhJbnRlcm5ldCBXaWRnaXRzIFB0eSBMdGQwggEiMA0GCSqGSIb3DQEB
AQUAA4IBDwAwggEKAoIBAQDFJH6YYZY9X+cJN0AS72ogJ6kCGf0V9BslABUfp9sU
3Gohc+b6STM9d0xrwUlvD4+yDu5ZmCojdr9bCqJZYVdwq66Z/LAthIUIlOZODb4F
gBOcwWnrwn6mueGPe5kEmS4AoPDD4IQFQcWvbMlQB2xkn/gHNqJIyG5FnqD9MujB
vo0cCaXXV5ccTojhtPWqdmR8iSsBORe1Gz7WrZVbuard/6SUbMdEmVdlMpHxIzl6
SYhIv/UkEfzMU6p1z80VQCub7RWgU0faTf3V7bwelBDMM0l/uz97iaMp80OEbK69
bTj7AVal96UqqGQvY4J2wU2w4C+kJW174CFz6QUC8TAxAgMBAAGjUzBRMB0GA1Ud
DgQWBBQKhIrq8GQnV5O9XCFeTSPLgi137jAfBgNVHSMEGDAWgBQKhIrq8GQnV5O9
XCFeTSPLgi137jAPBgNVHRMBAf8EBTADAQH/MA0GCSqGSIb3DQEBCwUAA4IBAQBb
YXBAL3+yRXh5ahGZDO4RxkCyYv6k+sa6hxRJo0xoJ3/ORhDTnUy7f0TOZd9uOCr3
2ciTdCFZokLZdoI1uT0NIc3eHcWXe4SLGM5/99951gRnnuq/qAbxJh6t+P9z6buy
DiBqZoiN9/UBPK0BhmlIKv067BwiuzN1IbQEB4q50K/aUfCb14kUPhE4VRn63QiH
X7WxWigsVpIqI/PJkjIB52kxelycXysTOfMdIzBTp3VY/j74o64YNq3/6tKpS0Ip
sW451kqIRXsmKfMrmn5Ph8nAfzZupzMlshNvNadf2EmKIpSs1lqBm1wI450y5sYs
fZPLcH//fZCtXUR63xWV
-----END CERTIFICATE-----
"""

class Simple(resource.Resource):
    isLeaf = True

    def render_GET(self, request):
        def _doRender():
            try:
                self.doRender(request)
            except:
                return ''
            finally:
                if not request._disconnected:
                    request.finish()

        if USE_THREADS:
            # I don't know why the issue is more common when
            # deferToThreadPool is used.
            #reactor.callInThread(_doRender)
            deferToThreadPool(reactor, reactor.getThreadPool(), _doRender)
        else:
            task.deferLater(reactor, random.randint(1, 100)/100000, _doRender)
        return NOT_DONE_YET

    def doRender(self, request):
        request.write(bytes('<html><body>', 'utf-8'))
        request.write(bytes('<table><tr><th>Key</th><th>Value</th></tr>', 'utf-8'))
        for i in range(30000):
            request.write(bytes(f'<tr><td>{i}</td><td>{i}</td></tr>', 'utf-8'))
        request.write(bytes('</body></html>', 'utf-8'))
        return b''

open('cert-12036.pem', 'w').write(cert)
open('key-12036.pem', 'w').write(key)

site = server.Site(Simple())
options = certificateOptionsFromFiles('key-12036.pem', 'cert-12036.pem')
options.lowerMaximumSecurityTo = ssl.TLSVersion.TLSv1_3
reactor.listenSSL(8080, site, options)
if USE_SELECT:
    print('Using `select` reactor.')
else:
    print(f'Using default {reactor.__class__} reactor')

if USE_THREADS:
    print('Running server with incorrect usage of threads')
else:
    print('Running the server without threads.')

print('Server listening on 8080')
try:
    reactor.run()
finally:
    os.unlink('cert-12036.pem')
    os.unlink('key-12036.pem')
adiroiban commented 8 months ago

and here is the client code

"""
Get the deps:

pip install requests

Run `python server-12036.py` in a separate terminal.
Then run `python client-12036.py`

Wait a bit and see the the client process stopped.
Check the server process.

"""
import requests

url = 'https://127.0.0.1:8080'
try:
    for count in range(1000):
        with requests.get(url, stream=True, verify=False) as r:
            i = r.iter_content(chunk_size=100)
except Exception as error:
    print(f'Failing after {count } {error}')
adiroiban commented 8 months ago

With the incorrect usage of threads I get errors like


Unhandled Error
Traceback (most recent call last):
  File "/home/adi/chevah/twisted/src/twisted/python/log.py", line 96, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/home/adi/chevah/twisted/src/twisted/python/log.py", line 80, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/home/adi/chevah/twisted/src/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/adi/chevah/twisted/src/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
--- <exception caught here> ---
  File "/home/adi/chevah/twisted/src/twisted/internet/selectreactor.py", line 148, in _doReadOrWrite
    why = getattr(selectable, method)()
  File "/home/adi/chevah/twisted/src/twisted/internet/abstract.py", line 248, in doWrite
    self.dataBuffer = _concatenate(
  File "/home/adi/chevah/twisted/src/twisted/internet/abstract.py", line 30, in _concatenate
    return b"".join([memoryview(bObj)[offset:]] + bArray)
builtins.MemoryError: 

Today, I have executed the manual tests again, and when threads are not used, I was no longer able to reproduce the select connection closed errors