cherrypy / cheroot

Cheroot is the high-performance, pure-Python HTTP server used by CherryPy. Docs -->
https://cheroot.cherrypy.dev
BSD 3-Clause "New" or "Revised" License
183 stars 90 forks source link

SABnzbd client - ENGINE Error in HTTP server: shutting down #710

Open ZoltrixGFC opened 3 months ago

ZoltrixGFC commented 3 months ago

❓ I'm submitting a ...

🐞 Describe the bug. What is the current behavior? Using SABnzbd 4.3.1 on Windows 11 Enterprise 23H2. Recently it has started to hang, displaying the following error in the logs just before it does hang:

Please, report this on the Cheroot tracker at <https://github.com/cherrypy/cheroot/issues/new/choose>, providing a full reproducer with as much context and details as possible.
Traceback (most recent call last):
  File "cheroot\workers\threadpool.py", line 119, in run
  File "cheroot\workers\threadpool.py", line 216, in _process_connections_until_interrupted
  File "cheroot\server.py", line 1364, in close
  File "cheroot\server.py", line 1484, in _close_kernel_socket
  File "ssl.py", line 1291, in shutdown
OSError: [WinError 10022] An invalid argument was supplied

2024-05-12 16:40:03,087::ERROR::[_cplogging:213] [12/May/2024:16:40:03] ENGINE Error in HTTP server: shutting down
Traceback (most recent call last):
  File "cherrypy\process\servers.py", line 225, in _start_http_thread
  File "cheroot\server.py", line 1845, in start
  File "cheroot\server.py", line 1833, in serve
  File "threading.py", line 1073, in _bootstrap_inner
  File "cheroot\workers\threadpool.py", line 119, in run
  File "cheroot\workers\threadpool.py", line 216, in _process_connections_until_interrupted
  File "cheroot\server.py", line 1364, in close
  File "cheroot\server.py", line 1484, in _close_kernel_socket
  File "ssl.py", line 1291, in shutdown
OSError: [WinError 10022] An invalid argument was supplied

πŸ’‘ To Reproduce Queue up downloads in the SABnzbd client, after a period of time the client will hang and stop responding.

πŸ’‘ Expected behavior Client not hang.

πŸ“‹ Details Logs shown above.

πŸ“‹ Environment

Safihre commented 3 months ago

cheroot==10.0.1 cherrypy==18.9.0

Python version we use is 3.12.3, not 3.8 :)

ZoltrixGFC commented 3 months ago

Just adding a more info.

It had been sitting idle for a while, then threw out the errors below and crashed.

2024-05-12 20:36:11,033::INFO::[postproc:138] Saving postproc queue
2024-05-12 20:36:11,033::INFO::[downloader:409] Post-processing finished, resuming download
2024-05-12 20:36:15,047::INFO::[nzbqueue:230] Saving queue
2024-05-12 20:36:15,049::INFO::[postproc:138] Saving postproc queue
2024-05-12 20:36:15,051::INFO::[notifier:157] Sending notification: SABnzbd - Queue finished (type=queue_done, job_cat=None)
2024-05-12 20:36:21,625::INFO::[_cplogging:213] [12/May/2024:20:36:21] ENGINE Client ('192.168.0.26', 58176) lost   peer dropped the TLS connection suddenly, during handshake: (8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)')
2024-05-12 20:36:21,628::INFO::[_cplogging:213] [12/May/2024:20:36:21] ENGINE Client ('192.168.0.26', 58164) lost   peer dropped the TLS connection suddenly, during handshake: (8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)')
2024-05-12 20:37:29,339::INFO::[database:208] [N/A] Moved job SABnzbd_nzo_2a1nfhb6 to archive
2024-05-12 21:10:33,102::INFO::[notifier:157] Sending notification: Error - [12/May/2024:21:10:33] ENGINE A fatal exception happened. Setting the server interrupt flag to OSError(22, 'An invalid argument was supplied', None, 10022, None) and giving up.

Please, report this on the Cheroot tracker at <https://github.com/cherrypy/cheroot/issues/new/choose>, providing a full reproducer with as much context and details as possible.
Traceback (most recent call last):
  File "cheroot\workers\threadpool.py", line 119, in run
  File "cheroot\workers\threadpool.py", line 216, in _process_connections_until_interrupted
  File "cheroot\server.py", line 1364, in close
  File "cheroot\server.py", line 1484, in _close_kernel_socket
  File "ssl.py", line 1291, in shutdown
OSError: [WinError 10022] An invalid argument was supplied
 (type=error, job_cat=None)
2024-05-12 21:10:33,102::CRITICAL::[_cplogging:213] [12/May/2024:21:10:33] ENGINE A fatal exception happened. Setting the server interrupt flag to OSError(22, 'An invalid argument was supplied', None, 10022, None) and giving up.

Please, report this on the Cheroot tracker at <https://github.com/cherrypy/cheroot/issues/new/choose>, providing a full reproducer with as much context and details as possible.
Traceback (most recent call last):
  File "cheroot\workers\threadpool.py", line 119, in run
  File "cheroot\workers\threadpool.py", line 216, in _process_connections_until_interrupted
  File "cheroot\server.py", line 1364, in close
  File "cheroot\server.py", line 1484, in _close_kernel_socket
  File "ssl.py", line 1291, in shutdown
OSError: [WinError 10022] An invalid argument was supplied

2024-05-12 21:10:33,250::ERROR::[_cplogging:213] [12/May/2024:21:10:33] ENGINE Error in HTTP server: shutting down
Traceback (most recent call last):
  File "cherrypy\process\servers.py", line 225, in _start_http_thread
  File "cheroot\server.py", line 1845, in start
  File "cheroot\server.py", line 1833, in serve
  File "threading.py", line 1073, in _bootstrap_inner
  File "cheroot\workers\threadpool.py", line 119, in run
  File "cheroot\workers\threadpool.py", line 216, in _process_connections_until_interrupted
  File "cheroot\server.py", line 1364, in close
  File "cheroot\server.py", line 1484, in _close_kernel_socket
  File "ssl.py", line 1291, in shutdown
OSError: [WinError 10022] An invalid argument was supplied

2024-05-12 21:10:33,250::INFO::[_cplogging:213] [12/May/2024:21:10:33] ENGINE Bus STOPPING
2024-05-12 21:10:33,354::INFO::[_cplogging:213] [12/May/2024:21:10:33] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('192.168.0.31', 10443)) shut down
2024-05-12 21:10:33,354::INFO::[_cplogging:213] [12/May/2024:21:10:33] ENGINE Bus STOPPED
2024-05-12 21:10:33,354::INFO::[_cplogging:213] [12/May/2024:21:10:33] ENGINE Bus EXITING
2024-05-12 21:10:33,354::INFO::[_cplogging:213] [12/May/2024:21:10:33] ENGINE Bus EXITED
webknjaz commented 3 months ago

Are you able to come up with a Cheroot-only reproducer / regression test that would run in our CI? I haven't used Windows for like a decade so can't test it locally but a CI failure can be inspected remotely, if the need be...

webknjaz commented 3 months ago

FWIW 10.0.1 contains some bugfixes that I thought should be backwards compatible. If 10.0.0 didn't fail, the regression is somewhere in the diff. Although, it's also possible that the problem existed before and got more evident with the patches around exception processing and logging..

webknjaz commented 3 months ago

dupe of #375?

ZoltrixGFC commented 3 months ago

Now starting to wonder if it was my reverse proxy causing the issue. I posted the following info on the SABnzbd thread:

In the errors I posted above, 192.168.0.26 is a docker host running nginx proxy. I had SSL enabled on SABnzbd running via the reverse proxy, and I noticed the SSL error listed above:

2024-05-12 20:36:21,625::INFO::[_cplogging:213] [12/May/2024:20:36:21] ENGINE Client ('192.168.0.26', 58176) lost   peer dropped the TLS connection suddenly, during handshake: (8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)')
2024-05-12 20:36:21,628::INFO::[_cplogging:213] [12/May/2024:20:36:21] ENGINE Client ('192.168.0.26', 58164) lost   peer dropped the TLS connection suddenly, during handshake: (8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)')

Last night I did a fresh install of SABnzbd, left SSL off, and set my other services to correct directly to SABnzbd (not via nginx), so far it has stayed up.

No idea if the reverse proxy could have been an issue at all?

Safihre commented 3 months ago

@webknjaz Seems indeed #375. This is not one that can just be caught like you implemented for the other problems?

webknjaz commented 3 months ago

That was my working theory and I started researching it by checking out what Python stdlib does. Though, I didn't get to test it. If someone wants to check out if such a patch would work, please do and report back.

P.S. This issue also claims hanging which is something I'd expect to be impossible in v10.0.1. So it'd be nice to unpack what it means actually..

webknjaz commented 3 months ago

2024-05-12 20:36:21,625::INFO::[_cplogging:213] [12/May/2024:20:36:21] ENGINE Client ('192.168.0.26', 58176) lost peer dropped the TLS connection suddenly, during handshake: (8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)') 2024-05-12 20:36:21,628::INFO::[_cplogging:213] [12/May/2024:20:36:21] ENGINE Client ('192.168.0.26', 58164) lost peer dropped the TLS connection suddenly, during handshake: (8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)')

I added these log entries in the last release to make it cleaner what's happening.. Hoping to improve the formatting later.

webknjaz commented 3 months ago

No idea if the reverse proxy could have been an issue at all?

For that, I'd need to know how it's set up. The latest version shouldn't run out of threads and hang..

If you set it up to use plain http when connecting to the https port, it would obviously not work.

dguilloryjr commented 3 months ago

I have this issue happening on my device right now. I will post logs the next time it stops. I just turned on debug logging today...

thezoggy commented 3 months ago

seen a few others reporting this issue, so far it seems like its only been reported since we upgraded cheroot from 10.0.0 to 10.0.1 (cherrypy==18.9.0 been the same on both versions).

Most of the users seem to be on windows. This morning a Windows 11 Enterprise (build 22631) user reported the issue has happening, goes to sleep and wakes up the next day to find application broken. It was just idling, but there are a few apps querying every 30s to check on status.. so 4x hits every 1 min (two hits from each app). So nothing to excessive, but per logs you just see the hits every 30s until eventually cheroot throws the error:

2024-05-20 18:19:31,125::CRITICAL::[_cplogging:213] [20/May/2024:18:19:31] ENGINE A fatal exception happened. Setting the server interrupt flag to OSError(22, 'An invalid argument was supplied', None, 10022, None) and giving up.

Please, report this on the Cheroot tracker at <https://github.com/cherrypy/cheroot/issues/new/choose>, providing a full reproducer with as much context and details as possible.
Traceback (most recent call last):
  File "cheroot\workers\threadpool.py", line 119, in run
  File "cheroot\workers\threadpool.py", line 216, in _process_connections_until_interrupted
  File "cheroot\server.py", line 1364, in close
  File "cheroot\server.py", line 1484, in _close_kernel_socket
OSError: [WinError 10022] An invalid argument was supplied

Confirmed with user they are not using a reverse proxy or anything. Only thing change was they upgraded to newer version of our app which included newer cheroot. Asked user to downgrade back to previous version of the app so it downgrades cheroot to 10.0.0 so we can confirm issue goes away.. which he has done now. So waiting to see.

I tried to replicate it myself on window10 but I have not been able to. :(

Safihre commented 3 months ago

Since we got more complaints, I will 'fix' this for us by extending errors.acceptable_sock_shutdown_exceptions with OSError. I (somewhat) understand you first want a reproducer before catching this, but I have end-users that I need to keep happy πŸ˜