thsmi / sieve

Sieve Script Editor
GNU Affero General Public License v3.0

WebApp websocket connection gets closed after 60 seconds on default nginx configuration #878

Open Smith4545 opened 1 year ago

Smith4545 commented 1 year ago

Prerequisites

What happened?

This problem is specifically related to using nginx in conjunction with the WebApp! When editing any Sieve script via the WebApp, the user can no longer save his changes after being idle for a while. This obviously shouldn't happen. No matter how long the user doesn't do anything, he should always be able to save his stuff.

The reason for this behaviour is that nginx, by default, closes a connection to a proxied server after 60 seconds if the proxied server doesn't transmit any data in that timeframe (this is controlled by the proxy_read_timeout directive).

The probable solutions to this problem are:

What did you expect to happen?

No matter how long the user doesn't do anything, he should always be able to save his stuff.

Logs and Traces

2023-03-09 16:53:57 WARNING [handle_message] webserver.py : index out of range
2023-03-09 16:53:57 WARNING [handle_message] webserver.py : Traceback (most recent call last):
  File "/opt/thsmi/sieve/sieve-0.6.1-web/script/webserver.py", line 65, in handle_message
    handler.handle_request(context, request)
  File "/opt/thsmi/sieve/sieve-0.6.1-web/script/handler/websocket.py", line 48, in handle_request
    MessagePump().run(websocket, sievesocket)
  File "/opt/thsmi/sieve/sieve-0.6.1-web/script/messagepump.py", line 26, in run
    data = server.recv()
  File "/opt/thsmi/sieve/sieve-0.6.1-web/script/websocket.py", line 119, in recv
    opcode = data[0] & 0b00001111
IndexError: index out of range
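
The traceback is consistent with recv() getting an empty buffer once the proxy has closed the connection, so that data[0] fails. This is an inference from the log, not something confirmed in the project's code. A minimal sketch of guarding against that case follows; the names read_opcode and ConnectionClosed are hypothetical and do not come from the sieve code base:

```python
class ConnectionClosed(Exception):
    """Hypothetical error: the peer closed the socket before sending a frame."""


def read_opcode(data: bytes) -> int:
    """Return the WebSocket opcode from the first byte of a frame header."""
    # An empty read usually means the other side (here: the proxy) hung up.
    if not data:
        raise ConnectionClosed("socket closed before a frame header arrived")
    # The opcode lives in the low four bits of the first header byte.
    return data[0] & 0b00001111
```

With a check like this the message pump could report a closed connection explicitly instead of logging an IndexError.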

Which Version

thsmi commented 1 year ago

OK, this is an interesting one.

The WebApp tunnels Sieve messages via HTTP WebSockets.

Regular Sieve (plain TCP, not tunneled) requires a server to keep a connection open for at least 30 minutes. Because there were issues with this in the past, the actual keep-alive interval is way lower. It is currently 5 minutes, which looks like a good compromise, because keep-alive messages are rather heavyweight and Sieve implementations tend to get upset if you fall below a certain threshold.

So to be safe with nginx's default setting, the keep-alive interval would need to be reduced to less than 30 seconds. That would be technically possible, but it scales horribly and would generate lots of unnecessary load on the Sieve server.
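
To put a rough number on that load, here is a back-of-the-envelope sketch. It assumes one keep-alive message per open session per interval; the function name is made up for illustration:

```python
def keepalives_per_hour(interval_seconds: float, sessions: int = 1) -> float:
    """Keep-alive messages sent per hour across all open sessions."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return sessions * 3600 / interval_seconds


current = keepalives_per_hour(300)  # today's 5-minute interval
reduced = keepalives_per_hour(30)   # a 30-second interval
print(current, reduced, reduced / current)  # prints: 12.0 120.0 10.0
```

Under that assumption, going from 5 minutes to 30 seconds means ten times as many keep-alive messages for every open session.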

As far as I understand the nginx documentation, a WebSocket ping does not extend the read timeout, so a lightweight WebSocket-based keep-alive does not help.

So this basically breaks down into three tasks.

  1. Update the Readme and add a note that you need to increase the timeout on nginx.
  2. Currently the timeout is hardcoded to 5 minutes. But this should be a parameter controlled by the admin, so that you can reduce it to 30 seconds or less if you are sure your infrastructure is OK with it.
  3. The WebApp currently lacks reconnect logic. It should automatically reconnect whenever the connection is lost.
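
The reconnect logic from task 3 could look roughly like the sketch below. It is written in Python purely for illustration and is not the WebApp's actual code. The function connect_with_retry, its parameters and the exponential backoff policy are all assumptions:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, waiting longer after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return connect()
        except OSError:
            # Give up after the final attempt and surface the last error.
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ... capped at max_delay.
            sleep(min(base_delay * 2 ** attempt, max_delay))
    raise AssertionError("unreachable")
```

The caller passes in whatever function opens the connection. The sleep parameter is only there so the waiting can be replaced in tests.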

Smith4545 commented 1 year ago

I had to read that three times, but what I got out of that for the moment is:

I'm testing with 600s and it seems to work for now. Thank you for having a look into this!