Add a seperate thread for Pinging the connected clients

miguelgrinberg commented 2 years ago

Thanks!

There are a few problems with this change that I hope you can address:

The ping() method uses a busy loop to determine when to send a ping packet. This is not a good solution, as it will use excessive amounts of CPU.
You have not implemented any logic to close the connection when a pong packet isn't received before it is time to send a new ping.
Using a separate thread for the pings is okay I guess, but I would have hoped there was a way to incorporate this into the existing thread, which already has a wait loop in it.
I think the ping/pong feature should also be enabled for the client, given that you have implemented this in the base class that is common to server and client classes.
(minor one) You have a print statement in the ping loop that needs to be removed.

sanhardik commented 2 years ago

Hello, I have implemented the closing of connection if ping response is not received and I also enabled it for Client.

I tried implementing it in the same thread but couldnt get it work. When I tried testing it, it would wait for the first message before starting the ping and was not sure if it was the right way of implementing it. I was hoping if you could make the required changes.

pfw commented 2 years ago

@miguelgrinberg I was just having a look at this and wondering if a clean place to support this might be in around https://github.com/miguelgrinberg/simple-websocket/blob/c7fc3edf3194d8600ca43ee0e0aa7f0067c7132a/src/simple_websocket/ws.py#L82 where the 'receive' is waiting for some input from the other side (if I'm reading the code correctly).

Since pings don't have to be always sent (ie. if you are sending data back and forward in less than what would be the ping interval the ping/pong is irrelevant) we could have a timeout set to the ping interval and if nothing appears before the timeout the ping is sent, getting data would reset the need to send a ping.

Sound reasonable?

miguelgrinberg commented 2 years ago

@pfw there are many applications where the server sends, but does not receive. This is called server-push and is one of the main reasons why people use WebSocket. So I don't believe your approach is a sound one. It may be an okay approach to use at the application level for applications that depend on receiving data from clients, but it is not a general solution that can be part of this package.

Since pings don't have to be always sent

I disagree with this. The main reason for the ping/pong mechanism is to detect when the other side is unresponsive and end the connection. This needs to happen at regular intervals, regardless of other traffic present in the line.

The secondary reason to use ping/pong is to prevent timeouts. This is achieved by selecting a ping interval that is 20 or 25 seconds, which helps avoid the 30 second timeouts in many platforms.

pfw commented 2 years ago

By putting this on the receive you don't actual need to have anything be received by the server, with the wait it gets into a loop with the ping interval but yes you do need to have a while True: receive loop happening on the server at some point.

From experience with large scale web sockets along the lines of what you are talking about it is unusual to not at least allow for the server to receive some message at all - often it can even be just a 'subscribe' message that comes in and the server picks a stream of data that it needs to send or a 'hello' message with auth details as cookie auth on web sockets doesn't work everywhere.

Basically I think it would be unusual to never have that receive loop happening - but I guess it could.

pfw commented 2 years ago

Thinking about this a bit more, given the nature of threads you are going to have the put the server side of the connection into some kind of loop or it will fall of the end and return..

@sock.route('/echo')
def echo(ws):
    while True:
        data = ws.receive()
        ws.send(data)

without that ws.receive in the while True loop you'd need something else to block the thread of the request otherwise it just ends. You could do that with a timer or similar but it needs to be in some loop since it is being driven from the decorator in websocket_route(...).

The appeal when starting to play with flask-sock was the simplicity of that loop in the handler having come from tornado and similar event based servers. For me, given my understanding/experience of websockets, it is worth the ease of use to accept that the function is the driver so it needs to behave in a certain way to work.

Or given the underlying thread for the websocket is already in a blocking loop on sock.recv you fire up another thread for the ping and use timeouts on an Event and then just wear the extra thread.

Either is fine but just has one of these limitations:

always have a receive loop running
have another thread running (goes from 2 to 3 for a client so not the end of the world but definite a cost).

Thoughts?

miguelgrinberg commented 2 years ago

@pfw There is two different things that I think you are mixing. If the issue you are trying to address is to prevent a connection close from being idle, then this can be solved in an extremely simple way at the application level. See how I changed your example above to do it:

@sock.route('/echo')
def echo(ws):
    while True:
        data = ws.receive(timeout=25)
        if data is None:
            ws.send('ping!')
            continue
        ws.send(data)

I feel we are arguing a moot point, because you are trying to defend the idea that all (or most) applications are going to call receive() in a loop, and this is completely unrelated to the ping/pong mechanism, which is intended to detect when the other side becomes unresponsive (and helps with preventing idle disconnections as a side effect).

Your solution hijacks ping packets and uses them only to prevent idle disconnections, and only in applications that block to receive data from clients, and only when the application does not use the timeout feature of the receive function. It's not a horrible solution by any means, it just covers the type of applications that you build. But it's not generic enough to be included in this package.

The ping/pong mechanism is an internal aspect of the WebSocket protocol that should happen completely outside of the application's control, in my opinion. This PR sort of achieves that, but still has a couple of serious problems.

pfw commented 2 years ago

Hi @miguelgrinberg.

I think there is a bit of talking past each other ;-)

Definitely not arguing for one particular thing here, well I guess not having an extra thread but more just looking at the different ways it can be done and still keep things simple.

I definitely understand the ping/pong relationship and it's not about hijacking it, rather where do you put it.

Your example above is the same as the one from the docs in as far as there being a receive in an infinite loop and is the reality of being on a thread, you must loop with a blocking call somewhere or you go away. Without that while True call the handler would finish straight away, the receive(timeout=25) just makes it a non-cpu busy loop.

My position is that if you are going to have a receive with any kind of timeout (1s or None) you can combine that with a ping interval to get the correct behaviour in the receive method. But I don't see that you can get away from having some kind of receive loop in the handler. You can avoid getting real values on it and I've built exactly the kind of mainly server send you mentioned but you must block.

So given that loop exists, can the receive be used in combination with timeout values on the Event to get a spot to put the pings without requiring an extra thread. That is what I'm heading towards, can the timeout values be manipulated so that you return the app in the timeout it requests and timeout internally for a ping to be sent if need be.

Now if the handler is made to work without a while True loop, then that opportunity goes away and it has to be another thread.

Just the tradeoff between the two things - I hope the that is clearer where I'm coming from.

miguelgrinberg commented 2 years ago

@pfw as I said before, as a general solution, implementing this in receive does not work. The ping packets should (ideally) be sent at regular intervals, let's say every 25 seconds. There is no way to do this if you rely on the application calling receive. I understand what you are saying regarding needing a loop, but lots of applications will want to wait on some other conditions inside the loop instead of on receiving data from the client. This is the point I'm trying to get across. If your application needs to wait for data from clients, then great, you can do something like this (if you are not concerned about the regularity of the pings, which I am). But for the general case, this does not work.

pfw commented 2 years ago

Understood, it is unfortunate that there is another thread involved.

It might be worth noting somewhere in the docs potentially that if the receive isn't called clients can fill the servers input_buffer by sending messages in when no receive is draining them. Of course lots of ways clients can do nasty things to servers but as a general rule unless someone really knows what they are doing behaving exactly like the example in the docs is very reliable and straightforward method, certainly the appeal as I see it. Things get tricky quickly outside that pattern.

I guess that is my point, the pattern in the docs is excellent for a straightforward websocket connection with relatively few connected clients (hundreds), once the app gets bigger than that it might be worth going to event driven fully given the thread load, alternate models of coding handlers etc. One package isn't going to be the right size for everything and it's very nice to have an option that works for a lot of cases within Flask.

miguelgrinberg commented 2 years ago

It might be worth noting somewhere in the docs potentially that if the receive isn't called clients can fill the servers input_buffer by sending messages in when no receive is draining them.

Yes, this makes sense. This can be done by adding this snippet inside your loop:

    # drain the input buffer to prevent overflows
    while ws.receive(timeout=0) is not None:
        pass

pfw commented 2 years ago

One reason I'm focussed on the received loop is past (painful) experience with large scale websocket environments - when the servers starts doing things which take time everything can blow up badly. Sending an email slowly or talking to a db which takes a long time within python can break the flow, anything that is expensive can affect the ping/pong flow since it has to flow through python at some point. Keep things in that very nice simple loop and things can run smoothly.

I'm pleased that my needs can be met with this as it stands :-)

pfw commented 2 years ago

I had a thought about this and did some experimenting - you can have pings at regular intervals without requiring an extra thread per socket, have a single thread to schedule all pings.

It goes like this:

There is a Pinger() class with a ping interval
It is created by whatever is using simple_websocket, eg. flask-sock in init_app
When a Server (or client) is created it is registered, pinger.register(ws)
the pinger sits in a loop controlled by Event.timeout with a value of the next ping deadline
if no pong was received by the ping a ping should be sent, shut the ws down and de-registered

I think that would satisfy your requirement for regular interval regardless of model of the client/server waiting and also means just a single extra thread for the whole app.

    def init_app(self, app):
        """Initialize the Flask-Socket extension.

        :param app: The Flask application instance. This method only needs to
                    be called if the application instance was not passed as
                    an argument to the constructor.
        """
        if self.app is None:
            app.register_blueprint(self.bp)

        self.pinger = Pinger(interval=3)

and


        def decorator(f):
            @wraps(f)
            def websocket_route(*args, **kwargs):  # pragma: no cover
                ws = Server(request.environ)
                self.pinger.register(ws)

Outside of those two changes it is all within simple-websocket and the same can be done with a client connection, possibly using a helper function to wrap the two steps.

Thoughts on that?

miguelgrinberg commented 2 years ago

@pfw one of the issues remaining in this PR is to eliminate the extra thread. The solution in this PR, when properly implemented would require no additional threads. Your solution would work, but sending all pings at the same time is less ideal than sending them at random times for each websocket, because a large number of pings sent all at the same time would create CPU usage spikes at the selected ping interval.

pfw commented 2 years ago

@miguelgrinberg The pings will only be sent at the correct intervals, I've got a prototype working but needs refinement on how to handle closing the socket when a pong isn't received so that the close exception is raised in the correct place.

Registration does this (with a collections.deque):

    def register(self, ws):
        self.deadlines.append((ws, time.time() + self.interval))

and the thread runs through that which tells it how long to wait for the next send and after that how long to check for a pong being received.

Basically it's a loop where the Event timeout value is the duration until the next deadline.

Does that make sense? I'll have some time today to refine so you can see the whole thing.

pfw commented 2 years ago

On top of that once the deadline loop is in place it would be easy to randomise the interval and that is something I've done in the past to deal with many 100k's of clients reconnecting after a server goes down for any reason. But in that case you will generally need help from a load balancer as well.

pfw commented 2 years ago

@miguelgrinberg Are you happy with the need for the extra setup in flask-sock to enable the pinger?

miguelgrinberg commented 2 years ago

@pfw I'm sorry I don't really understand what you want to do. But you seem to be ignoring the fact that the solution being worked on here when completed would require no additional threads.

pfw commented 2 years ago

Ok, if you can make it work with no extra threads that would be great but I don't suspect that is likely given the way the handler and socket threads are interacting with various blocking calls. How do you get the example working where the handler thread is blocking on receive indefinitely and the socket thread is blocked in recv indefinitely until the other end sends data?

The beauty of Hardik's first pass is that the handler can be done in any way but has the cost of an additional thread per socket (cpu load being relatively easy to solve).

My prototype just changes that to use a common thread for all pings. This gets all pings going at regular intervals (with jitter through random delay if required) on a single thread for all connections. No clumping of pings being sent and the handler can be written exactly as it is now in the example or any combination of other blocking loops.

pfw commented 2 years ago

@miguelgrinberg Have a look at these changes which is hopefully clearer than talking about.

single thread for all pings
initial ping sent at random to apply jitter
connection close exceptions raised on .send, .close, .receive in handler
no busy loop checking for sockets to ping

https://github.com/miguelgrinberg/flask-sock/compare/main...pfw:feature/ping?expand=1

https://github.com/miguelgrinberg/simple-websocket/compare/main...pfw:feature/ping?expand=1

This handler works as expected:

@sock.route("/echo")
def echo(ws):
    while True:
        try:
            data = ws.receive(5)
            ws.send(data or "no data")
        except ConnectionClosed as e:
            print("print connection was closed", e)
            break

It will work regardless of the blocking looping style in the handler - on the next send or receive after pong failure you get a close.

I'm happy to work that up to a clean pull request or just leave it there as reference.

I'd be interested to see how you go with no threads at all but this is available as a alternative.

miguelgrinberg commented 2 years ago

In the end there wasn't a lot I could reuse from this PR, but I have committed the ping/pong support and will be testing it over the next few days before a release. Thanks everyone!

pfw commented 2 years ago

Nice one, another corner of the standard library to keep in mind for the future.

miguelgrinberg / simple-websocket

Add a seperate thread for Pinging the connected clients #7