python-websockets / websockets

Library for building WebSocket servers and clients in Python
https://websockets.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Send-only client websockets can't detect their peer's request for closure. #899

Closed: vkottler closed this issue 3 years ago

vkottler commented 3 years ago

Per: https://websockets.readthedocs.io/en/stable/_modules/websockets/protocol.html#WebSocketCommonProtocol.send

Canceling send() is discouraged. Instead, you should close the connection with close(). Indeed, there are only two situations where send() yields control to the event loop:

  1. The write buffer is full. If you don’t want to wait until enough data is sent, your only alternative is to close the connection. close() will likely time out then abort the TCP connection.
  2. message is an asynchronous iterator. Stopping in the middle of a fragmented message will cause a protocol error. Closing the connection has the same effect.

It seems then that you actually would have to poll recv with wait_for in order to bubble up the connection-closing exception condition. Why is that?

Currently you can just continue successfully awaiting send despite the peer clearly and detectably trying to close the connection. This is a bug.
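
For concreteness, the pattern at issue is a plain send-only loop along these lines (a minimal sketch; produce_message is a hypothetical stand-in):

    async def send_only(websocket):
        # A send-only handler: it never awaits recv() or wait_closed(), so
        # (per this report) a peer-initiated close can go unnoticed and
        # send() keeps succeeding.
        while True:
            await websocket.send(produce_message())  # produce_message is hypothetical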

vkottler commented 3 years ago

For example, I honestly don't see a way around a server -> client writer design pattern other than something like:

import asyncio
from queue import Empty, Queue

import websockets

# (runs inside an async connection handler where `websocket` is bound;
# Queue/Empty are assumed to come from the standard `queue` module, and
# register/unregister_message_queue_as_sink are placeholders)

should_continue = True
message_queue = Queue()
register_message_queue_as_sink(message_queue)

try:
    while should_continue:
        try:
            messages = []
            try:
                # drain everything we can since we're in a polling mode
                while True:
                    messages.append(message_queue.get_nowait())
            except Empty:
                pass

            for message in messages:
                # allow null injection to provide a server-side, close-initiation mechanism
                if message is None:
                    should_continue = False
                    break
                await websocket.send(message)

            # try to read the channel to advance connection state (by getting an exception...)
            if should_continue:
                await asyncio.wait_for(websocket.recv(), timeout=0.1)

        except asyncio.exceptions.TimeoutError:
            pass
        except websockets.exceptions.WebSocketException:
            should_continue = False
finally:
    unregister_message_queue_as_sink(message_queue)

To summarize:

  1. We only want to send, and we implement our connection handler as one that drains a queue that's populated externally.
  2. Instead of operating in an optimal mode, where we get an item from the queue, determine whether it's a valid message or a cancellation signal, and send it, we eventually have to await a recv on each iteration just to advance the state of the connection.
  3. As a result, we now have to drain the queue as aggressively as possible each time around, because the recv has turned our fully asynchronous dispatcher into a synchronous polling loop.

Am I missing something?

aaugustin commented 3 years ago

From https://websockets.readthedocs.io/en/stable/cheatsheet.html:

If you aren’t awaiting recv(), consider awaiting wait_closed() to detect quickly when the connection is closed.

For example you can do this:

    await asyncio.wait([
        asyncio.create_task(function_that_sends_messages(websocket)),
        asyncio.create_task(websocket.wait_closed()),
    ], return_when=asyncio.FIRST_COMPLETED)

(There are many other patterns that achieve a similar effect.)
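
For instance, one fuller version of that pattern (a hedged sketch, not from the thread; send_from_queue and the None sentinel are illustrative) races a queue-draining sender against wait_closed() and cancels whichever task is still pending:

    import asyncio

    async def send_from_queue(websocket, outgoing):
        # Forward messages from an asyncio.Queue until a None sentinel arrives.
        while True:
            message = await outgoing.get()
            if message is None:
                break
            await websocket.send(message)

    async def handler(websocket, outgoing):
        # Race the sender against connection closure; whichever finishes
        # first wins, and the loser is cancelled explicitly.
        sender = asyncio.create_task(send_from_queue(websocket, outgoing))
        closed = asyncio.create_task(websocket.wait_closed())
        done, pending = await asyncio.wait(
            {sender, closed}, return_when=asyncio.FIRST_COMPLETED
        )
        for task in pending:
            task.cancel()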

aaugustin commented 3 years ago

Coming back to your original message, you wrote:

Currently you can just continue successfully awaiting send despite the peer clearly and detectably trying to close the connection. This is a bug.

If you allow me to answer with the same level of assertiveness, you don't understand concurrency and cooperative multitasking.

The only reason why websockets wouldn't be able to process the closing handshake received from the peer — and then send() would raise an exception — is if you starve it by never yielding control.

Awaiting recv() when no message is received will yield control, which could put you under the illusion that recv() plays a role here. But it doesn't.

See #867 for details on this scenario.
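
To illustrate (a hedged sketch, not from the original comment): any await that yields to the event loop, not just recv(), gives the library's background reader a chance to process an incoming close frame, after which send() raises an exception:

    import asyncio

    async def producer(websocket):
        while True:
            await websocket.send("tick")
            # Any yield to the event loop (recv() is not required) lets the
            # library's background reader process an incoming close frame;
            # the next send() then raises ConnectionClosed.
            await asyncio.sleep(1)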

vkottler commented 3 years ago

If you allow me to answer with the same level of assertiveness, you don't understand concurrency and cooperative multitasking.

I don't really understand that comment given that my issue somewhat duplicates #865 and #867 and is inspired by send(2):

The send() call may be used only when the socket is in a connected state (so that the intended recipient is known).

Your library provides an analog of this system call, but with a different abstraction: you must advance connection state in-band (yes, in the same event loop, I understand that). This is not an inherent problem with event loops or asyncio, just with the exposed semantics.

Thank you for the suggestion; I hadn't seen that in the documentation. I still think this should be considered a bug, but I'm closing the issue, as I'm not interested in the asyncio combat that I see in so many of this project's issues and that was unsurprisingly brought here as well. If I need adjustments to the library, I will simply fork.

aaugustin commented 3 years ago

In case someone else reads this, for the record, "you must advance connection state in-band" really means "the library cannot advance its state if it is starved by a non-cooperative thread in a cooperative threading environment".

vkottler commented 3 years ago

In case someone else reads this, for the record, "you must advance connection state in-band" really means "the library cannot advance its state if it is starved by a non-cooperative thread in a cooperative threading environment".

Sorry, but no. The library will not advance connection state on send, so irrelevant and speculative calls must be made to schedule additional tasks; those tasks then have to race each other, and the caller now has to handle cleanup of the nominally unfinished-but-scheduled tasks, or else leave the event loop in a bad state when stopping it or cleaning up program execution.
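
Concretely, the cleanup being objected to looks something like this (a sketch, assuming the sender/closed tasks from the pattern shown earlier):

    done, pending = await asyncio.wait(
        {sender, closed}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()
    # Await the cancelled tasks so nothing is left scheduled on the loop;
    # return_exceptions=True absorbs the resulting CancelledError.
    await asyncio.gather(*pending, return_exceptions=True)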

Additionally:

if it is starved by a non-cooperative thread in a cooperative threading environment

Event loops are single-threaded and not thread safe, so maybe you meant "starved by a task that does not yield"? But the problem with that is that we do yield, to send, yet send does not yield correctly per the semantics of a send system call. This is the basis of this issue. Your comment suggests that this issue was opened on a false premise, which it was not.

This issue has nothing to do with managing an event loop or interfacing multiple threads. I genuinely don't understand what's confusing here. The issue as stated is valid and is solvable.

aaugustin commented 3 years ago

OK - maybe I misunderstood. You stated solutions that you found unsatisfactory, and we started a discussion, but I'm not sure we agreed on the actual problem. Maybe let's start by agreeing on the problem?

The best description I'm finding in our discussion is:

It seems then that you actually would have to poll recv with wait_for in order to bubble up the connection-closing exception condition.

and:

I honestly don't see a way around a server -> client writer design pattern other than


Can you take a look at this example: https://websockets.readthedocs.io/en/stable/intro.html#browser-based-example Is the Python script what you mean by "a server -> client writer"?
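
For reference, the server in that example is essentially a send-only loop along these lines (paraphrased from the linked page; details may differ by version):

    import asyncio
    import datetime
    import random

    import websockets

    async def time(websocket, path):
        # Send the current time at random intervals; a send-only handler.
        while True:
            now = datetime.datetime.utcnow().isoformat() + "Z"
            await websocket.send(now)
            await asyncio.sleep(random.random() * 3)

    start_server = websockets.serve(time, "127.0.0.1", 5678)

    asyncio.get_event_loop().run_until_complete(start_server)
    asyncio.get_event_loop().run_forever()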

If you start the server, open the web page, and then close the web page, I believe the ConnectionClosed exception is raised in the server. (Try it!) Is that what you mean by "bubble up the connection-closing exception condition"?

If you answered "yes" and "yes", then we have a trivial example that doesn't poll recv() anywhere, does bubble up the connection-closing exception condition, and doesn't use the kind of design pattern you suggested. So I'm confused.