python-trio / trio-websocket

WebSocket client and server implementation for Python Trio
MIT License
70 stars 26 forks source link

Clean up state machine #90

Open mehaase opened 5 years ago

mehaase commented 5 years ago

There are a lot of complex code paths for closing a websocket connection, mostly because the first implementation was overly simplistic and I kept hacking away on it to handle each new problem that arose. This could be cleaned up and made a lot more readable.

mehaase commented 5 years ago

Brain dumping some of the issues that make the close code complex:

There are multiple stages of closing. We can say that CLOSING is when we are asked to start the handshake and CLOSED is when the underlying stream is torn down. (Or maybe not—see below.)

Wsproto and WebSocketConnection have different meanings of "closing" and "closed". Wsproto is CLOSING when it writes a close frame to its internal buffer. The websocket state is CLOSING when it somebody calls aclose() or a ConnectionClosed event is received but the response is not yet sent. Wsproto is CLOSED when it has sent and received a close frame. WebSocketConnection is closed when the websocket handshake is complete and the underlying stream has been torn down. We probably should not rely on wsproto's state property at all. Instead, we should observe state from events (like ConnectionClosed) and infer state from sequences like wsproto.close(); self._write_pending().

There are two layers to the protocol: the application layer (WebSocket) and the stream layer (i.e. typically TCP). We generally want to close the application layer with a handshake before closing the stream, but sometimes that's not possible, i.e. if the underlying stream is torn down before the handshake occurs.

_I wonder if tearing down the stream inside WebSocketConnection is even the right thing to do? It might be surprising for somebody who calls wrap_client_stream(stream) and finds out later the stream was closed... maybe they wanted to reuse that stream for some other protocol? We could define WebSocketConnection closed state as when the handshake completes, then put the responsibility for stream teardown somewhere else, i.e. inside APIs like open_websocket(…). These APIs are already responsible for creating a stream, why shouldn't they be responsible for tearing it down?_

The local and remote endpoints each have a state. E.g. if we close with code 4001 then we send a frame with 4001. The server completes the handshake by sending back a frame, but it can send back whatever code it wants, i.e. 4002. Should the local close code property show the close code we sent or the close code we received... or both?

Closing behavior is different depending on which side initiates. E.g. if the remote endpoint initiates closing, then we keep the internal message channel open until the user has read all of the messages queued in it. If the local endpoint initiates closing, then we close the receive channel and the user cannot read any more messages.

We have to be careful not to block the reader task. One mistake I made several times is trying to call self.aclose() from an event handler. This never works because all of the event handlers run inside the reader task and aclose() waits for the connection closed event, so the reader task is deadlocked. There needs to be some method for internal close that is safe to call from the reader task.

The next step is to try to draw a diagram to identify all of the possible ways the connection can close. This should help me understand how to rewrite this.

mehaase commented 5 years ago

I'm going to broaden the scope of this to look at the entire state machine in the library. I'm sure there are bugs (or undefined behavior) in certain spots, e.g. what happens if you call accept() and reject() on the same request? You'll probably get an obscure wsproto error.