Consider timeouts - Githubissues

mehaase commented 6 years ago

Preface: I don't know if this issue will be resolved by writing code, writing docs, or both. I'm starting this thread just to document some of the issues around timeouts and potential solutions.

This library does not apply any timeouts on network operations. The Trionic approach to timeouts focuses on the caller's responsibility, and the timeout system is largely built around the principle of making timeouts composable and easy to reason about. This principle is violated by passing timeout= or deadline= arguments throughout the API.

However, pushing this responsibility onto the caller can be tricky in a library like this one. Take this simple example:

async with open_websocket_url('ws://my.example'):
  msg = await queue.get()
  await ws.send_message(msg)

This snippet hides three important details:

Waits for the open handshake to finish before entering the block.
Waits for the close handshake to finish before exiting the block.
Waits for socket close before exiting the block. (The RFC says the server must close the socket, but the client may close the socket after waiting a reasonable amount of time.)

Both of these operations could timeout, especially if the peer is malicious and leaves the handshake half-finished. But where can the caller place timeouts if they want to timeout the websocket handshakes but not waiting for an item from their internal queue?

@belm0 proposed the following (lightly edited):

with trio.move_on_after(CONNECTION_TIMEOUT) as cancel_scope:
    async with open_websocket_url('ws://my.example'):
        cancel_scope.deadline = inf
        msg = await queue.get()
        cancel_scope.deadline = trio.current_time() + REPLY_TIMEOUT
        await ws.send_message(msg)

This works and does not require changes to the library, but it isn't ergonomic. We want to make it easy to write safe code.

@njsmith raised the idea of a special decorator for context managers:

async with limit_entry_exit(open_websocket_url(...), entry_timeout=..., exit_timeout=...):
    ...

This has better ergonomics but is still another hurdle for a developer to get safe behavior.

Finally, I proposed adding timeout arguments to the library so that timeouts can be applied automatically to the opening handshake, closing handshake, and teardown. This is at odds with the Trio principles described in the blog post, but I think it might be worth breaking norms if it results in code that is safe by default. I really like Trio's composable timeouts, but connection set up and tear down don't seem like "composable" steps.

I'm not sold on any one of these approaches, just wanted to open an issue to keep track of it. More good discussion over here:

mehaase commented 6 years ago

After thinking about this for a while, I think it's okay for the high level APIs (the context managers) open_websocket() and open_websocket_url() to have embedded timeouts:

The CMs are supposed to be simple and safe APIs. If timeouts are not built-in, then they won't be safe. Users can use the lower level APIs for more complicated scenarios.
It's tricky for the user to compose the CMs with their own timeouts. See belm0's example in the first post.
Trio's approach to timeouts focuses on making them composable and easy to reason about, but I don't imagine many users will want to compose opening or closing a websocket with some other operation…
…except for this common case: open connection, send/receive some messages, then close connection. This case is easily handled by wrapping the entire async with open_websocket(): block in a timeout.

No other client operations need a built-in timeout, like pinging or getting a message, because the caller can compose those operations with their own timeouts.

The server also needs built-in timeouts, for the same reason that the high level APIs do: the server hides some I/O that is difficult for the caller to compose with timeouts, e.g. the opening handshake and automatically closing the connection when the handler returns.

So here's a tentative plan:

Add open_timeout and close_timeout arguments to both open_websocket() and open_websocket_url() with reasonable defaults. They can be disabled by passing math.inf. They can be enforced inside of open_websocket() without touching any of the WebSocketConnection code.
Add open_timeout and close_timeout arguments to serve_websocket() and WebSocketServer.__init__() with reasonable default that can be disabled by passing math.inf. These can both be enforced inside WebSocketServer._handle_connection() without touching any of the WebSocketConnection code.

The documentation will have examples of timeouts for both the high-level API and the low-level API.

mehaase commented 6 years ago

Closed via #89.

python-trio / trio-websocket

Consider timeouts #64