When building Channels on top of a bidirectional Stream, how should we handle closure?

I came across a slightly tricky implication of the structure of Trio's stream ABCs today.

Suppose I'm implementing a channel that sends objects over a byte stream. Since I'd like to be able to use it on a unidirectional transport, for maximum flexibility I'll define StreamSendChannel to wrap a SendStream, and StreamReceiveChannel to wrap a ReceiveStream. All good so far.

Now I want to use these stream channels to send objects in both directions on a bidirectional transport that I have a Stream for. The subtyping relationships say any Stream is-a SendStream and also is-a ReceiveStream, so it looks like I can do this by wrapping the same Stream in two channels, one for sending and one for receiving. This works fine as long as the channel stays open, but since the two channels are independent, they can close independently, and then they clash over the calls they want to make to the stream's aclose(). The problem is that Stream.aclose() doesn't start meaning only SendStream.aclose() if you interpret the Stream as a SendStream -- it closes the receive side too.

I think this might be less confusing if Stream did not inherit SendStream and ReceiveStream. It could still define the same methods, and we could provide a helper for turning a Stream into a SendStream plus a ReceiveStream, that encapsulates the logic of:

send side aclose() calls send_eof() if impemented
receive side aclose() calls shutdown_read() if implemented (https://github.com/python-trio/trio/issues/823#issuecomment-459104932)
underlying stream aclose() only called when both send and receive sides have been closed
even if the underlying stream doesn't support a certain type of half-close, still provide its local semantics (by raising ClosedResourceError out of current and future operations)

This helper doesn't necessarily need to be provided by trio, and would be useful even if we don't change the inheritance hierarchy. I think it's probably not worth including in trio unless we change the inheritance hierarchy, so this issue is mostly about that question.

Ah ha! You have discovered why v0.2.0 was delayed so long :-).

I struggled this question a lot I was designing the abstract interfaces, and I think Trio's approach is the best available, but it does have trade-offs. So, the reason we have the hierarchy AsyncResource → SendStream, ReceiveStream → Stream is exactly so we can properly model this closure issue you ran into. For most actual streams (in particular TCP and TLS, which are overwhelmingly the most important), sending and receiving are independent operations, but closure affects both halves. So that's exactly what our interface hierarchy says.

A Stream is a subclass of ReceiveStream and SendStream in the Liskov sense – you can use it as either. But you're right, you can't use it as both :-). But the hierarchy doesn't claim otherwise. Imagine if you had an ordered-dict class that implemented both "mutable sequence" and "mutable mapping" interfaces. You wouldn't expect that you could "split the dict in half" and mutate it using the sequence API in one function while mutating it using the mapping API in another function, without them interfering with each other. But with streams it feels weirder because they're almost two separate halves... but they just... aren't.

We already have a way to describe an object that really is an independent SendStream and ReceiveStream.... it's a StapledStream :-). (I guess I should link to this subthread about shutdown_read here too, because it's relevant – in particular I described why shutdown_read isn't part of Stream.)

(Interesting trivia: in the asyncio streams interface, if you have a bidi stream it's always represented as two objects, a StreamReader and StreamWriter. So how do they handle the shared closure state? StreamReader simply has no close method; StreamWriter.close closes both of them. It's... not how I would have done it.)

Anyway... that's all background, I think, and the real question here is what to do with channels-on-top-of-streams. It's a really interesting question! Most protocols you build on top of streams are either intrinsically unidirectional or intrinsically bidirectional... channels are an interesting example where you have the same set of unidirectional/bidirectional options at the public API as you do on the transport layer, so you have to figure out how the sort-of-but-not-entirely-separate semantics of stream relate to that.

Given the annoying fact that common transports don't have unidirectional close, it's not actually obvious to me that the best way to build a bidi Channel on top of a Stream (say, TCP), is to implement a SendChannel on top of a SendStream, and a ReceiveChannel on top of a ReceiveStream, and then staple them together. Of course makes sense to have unidirectional SendChannel that works over a SendStream or Stream, and to have a ReceiveChannel that works over a ReceiveStream or Stream, respectively, but trying to get both of them working over the same Stream at the same time seems messy.

I feel like what you might actually want to do is implement a single bidi protocol over a bidi Stream, and then expose it in one of two ways:

As independent SendChannel and ReceiveChannel objects that internally have some shared state they use to coordinate sharing the connection. They'd use protocol level messages to manage the unidirectional close state, and then when both are closed they close the underlying transport. Note that this means ReceiveChannel.aclose has to send a message on the underlying transport. And that has some tricky flow-control implications, because it means if the SendChannel side is clogged then the close message might get stuck, so you might need some SSH / HTTP/2 flow control... ick that's a bit complicated. Actually, you might be better off first using a multiplexed stream protocol like SSH or HTTP/2 to split a Stream into a real SendStream + ReceiveStream (!), and then layer unidirectional channels on top of the unidirectional streams.
As a single bidirectional Channel, that like Stream has both send and receive methods, but only one aclose method that closes both sides. AFAICT that's what most real channel-like protocols (e.g. websocket) actually do. Plus mayyyybe send_eoc method too? And I'm not sure how any of this would interact with cloning... hmm. I guess the intuition would be that a Channel is just used for different things than SendChannel/ReceiveChannel (i.e., interactive protocols versus producer/consumer), and you should just pick the one you want, and it will have the cloning semantics you want.

A piece of hard-won meta-advice: remember that the goal isn't to provide every mathematically plausible building block and allow fully arbitrary mixing-and-matching (as tempting as that is!); it's to provide a usable set of tools that people can actually understand and that are well-matched to the problems they encounter in practice. Does anyone actually need to multiplex a fully-independent SendChannel and ReceiveChannel over the same Stream? It seems like most people could either get by with not-quite-independent channels, or else use two streams.

python-trio / trio

When building Channels on top of a bidirectional Stream, how should we handle closure? #895