python-hyper / wsproto

Sans-IO WebSocket protocol implementation
https://wsproto.readthedocs.io/
MIT License
267 stars 39 forks source link

support python 2? #22

Closed njsmith closed 7 years ago

njsmith commented 7 years ago

I just heard from @agronholm that he would like to see python 2 support restored, so here's an issue to centralize discussion.

First, the history: py2 support was dropped in #11, which was an overly-large pull request mostly focused on rewriting the websocket frame parser in wsproto/frame_protocol.py. The basic challenge is that (a) we need to incrementally parse frames that might trickle in over time, and (b) the websocket wire framing is somewhat complicated, with multiple conditionals, and we don't want to buffer large messages in their entirely. So the underlying state machine has a fair amount of state and logic.

Originally, this was handled with an open-coded state machine that used object attributes to encode state in an ad hoc way. It mostly worked, but was pretty hard to reason about (for me), had unbounded memory usage when processing large messages (which is bad), and had a number of bugs (I don't remember the details, but there's a partial list in #11).

I rewrote it to internally use generators to implicitly manage the state machine as local variables on the stack, so now it's just straight-line parsing code with control logic expressed using if and while, like you'd see in a program using blocking I/O. And whenever it hits a point where it needs to receive more data before it can continue, it suspends the frame and waits for it to be delivered.

The downside is that in py2 you can't factor a generator into multiple subgenerators; this needs yield from :-(. So that's where the py3 requirement comes from.

I have no particular objection to supporting py2, I just don't want the code to become buggy or unmaintainable :-). So if someone wants to takes this on, then I'd suggest starting with what we have in mainline (rather than the code as it looked before #11), and then trying to figure out a strategy to maximize readability within py2's constraints.

This might mean an explicit state machine, maybe using a tool like automat? Or folding the current code into one giant function? (Maybe not too bad, if we move the consume_at_most and consume_exactly out into the runner?) Or something else?

Lukasa commented 7 years ago

I think an explicit state machine (or maybe several, depending on the architecture) is the way to go here: it served hyper-h2 very well (and I'll say right now that I think that HTTP/2 is a more complex protocol than WS). One note though: automat has a tendency to enforce what I failed to do in hyper-h2, which is to ensure that all state is included in the state machine, rather than having a state machine state plus some miscellaneous variables.

In general though, state machine-based implementations tend to be the easiest to reason about.

njsmith commented 7 years ago

@Lukasa: it turns out websocket has a much more complex low-level record layout than HTTP/2. For reference, the h2 equivalent of the code we're talking about here is these 10 lines. By comparison, here's the websocket framing diagram – notice the extra ifs, plus for websocket we can't use h2's trick of retrying from the start until we have a full frame, because sometimes we need to fragment frames as we parse them; and in the process the state we need to track includes stuff like incremental deflate and utf8 codecs (any given frame might have either, both, or neither). It's not hard but the bookkeeping is surprisingly complicated.

A state machine may well be the right solution; I haven't tried it :-). But for this kind of incremental parsing the yield from solution is really nice. I totally agree that explicit state machines are the way to go for the kind of protocol state you're thinking of, but this is different enough that I'm not sure whether or not that general advice applies.

Lukasa commented 7 years ago

@njsmith So the question boils down to whether or not we're dealing with strictly a parsing problem or not. For parsing, I continue to believe that the "complete parse or abort" approach has the simplest behaviour, and is the best starting point. This approach is used by picohttpparser in the C world, and is also the approach h2 uses with hyperframe. By a "complete parse" I mean that we have parsed a single logical data unit to termination: that is, the frame header and complete frame body are present.

I can't speak to fragmentation or decompression, but I think that those problems there can be solved by treating all of them as "higher level" concerns. Put another way: the data emission events do not need to correspond 1-to-1 with frames, but they can if needed. That allows wsproto to pick the easier of two options: either turn each frame into a data emission even that preserves the low-level signals about fragmentation, or internally buffer fragmented data along with whatever metadata is required to ensure that fragments are pieced together appropriately and without violating the specification. That bookkeeping can be handled in state machines if needed: for example, a "fragmented data" state machine can be used to process frames for fragmentation information and also to control buffering data fragments.

I totally appreciate that the yield from solution is nice, but if we can't have it then we need to fall back to a world of functions and state machines. In general the simplest first approach is to write a "one-shot" parser that parses to completion, and then use state machines for cross-frame state management. If that ends up being too slow due to the need to repeatedly backtrack, then it's worth looking at a stateful parser that can do incremental frame parsing. However, my experience has been that it's very rare to find a data flow so pessimal as to hurt the one-shot parser.

njsmith commented 7 years ago

Addressed by #23