Implications so far:
Implication no. 2, re: transformational vs non-transformational. If that optimization is to be taken into account, then attempts to define some general meta-interface to implement common proto handlers for both sync and async probably won't work. Instead, each one may need to be written in a more or less ad hoc way, which will however allow optimizing it for the traits of a particular protocol.
Thoughts re: handling handshake-like miniprotos. It's unclear what the best way to deal with these is.
One good way would be to have detailed status codes like NEED_READ and NEED_WRITE. I.e. if the user calls write(), they may get NEED_READ instead, meaning the write goes to a backlog, and the user may need to queue a read operation on the underlying stream (the data from which will likely be consumed internally by the proto). Likewise, if read() is called, NEED_WRITE may be returned instead, notifying that the read should be backlogged, and that the proto handler has instead prepared a write record to spool to the underlying stream.
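A minimal sketch of how such codes could look in a poll-based client (the NEED_READ/NEED_WRITE constants and the try_write() helper are hypothetical names for illustration, not an existing API):

```python
import select

NEED_READ = -1   # hypothetical status: proto must read before op can proceed
NEED_WRITE = -2  # hypothetical status: proto must write before op can proceed

def try_write(wrapper_stream, underlying_stream, poller, data):
    # Optimistically attempt the write; on a NEED_* code, backlog it and
    # register the underlying stream for the direction the proto needs.
    res = wrapper_stream.write(data)
    if res == NEED_READ:
        # The proto needs input first (e.g. a handshake reply), which it
        # will consume internally before the backlogged write can proceed.
        poller.register(underlying_stream, select.POLLIN)
    elif res == NEED_WRITE:
        # read() is symmetric: it may return NEED_WRITE when the proto has
        # prepared a record that must be spooled out first.
        poller.register(underlying_stream, select.POLLOUT)
    return res
```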
But in the current Python stream API, there are no separate NOT_READY vs NEED_READ vs NEED_WRITE return codes. There's only one - None, which originally means NOT_READY, and which would need to be reused for the 2 other codes. Then one would need additional means to query the more detailed state. For read, that could be:
```python
res = wrapper_stream.read()
if res is None:
    # See if the wrapper actually asks us to send out something on its behalf
    if wrapper_stream.write_pending():
        ...
```
Obviously, this approach is less efficient than directly returning the needed code from read(), and it's not clear how to apply it to .write() and NEED_READ, except in a fully symmetrical way, by adding a .read_pending() or similar predicate.
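For completeness, the symmetric write-path sketch (.read_pending() being the same kind of hypothetical predicate as .write_pending() above):

```python
res = wrapper_stream.write(data)
if res is None:
    # None is overloaded; check whether the wrapper is actually blocked
    # waiting for input (NEED_READ) rather than merely not ready.
    if wrapper_stream.read_pending():
        # Poll the underlying stream for input; the proto will consume it
        # internally, after which the backlogged write can be retried.
        ...
```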
Otherwise, an obvious way to deal with the async write path would be to construct the complete protocol record on the wrapper side and then spool it to the underlying stream.
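Something like this sketch (the Wrapper class and the make_record() encoder are hypothetical names, assuming the usual None-on-EAGAIN stream convention):

```python
class Wrapper:
    def __init__(self, underlying):
        self.underlying = underlying
        self.pending = b""  # fully constructed record awaiting spooling

    def write(self, data):
        if not self.pending:
            # Encode the entire protocol record upfront (copies user data).
            self.pending = self.make_record(data)
        n = self.underlying.write(self.pending)  # None means EAGAIN
        if n:
            self.pending = self.pending[n:]
        # None = record not fully spooled yet; caller should poll and retry.
        return len(data) if not self.pending else None
```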
Given the comment in https://github.com/pfalcon/micropython/issues/20#issuecomment-445541359 , this can't be made a total pattern: for non-transform miniprotos, a protocol record is to be constructed/sent piecewise from verbatim user data.
Also, it's worth stating explicitly that all operations on async wrapper streams should be optimistic, i.e. a client should always first try to perform a read/write operation, and only if EAGAIN is received should it poll the underlying stream. That's because the wrapper may have its own buffer, which may have data (or space for new data) regardless of the readiness state of the underlying stream.
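A sketch of that optimistic pattern (names are illustrative; None is assumed as the usual would-block return):

```python
import select

def optimistic_read(wrapper_stream, underlying_stream, n):
    # Always try the operation first: the wrapper's internal buffer may
    # satisfy it even if the underlying stream has no new data. Only on
    # would-block do we poll the *underlying* stream and retry.
    poller = select.poll()
    poller.register(underlying_stream, select.POLLIN)
    while True:
        res = wrapper_stream.read(n)
        if res is not None:
            return res
        poller.poll()
```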
This is a continuation of https://github.com/micropython/micropython/issues/3396 .
Definition: a miniprotocol is a simplified, "single-behavior" protocol. A real-world protocol is usually a combination/composition of miniprotocols.
Classification of miniprotocols:

- By framing: separator-delimited (e.g. line-based; "SepProto" below) vs length-delimited ("LenProto").
- By payload handling: transformational vs non-transformational (user data passed verbatim).
Example: the full WebSocket protocol starts with the separator-delimited (line-based) HTTP protocol, then switches to a length-delimited protocol.
The idea is that length-delimited protocols are more efficient. But it's actually better to say that they compose better: we always know how much is left to read for the current miniprotocol, and thus can avoid over-read. The only way to have the same guarantee for a separator-delimited miniproto is to read byte by byte, which is known to be inefficient. Thus, the only performance-efficient way to deal with SepProto is to buffer the underlying stream (on our side). But if this buffering happens on the miniprotocol parser's side, then once we need to switch to another protocol, we may have over-read data, and it becomes a new problem how to pass that data forward to the next miniproto parser.

Also, while LenProto is not susceptible to the over-read problem, that doesn't mean it doesn't need buffering. For example, there can be many small records, and e.g. reading "5 bytes by 5 bytes" isn't much better than SepProto's "1 byte by 1 byte". These 2 issues: a) the over-read issue if buffering happens on the miniproto parser's side; b) the need for buffering for performance in general, lead to the obvious and well-known solution: buffering should happen on the input stream's side. Even that is still not ideal if we have an efficient, large-block LenProto. Consider for example a proto which starts with a line-based miniproto, to handle which we allocate a 256-byte buffer, and then switches to a LenProto with 1K blocks. Those 256 bytes are then wasted beyond the initial line-based "handshake" phase.
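To make the over-read asymmetry concrete, here is a sketch of both read strategies without stream-side buffering (assuming a blocking stream with the record fully available):

```python
def read_len_record(stream, length):
    # LenProto: we always know how much is left, so arbitrarily large
    # reads can never consume bytes belonging to the next miniproto.
    buf = bytearray(length)
    view = memoryview(buf)
    pos = 0
    while pos < length:
        pos += stream.readinto(view[pos:])
    return bytes(buf)

def read_sep_record(stream, sep=b"\n"):
    # SepProto: without buffering, the only way to guarantee no over-read
    # is to read byte by byte, which is known to be inefficient.
    buf = bytearray()
    while not buf.endswith(sep):
        c = stream.read(1)
        if not c:
            break  # EOF
        buf += c
    return bytes(buf)
```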
Example: TLS clearly consists of an initial handshake phase which requires bidirectional exchange, but once that is done, a single user write leads to a single TLS record sent to the peer, and a single user read is backed by a single record received from the peer. One way to deal with this scenario is to perform the complete handshake in response to the user's "open connection" request, so that we have one-directional exchanges afterwards. It's more complex if we postpone the handshake until the first actual read/write operation from the user. And actually, it's more complex on the TLS side too: at any time, renegotiation may be requested, which leads to the same issue that a single user operation may need to be postponed while a series of exchanges is performed.
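For reference, CPython's ssl module exposes exactly this situation, with exceptions instead of return codes; the standard non-blocking handshake loop looks like:

```python
import select
import socket
import ssl

sock = socket.create_connection(("example.com", 443))
sock.setblocking(False)
ctx = ssl.create_default_context()
ssock = ctx.wrap_socket(sock, server_hostname="example.com",
                        do_handshake_on_connect=False)
while True:
    try:
        ssock.do_handshake()
        break
    except ssl.SSLWantReadError:
        select.select([ssock], [], [])   # the NEED_READ case
    except ssl.SSLWantWriteError:
        select.select([], [ssock], [])   # the NEED_WRITE case
```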
The push approach has another face of the over-read problem: a parser might not accept all the data we feed into it, yet already produce some output. This leads to an "all hands are full" situation: we need to keep track of/store the input data which the parser has not yet accepted, and also need to deal with the parser's output, which we may not be able to accept in one go on our side either. The pull approach simplifies that somewhat.

In reality, the suitability of push vs pull depends on the nature and direction of the data passing. For application data -> protocol data, the push approach is obvious: the app already has that data (for classically written apps), so we can only push it, hold the remaining part which the miniproto didn't consume, and iterate on sending out the miniproto output until it's done and we can continue with feeding the rest of the app data. For protocol data -> application data, the pull approach is beneficial: the miniproto parser knows either exactly how much it needs, or otherwise its internal buffer requirements, and can request data so as to avoid over-read.
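A generator makes a nice pull-mode parser sketch: the parser yields exactly how many bytes it needs next, so the driver can never over-read. The frame layout here is a simplified, WebSocket-like 2-byte header plus payload, for illustration only:

```python
import io

def frame_parser():
    # Pull approach: yield the byte count we need, receive exactly that.
    while True:
        header = yield 2              # request exactly the 2-byte header
        length = header[1] & 0x7F     # simplified: no extended lengths
        payload = yield length        # then request exactly the payload
        print("frame:", bytes(payload))

stream = io.BytesIO(b"\x81\x05hello")
parser = frame_parser()
need = parser.send(None)              # prime; parser asks for first chunk
while True:
    data = stream.read(need)
    if len(data) < need:
        break                         # EOF
    need = parser.send(data)
```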
For example, WebSocket "masks" the data in one direction, but in the other direction it's verbatim. The distinction here is of an optimization nature: for the verbatim direction we may use a "zero copy" operation, i.e. write user data directly to the underlying stream, instead of copying it into a miniproto buffer for additional processing.
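A sketch of that asymmetry for small (under 126 bytes) WebSocket-like frames:

```python
def ws_send(underlying, payload, mask_key=None):
    # FIN + text opcode, then length byte with the MASK bit if masking.
    underlying.write(bytes([0x81, (0x80 if mask_key else 0) | len(payload)]))
    if mask_key is None:
        # Non-transform direction: write user data verbatim, zero copy.
        underlying.write(payload)
    else:
        # Transform direction: data must be copied and XOR-masked first.
        underlying.write(mask_key)
        underlying.write(bytes(b ^ mask_key[i & 3]
                               for i, b in enumerate(payload)))
```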