reactive-streams / reactive-streams-jvm

Reactive Streams Specification for the JVM
http://www.reactive-streams.org/
MIT No Attribution

Polyglot Support #45

Closed. benjchristensen closed this issue 9 years ago.

benjchristensen commented 10 years ago

I suggest expanding this initiative beyond the JVM since most of us need our data streams and systems to interact over network boundaries with other languages.

Thus, it seems it's actually more important to define the protocol and contract and then allow each language platform to define the interfaces that meet it.

Perhaps an approach to this is breaking out into multiple sub-projects such as:

Even if the focus in the short-term remains on the JVM interface design, we would gain a lot by including communities such as Javascript/Node.js, Erlang, .Net, and banking and financial trading (who have been doing high-performance messaging for decades). It would also make the model far more useful, as we could then consume a reactive stream from Javascript in a browser, via WebSockets to a server powered by Netty or Node.js receiving data from Rx/Akka/Reactor/whatever, and it would "just work".

jbrisbin commented 10 years ago

I'm currently working on integrating Java 8's Nashorn Javascript engine with Reactor and this idea of extending Reactive Streams into other languages is very much on my mind.

There's the possibility of providing Reactive Streams--the API--in Javascript for use in server-side and client-side code to get the same characteristics as defined in the spec, and there's the cross-network-boundary possibility of communicating from one Reactive Stream to another over a network. As long as the interaction is clearly defined, I don't think the transport and protocol (or combination of the two) really matters.

e.g. one could conceivably use plain HTTP + Javascript on the client side (sending JSON) and the Reactive Streams application on the server side would simply invoke onComplete to communicate back to the JS in the browser that the request is finished. This would work slightly differently in a WebSocket where the responses might come back individually rather than all at once as in the body of an HTTP response.

benjchristensen commented 10 years ago

After a great meeting today at Netflix with @tmontgomery I am more convinced that we should expand this out into several projects.

I propose we focus this main reactive-streams project on the contract, semantics and possibly network protocol definitions, and move the JVM implementation into reactive-streams-jvm along with other projects for the various other platforms.

To start with I think we need at least these:

/reactive-streams/reactive-streams - contract and semantics, network protocol?
/reactive-streams/reactive-streams-websockets - websockets network protocol
/reactive-streams/reactive-streams-jvm - JVM interfaces
/reactive-streams/reactive-streams-javascript - Javascript interfaces

We could also (if we're ambitious) include reference implementations of the network protocol, at least in Java and Javascript. Or we leave that to external projects to implement according to the network protocol.

Todd and I are ready to start working on defining the websockets protocol compliant with reactive-streams and Netflix is ready to start implementing on top of it. Do we have agreement on making reactive-streams polyglot and include the network protocol? If so, can I proceed to create the new repos and migrate the JVM interfaces into reactive-streams-jvm?

jbrisbin commented 10 years ago

Enthusiastic +1 from me. What can I do to help?

smaldini commented 10 years ago

+1


tmontgomery commented 10 years ago

We probably only need 1 protocol spec that can run over any reliable unicast protocol (TCP or WebSocket or something else). I really like the direction this work is heading!

benjchristensen commented 10 years ago

@rkuhn Do you agree with and support splitting into multiple sub-projects under the "github.com/reactive-streams" umbrella?

kirkshoop commented 10 years ago

+1 ! I was going to phrase it differently, but the result is the same. There are many ways to communicate the signals across an async boundary, and the tradeoffs for a given surface change with each language and protocol. (A return value from an on_next function might be okay, but an ACK message for each on_next in a protocol is not so good.)

tmontgomery commented 10 years ago

Exactly.... communicating across an async [binary] boundary.

To that end, the types of transports that the protocol would need to support are, at least: TCP, WebSocket, and IPC (most likely in the form of a shared-memory SPSC queue). Adaptation to a multicast medium (or an SPMC or MPMC queue) should probably be considered, but might need to be treated differently.

With request(n) flow control semantics, it shouldn't be necessary to ACK every message in the general case. For persistence semantics, though, request(n) could piggyback/infer message consumption.

rkuhn commented 10 years ago

Yes, enthusiastic +1 from me as well! I also agree with the proposed split into multiple sub-projects. The only part I am not sure I understand correctly is the “network protocols” part: there are different network protocols available today, some of which already have all desired semantics (plus some extraneous ones, like TCP) and some of which need dedicated protocol descriptions for supporting Reactive Streams (like UDP, raw Ethernet frames, etc.), and in addition people may choose to implement a stream transport on a completely new medium as well. Therefore in my opinion we can leave out the network protocol description from the reactive-streams project and add concrete ones like reactive-streams-udp as we get around to specifying them. The most important part then is to agree on the semantics—call it an abstract protocol if you will—and fix that first.

Apropos: I take it that we agree on #46, especially given this angle on the scope of the project? If so, please comment on that issue as well.

tmontgomery commented 10 years ago

TCP, and by extension WebSocket, lacks some semantics that are needed. And, as @rkuhn points out, it has some additional unneeded and undesired semantics.

Specifically, TCP and WS have no concept of channels. So, while a single stream can be supported, multiple concurrent streams (with or without head-of-line blocking) can't be supported without an additional framing layer. However, protocols like SPDY and HTTP2 have a nice framing mechanism that could be leveraged directly.

Additionally, request(n) semantics require a control protocol to be implemented on top. The most notable issue that TCP introduces here is an impedance mismatch between initial reception via push and flow control, i.e. a TCP sender may slam in an initial segment (1500 bytes of data or more, depending) without the receiver requesting it. Relying solely on TCP's flow control would introduce some potentially bad artifacts into how the back pressure works, IMO.

Such a framing and control layer would also be needed by any other transport (UDP, etc.) as well.

rkuhn commented 10 years ago

@tmontgomery I see, we were talking about slightly different things: what I meant was dedicating one TCP connection to the transfer of one stream, in which case no extra capabilities are needed beyond serialization and deserialization of the data items, using the event-based socket operations to convey data and demand. The difference from a direct Publisher/Subscriber pair is that the TCP connection would act like a buffer:

--> Subscriber --> Buffer (TCP connection) --> Publisher -->

where the buffer size is given by output + input buffers plus network components and in-flight bits. As far as I can see (please correct me if I’m wrong) this still has the behavior of dynamic push & pull, i.e. it switches between the two according to which side is currently faster (with the man in the middle capping how fast the receiver can appear to be). Are there other artifacts or bad effects that I am missing?

If we want to transfer multiple streams over a shared medium (like one WebSocket or one TCP connection or one Unix pipe) then we will of course need some multiplexing and scheduling capability, including dedicated return channels for the demand signaling. I am not sure whether that should be our first goal, though, since this can be implemented on a primitive (single) stream by merging and splitting streams in the “application layer”.

If OTOH the transport mechanism already comes with channel support, then we should of course try to use that. One potential issue I see with just pushing data and demand across an HTTP2 WebSocket is that multiple streams will have to be balanced in a fair fashion somehow, which implies that we must specify a scheduler or the protocol stack needs to already come with one—is that already the case? (please excuse my ignorance, I have yet to read up on this topic)

tmontgomery commented 10 years ago

@rkuhn Even with a single stream over a TCP connection, I believe you will need to consider having a control protocol for the semantics as I have read them so far. You are correct in that TCP has flow control (all flow control must be push and pull), but those semantics in TCP have subtleties; that is perhaps the bit that is being missed. In TCP, controlling flow only by the rate at which data is received is a very coarse-grained tool, and it has some quirks... one is the interaction of Nagle and Delayed ACKs, for example.

You are absolutely correct about multiple streams and scheduling. It becomes a scheduling problem immediately, but it is actually worse than that. Here are a couple of links that you might find interesting on the subject.

http://sites.inka.de/~W1011/devel/tcp-tcp.html
http://www.ietf.org/mail-archive/web/tls/current/msg03363.html

Some of the complexity has crept into HTTP2, which is unfortunate, as the protocol has no option other than to expose the controls to the application. And most applications won't have any clue how to handle that complexity.

However, I see that as a tremendous opportunity for reactive streams to bring value. It's a simpler mechanism bound to the application. It's in a perfect position to make this easier, much easier. And accessible for applications.

And multiple streams per connection should be considered the norm, IMO. A single stream per TCP connection is going to be very limiting, and a proliferation of TCP connections is not a good thing. In fact, one of the main techniques for preserving device battery life, for example, is keeping the number of TCP connections very low so the device stays out of the high-energy state as much as possible. HTTP2 and SPDY multiplex requests and responses and use multiple streams for many reasons, one of which is to reduce TCP connection counts for browsers and servers.

With that in mind, standardizing how to mux multiple streams onto any transport is a good thing, I think.

rkuhn commented 10 years ago

Thanks a lot for this explanation, it makes complete sense. Great to have you on board!

benjchristensen commented 10 years ago

multiple streams per connection should be considered the norm

Agreed, otherwise this doesn't work for us in what we're pursuing with both TCP and WebSockets.

benjchristensen commented 10 years ago

It seems we have consensus to migrate this initiative to being polyglot, so shall we move forward with making sub-projects and moving things around?

Shall we start with these?

1) /reactive-streams/reactive-streams

2) /reactive-streams/reactive-streams-websockets

3) /reactive-streams/reactive-streams-jvm

4) Modify www.reactive-streams.org to not be JVM only

@tmontgomery Do we need separate projects for TCP and websockets, or can they be together? If together, under what name?

rkuhn commented 10 years ago

I’d say we can start moving things around once we have fixed the discrepancies between code and documentation (i.e. #41 and possibly follow-up fixes are merged).

benjchristensen commented 10 years ago

I'm okay with that ... I'll keep tracking #41 and #46 closely this week so we can try and finish those discussions to unblock everything else.

rkuhn commented 10 years ago

Sounds good!

benjchristensen commented 10 years ago

@rkuhn Shall we proceed now that we've agreed in #46 and merged the contract definition?

Do you want to release 0.4 first from the current project structure and then proceed with the steps I outlined above in https://github.com/reactive-streams/reactive-streams/issues/45#issuecomment-42574722 ?

rkuhn commented 10 years ago

Splitting things up should not require any changes in interfaces or semantics; we are just moving things around (including stripping the generic README of JVM-specific provisions, which move into the JVM subproject), so I do not see any obstacles to doing it in parallel. The released artifacts are, as far as I can see, also not affected in any way by the split.

benjchristensen commented 10 years ago

Okay, so shall I proceed with submitting a PR to propose the changes?

Do you agree with the layout I suggested?

drewhk commented 10 years ago

I don't like the idea of specifying a protocol here, for many reasons. One of them is that it feels completely out of scope.

Even with a single stream over a TCP connection, I believe you will need to consider having a control protocol for the semantics as I have read so far.

The reactive streams semantics will allow a recipient to send a huge request if the potential set of answers fits in its memory, but that does not translate to the kernel buffer size, which will eventually have to deal with the incoming binary data from the wire. The semantics do not map one-to-one, so I think this is misguided.

You will inevitably need a local bridge Subscriber that communicates properly with whatever underlying kernel driver it must talk to and gives proper request counts.

tmontgomery commented 10 years ago

@drewhk Actually, what you mention makes the case for there being a control protocol. Relying on TCP semantics here alone is not enough because of the possibility of overrunning the subscriber side if the obvious solution were to be used.

In the obvious solution, there must be double buffering outside the kernel's SO_RCVBUF and SO_SNDBUF. In that model, without a control protocol, the buffers become effectively unbounded for large objects. There is no visibility to bound them on the publisher's side without appropriate reactive-streams-level feedback. That feedback may be based on objects instead of bytes, but object size can be bounded (and should be, and is, in any real system, e.g. via fragmentation).

Whether a bridge subscriber is used as a solution is open for debate. It's hardly the inevitable solution, IMO.

In most cases, having event units in the multi-GB range means a range of tradeoffs to be made. Adding another point that buffers multiple fragments into a single unit (like a bridge subscriber would) is a less than ideal solution. I would handle that by only reassembling at the end subscriber site, where it has to be... unless the system decides that smaller units are the way to go anyway.

drewhk commented 10 years ago

Actually, what you mention makes the case for there being a control protocol. Relying on TCP semantics here alone is not enough because of the possibility of overrunning the subscriber side if the obvious solution were to be used.

I am not sure what you mean here. Just because a remote Subscriber requests 1,000,000 elements, that does not mean that the underlying transport should also request 1,000,000 elements. It might even -- ad absurdum -- request elements one by one and have a buffer size of exactly 1 element, and eventually still serve the 1,000,000 requested. There is no unboundedness here.

Another, similar example: just because you have a chain of purely asynchronous map stages:

map.map.map.map

And if the downstream subscriber of the last map requests 1,000,000 elements, that does not mean that the map stages will also issue requests for 1,000,000 between each other. That would force each stage to have a buffer that can hold 1,000,000 elements in the worst case. Instead, they can have buffer sizes of 1, 128, or even different buffer sizes in between, issuing requests in even smaller batches (say, bufSize / 2 for example).
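A minimal sketch of that batching idea, using the org.reactivestreams interfaces (the class itself and all its names are hypothetical): downstream demand may be huge, but upstream only ever sees requests of bufSize / 2.

    import org.reactivestreams.Subscriber;
    import org.reactivestreams.Subscription;

    // Hypothetical intermediate stage: downstream may request 1,000,000
    // elements, but upstream only ever sees requests of bufSize / 2.
    abstract class BatchingStage<T> implements Subscriber<T> {
        private final int bufSize = 128;       // fixed buffer bound
        private Subscription upstream;
        private int outstanding;               // requested upstream but not yet received

        @Override public void onSubscribe(Subscription s) {
            upstream = s;
            outstanding = bufSize;
            s.request(bufSize);                // prefetch one full buffer
        }

        @Override public void onNext(T element) {
            process(element);                  // hand off / buffer for downstream
            if (--outstanding <= bufSize / 2) {
                upstream.request(bufSize / 2); // top up in half-buffer batches
                outstanding += bufSize / 2;
            }
        }

        @Override public void onError(Throwable t) { /* propagate downstream */ }
        @Override public void onComplete() { /* propagate downstream */ }

        protected abstract void process(T element);
    }

The point of the half-buffer batch is that upstream demand stays bounded by the local buffer regardless of how greedy the final subscriber is.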

tmontgomery commented 10 years ago

Further on in my comment, I actually mention large objects (i.e. large elements), which is what I thought you meant. Overrun in that case is caused by a single large object and a request of 1. That has overrun possibilities unless handled well, i.e. if you request 1GB in a single element. Which users will do.

The underlying implementation can optimize down to a single element for a pipeline. In fact, it should. So the 1M elements you mention can be handled fine in a number of ways. However, below a single element, the only possibility is fragmentation, which needs reassembly, and which works best with a framing protocol and a definition of an MTU. Without that, you are left with very few options for handling it efficiently.

danarmak commented 10 years ago

The discussion in #47 may be relevant here.

The Reactive Streams interface allows requesting n 'elements', but if the element type is a byte array and not a single byte, there is no way to request a number of bytes. It's impossible to "request 1GB in a single element"; if you request a single element of type byte array, the publisher can give a single byte array of any size it wishes and still be within the spec.

An implementation can introduce controls for the byte array size, but if the on-the-wire protocol is some standardized form of Reactive Streams, it won't be able to communicate this to the other side. In #47 I said this made 'byte streams' between different implementations over the network impractical. The response was that the implementation talking to the network on each end should document its buffer size (the size of the byte arrays it produces), and specific publishers should also document the chunk sizes they produce, and then consumers can rely on those two things. We'll see in practice if this is good enough.

What is the element type of your Reactive Stream? If it's byte[], then there is no way to signal the size of the byte[] you want. If it's some object whose size can vary greatly and which can be split and merged (e.g. HTTP message chunks), the same problem exists. The type can't be byte, because then you would have to call onNext for each individual byte.

If, on the other hand, the element type can't be split and merged, then you have no choice but to request some whole number of elements. If a single element can be 1GB in size, but you can only process whole elements, then you have no choice but to buffer it. If you don't want to buffer it, write a streaming processor and change your element type to a smaller fixed-size frame.
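A sketch of that last suggestion, with invented names: a re-chunking stage that splits and merges arbitrary incoming byte[] elements into fixed-size frames, so that a request(n) downstream of it corresponds to a predictable number of bytes.

    import java.io.ByteArrayOutputStream;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    // Hypothetical re-chunker: splits/merges arbitrary byte[] input into
    // frames of exactly frameSize bytes (the final frame may be shorter).
    final class Rechunker {
        private final int frameSize;
        private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

        Rechunker(int frameSize) { this.frameSize = frameSize; }

        // Feed one incoming element; returns zero or more complete frames.
        List<byte[]> onNext(byte[] chunk) {
            pending.write(chunk, 0, chunk.length);
            byte[] all = pending.toByteArray();
            List<byte[]> frames = new ArrayList<>();
            int offset = 0;
            while (all.length - offset >= frameSize) {
                frames.add(Arrays.copyOfRange(all, offset, offset + frameSize));
                offset += frameSize;
            }
            pending.reset();
            pending.write(all, offset, all.length - offset); // keep the remainder
            return frames;
        }

        // On upstream completion, flush whatever is left as a short final frame.
        byte[] onComplete() { return pending.toByteArray(); }
    }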

jbrisbin commented 10 years ago

I would imagine that an implementation wanting to stream raw byte[]s would provide a bounded Buffer facility that lets you tell the Publisher, at configuration time, how big a chunk to use. Then request(10) doesn't mean 10 byte[Integer.MAX_VALUE] arrays but 10 bounded Buffers. This would be part of the initial configuration of the Publisher, I would expect. One could imagine it being changeable at runtime as well, so that the buffer size could be adjusted ad hoc, but that would be determined by the implementation.

I don't think this particular issue can be spec'd away. IMO this is going to have to be a best practice, given the intent of the spec to always be bounded. Since the spec intends the handling of N elements to be finite, it seems logical to assume that those objects also need clear bounds to fit the "spirit" of the spec.

It should be possible to spec out how to handshake a Publisher/Subscriber pair over an IO boundary such that both parties agree on the bounds. If a Subscriber can only successfully subscribe to a Publisher after agreeing on a buffer size, then it's not necessary to stipulate how that agreement is arrived at--or its specifics--but that there was a mechanism in place that allowed the two parties to agree to acceptable terms before beginning the "real" process of exchanging data.

This is how I see a "control protocol" being useful to the spec. It's at a much higher level than maybe what we're used to thinking of when using that phrase. Maybe it's closer to a "handshake protocol" or somesuch.
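Purely as an illustration of that handshake idea (every name and field below is invented; nothing here is spec'd anywhere):

    // Hypothetical handshake messages exchanged before any data flows.
    final class SubscribeRequest {
        final String streamName;    // which stream the Subscriber wants
        final int maxChunkBytes;    // largest element the Subscriber will accept
        SubscribeRequest(String streamName, int maxChunkBytes) {
            this.streamName = streamName;
            this.maxChunkBytes = maxChunkBytes;
        }
    }

    final class SubscribeAck {
        final int agreedChunkBytes; // e.g. min(publisher max, subscriber max)
        SubscribeAck(int agreedChunkBytes) {
            this.agreedChunkBytes = agreedChunkBytes;
        }
    }

    // The Publisher completes the subscription only if it can honor the
    // bound; otherwise it rejects, and no data is ever exchanged.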

danarmak commented 10 years ago

I also think that a handshake protocol would be useful. It should be opt-in, to keep the base spec simple, since many users deal with objects of a predictable size, and don't need any negotiation. Each side could also refuse to communicate if the other side didn't negotiate. There should be a reasonable way to behave in its absence when transporting e.g. byte arrays.

When should it be used? It's easiest to leave it to the application to decide, but what do we expect? Only across networks? Across inter-process boundaries? Whenever a Reactive Stream has an element type that is an array, string or similar? Would it usually be end-to-end, or point-to-point?

jbrisbin commented 10 years ago

IMO any time the data will cross a substantially expensive "boundary" this should apply. That might mean crossing from userland onto the network, from business logic into serialization, from one JVM to another process, from client to server, etc. The boundaries don't have to involve IO but most likely will 90% of the time.

tmontgomery commented 10 years ago

Negotiation would be a pretty useful option, and it could be done in the subscribe. Negotiation à la TLS/SSL, no. However, negotiation à la the WebSocket protocol/extension mechanism or HTTP/2, yes. Simpler forms make more sense: simply a union of capabilities on the exchange.

Any point where (de)serialization is required (due to crossing a binary boundary - network, disk, process) will benefit from having the following:

  1. Means to frame elements that includes fragmentation/reassembly (but NOT additional element serialization)
  2. Means to provide stream multiplexing
  3. Means to control flow in accordance with agreed to semantics beyond transport flow control
  4. Means to negotiate Publisher/Subscriber behaviors/options/configuration

Are there additional thoughts? (maybe this should be a separate issue to discuss?)
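For illustration only, here is what a frame covering those four needs might look like, loosely modeled on HTTP/2-style framing; every field and constant below is an assumption, not a proposal anyone has written down:

    import java.nio.ByteBuffer;

    // Hypothetical wire frame: supports framing/fragmentation (1),
    // multiplexing via streamId (2), and control frames such as
    // REQUEST_N for flow control (3) and SETUP for negotiation (4).
    final class Frame {
        static final byte TYPE_DATA      = 0x0;
        static final byte TYPE_REQUEST_N = 0x1;  // carries request(n) credit
        static final byte TYPE_SETUP     = 0x2;  // capability negotiation
        static final byte FLAG_FRAGMENT  = 0x1;  // more fragments follow

        static ByteBuffer encode(byte type, byte flags, int streamId, byte[] payload) {
            return (ByteBuffer) ByteBuffer.allocate(10 + payload.length)
                    .putInt(payload.length)   // 4 bytes: payload length
                    .put(type)                // 1 byte: frame type
                    .put(flags)               // 1 byte: flags (e.g. fragmentation)
                    .putInt(streamId)         // 4 bytes: multiplexing
                    .put(payload)
                    .flip();
        }
    }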

drewhk commented 10 years ago

I believe sending 1GB elements is an abuse of streams since the main purpose is to be able to represent large datasets as a partially materialized stream instead of a huge blob -- elements should be bounded. Fragmentation and reassembly is just a simple stream processing element, and the mentioned negotiation can also be expressed in terms of streams. I understand that a standardized way is useful, but I feel this is just too early.

Means to frame elements that includes fragmentation/reassembly (but NOT additional element serialization)
Means to provide stream multiplexing
Means to control flow in accordance with agreed-to semantics beyond transport flow control
Means to negotiate Publisher/Subscriber behaviors/options/configuration

While the above are good ideas, I don't think it is a good idea to rush this. Introducing an n+1th protocol is an effort that should not be underestimated (and the technical effort might be the smallest one here). For example, multiplexing requires a platform-independent way to identify stream endpoints (a URI scheme? a registry as part of the SPI? etc.), which raises a lot of questions already, without even touching the question of a custom transport/flow-control protocol.

tmontgomery commented 10 years ago

And no spec has been proposed...

So, I don't think there is a rush, really. I've certainly got enough going on that I have to be careful how much time I devote to the effort anyway. I'm passionate about the entire effort, though, and would hate to see it suffer from the crippling mistakes of JMS.

Good news is that we're not reinventing the wheel. There are several good protocols/designs to build from. Nothing I have seen here is that new or difficult.

Multiplexing and how to identify endpoints is well understood and used by a number of protocols dating all the way back to SOCKS and beyond. I see no reason why the HTTP/2 and SPDY model wouldn't work for this. It's almost directly analogous, and it doesn't mean the lookup has to be tied to anything specific out of the gate; it can evolve separately if needed. In short, the "name" of a stream can be a blob; that blob generates (via allocation by one side) an ephemeral ID for efficient exchange that has a given lifetime for the connection.
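A sketch of that name-to-ID allocation, with hypothetical names (the odd/even ID split is borrowed from how HTTP/2 keeps the two sides' allocations from colliding):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicInteger;

    // Hypothetical per-connection registry: one side allocates ephemeral
    // stream IDs for opaque name blobs; IDs live only for the connection.
    final class StreamRegistry {
        private final AtomicInteger nextId = new AtomicInteger(1);
        private final Map<Integer, byte[]> idToName = new ConcurrentHashMap<>();

        int open(byte[] nameBlob) {
            int id = nextId.getAndAdd(2);   // odd IDs for this side, as in HTTP/2
            idToName.put(id, nameBlob);     // the name travels once, in the open frame
            return id;
        }

        byte[] nameOf(int id) { return idToName.get(id); }
        void close(int id) { idToName.remove(id); }
    }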

Flow control is both understood and not understood at the same time. HTTP/2 is handling a similar, yet more encompassing, problem and making it quite complex. The needs here are much, much simpler and for the most part better understood, at least until implementation constraints invalidate them. In short: make request(n) mimic TCP flow control at the object level, and divorce the underlying transport buffering from the application constraints, so we can avoid issues like the TCP interaction between Nagle and Delayed ACKs without devolving to implementation-specific workarounds.

The only case where I think a pure transport protocol (akin to TCP, WebSocket, PGM, etc.) would need to be spec'd is reliable multicast... But I don't think a custom transport protocol for reliable multicast is needed. There are already several reliable multicast protocols that can fit, and there could be more on the horizon as well. However, an application control protocol on top would be needed.

danarmak commented 10 years ago

I think a way to send Reactive Streams on top of HTTP would be useful too. There are many places you can reach with HTTP but not with a custom TCP based protocol, because of firewalls blocking everything else, and because of server components running inside HTTP app servers.

To clarify, I'm certainly not saying that only HTTP should be supported. HTTP wouldn't be as efficient or nearly as convenient as a custom solution.

tmontgomery commented 10 years ago

HTTP/1.1 is problematic for the semantics needed, specifically streaming from server to client without resorting to long polling. However, HTTP/1.1 with WebSocket can work, and is mostly OK wrt app servers and load balancers; i.e. there are numerous solutions that support WebSocket fine, but not every old component works off the shelf as required. For an old Netgear sitting in someone's living room, using HTTP/1.1 with WebSocket to connect out is fine, for example. Most 101 Switching Protocols upgrades, like WebSocket, work fine with firewall traversal. Most modern app servers support WebSocket as well, if not full JSR 356 compliance, even if app servers are a terrible model for deployment. Load balancers are a mixed bag, but good solutions exist and more are coming.

HTTP/2 (or SPDY) is another option, but brings a lot of mandatory baggage.

BTW, I am talking at QCon NY on this stuff and will be mentioning Reactive Streams quite a bit.

viktorklang commented 10 years ago

HTTP/2 (or SPDY) is another option, but brings a lot of mandatory baggage.

What's your take on this: http://lists.w3.org/Archives/Public/ietf-http-wg/2014AprJun/0815.html ?

BTW, I am talking at QCon NY (https://qconnewyork.com/presentation/evolving-rest-iot-world) on this stuff and will be mentioning Reactive Streams quite a bit.

Sounds like a lot of fun, I wish I could be there!

Cheers, √

danarmak commented 10 years ago

Alternatives to HTTP/1.1 are better when available, but sadly, deployment options in many large organizations are limited to HTTP.

Here's a more general question: what transport assumptions should be made by the Reactive Streams remote protocol specification (which this conversation is about)?

Should it assume an opaque stream-type transport to be given (TCP, pipes, websockets, optional TLS, etc) or should it concern itself with lower level stuff like the underlying flow control?

Should it be couched in terms of sending messages over the underlying channel (which could be implemented on top of any message bus in theory), or in terms of managing the transmit-receive buffers (relying on having the stream to itself and trying to optimize from there)?

tmontgomery commented 10 years ago

@viktorklang the HTTP/2 WG is going through a healthy debate right now. As proposed, h2 has a lot of mandatory pieces and a lot of really ugly stuff to straighten out. The WG is suffering right now from a push driven by the pain of poor design decisions made in HTTP/1's past, a pain that is only getting more acute. I recommend the thread for anyone who is interested in the future of web protocols.

viktorklang commented 10 years ago

Standards are hard, and standards groups are a challenge. :) I'll read the thread for sure.

Cheers, √

tmontgomery commented 10 years ago

@danarmak In my opinion and experience, I would not assume a stream-based abstraction. And it is dangerous to rely on buffer sizing (if I understand your suggestion). TCP has some dark corners to work around at high speed, for example.

Here's a more general question: what transport assumptions should be made by the Reactive Streams remote protocol specification (which this conversation is about)?

Should it assume an opaque stream-type transport to be given (TCP, pipes, websockets, optional TLS, etc) or should it concern itself with lower level stuff like the underlying flow control?

The assumptions that make the most sense would be transports that span stream-based, datagram-based, and (R)DMA*. All require framing and all can use the same framing.

We should not make assumptions based on transport flow and congestion control, as the huge advantage this entire work has is application-level flow control. Besides, transport flow control doesn't exist for UDP (congestion control, yes; flow control, no) or (R)DMA (where flow is done differently).

Should it be couched in terms of sending messages over the underlying channel (which could be implemented on top of any message bus in theory), or in terms of managing the transmit-receive buffers (relying on having the stream to itself and trying to optimize from there)?

Sending framed messages over the underlying channel makes more sense.

Simply managing the transmit-receive buffers is attractive, but very tricky; see the HTTP/2 WG archive for some of the issues. It would be great to avoid this hairball. Think of it this way: double buffering is required to avoid some behaviors at speed. That introduces indirection into how flow is controlled, a little like trying to push a door closed with a rope. That is loss of control, and a spiral into more and more buffering to compensate, and thus more indirection. For more on that, I suggest looking at what has happened with the evolution and combating of buffer bloat. It's a little different, but the results are the same.

BTW, I am using (R)DMA to illustrate various techniques that involve lock-free/wait-free mechanisms on top of shared memory, whether that be InfiniBand, RoCE, PCI-e3, or good ol' L2/L3 memory.

danarmak commented 10 years ago

The assumptions that make the most sense would be transports that span stream-based, datagram-based, and (R)DMA*. All require framing and all can use the same framing.

You referred to problems with TCP. Do those problems go away if you handle flow control at the application level, while still running on top of TCP?

Does the application-level algorithm need to include TCP-specific behavior to perform well, and how many underlying transports would we end up supporting (in the spec, not in an implementation)? At a minimum, it would have to take into account whether the underlying transport does its own retrying or not, and maybe buffering related issues (TCP window scaling...), but I'm no expert on TCP. There are presumably issues with TCP routers along the way...

Would running on TCP be significantly less efficient than running the same algorithm on UDP? Would a naive algorithm on top of TCP be much worse than a TCP-optimized variant?

tmontgomery commented 10 years ago

You referred to problems with TCP. Do those problems go away if you handle flow control at the application level, while still running on top of TCP?

No, they don't go away... unless you aren't using TCP. However, the problems can be handled in a standardized way instead of being implementation specific and often not interoperable in a consistent way between implementations. So, the big advantage is consistency for interoperability. And some simplification as a result if done well.

Does the application-level algorithm need to include TCP-specific behavior to perform well, and how many underlying transports would we end up supporting (in the spec, not in an implementation)? At a minimum, it would have to take into account whether the underlying transport does its own retrying or not, and maybe buffering related issues (TCP window scaling...), but I'm no expert on TCP. There are presumably issues with TCP routers along the way...

The application-level protocol doesn't need to include any TCP-specific behavior, nor should it. Three transport "classes" suffice. Reliable, best-effort, end-to-end delivery is a decent base assumption that allows TCP, reliable multicast (like NORM), and shared memory options.

Would running on TCP be significantly less efficient than running the same algorithm on UDP? Would a naive algorithm on top of TCP be much worse than a TCP-optimized variant?

This is complex to answer; there are few absolutes. Flow control is different from congestion control, which makes this hard to answer in the general case comparing TCP and unicast UDP on arbitrary networks. Also, the big problem with TCP and multiple streams is going to be head-of-line blocking, the result of which is increased delay/latency. This is very noticeable for REST APIs, for example.

Multicast UDP opens up additional questions and tradeoffs wrt flow and congestion control that affect perceived performance. RFC 4654 is a place to look for more on congestion control. There really aren't viable flow control solutions for multicast, but the same issues exist as with congestion control; they just need different solutions.

A naive algorithm on top of TCP is almost assuredly going to be an order of magnitude (if not more) off from where it could be if appropriately optimized for TCP. It could even be worse if the naive solution reintroduces silly window syndrome on top of TCP, which is easy to do with the semantics of reactive streams.

BTW, window scaling is an artifact of TCP usage beyond the original design. Specifically, high bandwidth delay product networks, like satellite, in an effort to make them more efficient at bandwidth utilization.

drewhk commented 10 years ago

This is complex to answer. There are few absolutes. Flow control is different than congestion control. Which makes this hard to answer in a general case comparing TCP and unicast UDP on arbitrary networks. Also, the big problem with TCP and multiple streams is going to be head-of-line blocking. The results of which are increased delay/latency. This is very noticeable for REST APIs, for example.

Head-of-line blocking will happen in any bounded case, even if you have separate demultiplex buffers for the substreams. Once one of them fills up, you are still unable to dequeue the next frame from the TCP buffer, since it might belong to exactly the filled-up buffer, and you don't know where it will end up without dequeueing it. As for the general case: yes, comparing UDP, TCP or any transport protocol is impossible on arbitrary networks (just take mobile, where you have multipath fading, soft and hard handoffs, continuously changing bitrates because of adaptive modulation, and of course any form of IP mobility solution that is in the way).

A naive algorithm on top of TCP is almost assuredly going to be an order of magnitude (if not more) off where it could be if appropriately optimized for TCP. Could even be worse if the naive solution reintroduces silly window on top of TCP. Which is easy to do with the semantics of reactive streams.

Do you have an example of this?

drewhk commented 10 years ago

Correction of myself: head-of-line blocking, as in waiting for a missing piece (gap) in a sequenced stream, is obviously an issue, which causes all substreams to suffer the potential RTT of a redelivery of a missing piece from unrelated substreams. Solving this issue obviously needs a proper substream channel at the transport level (i.e. it cannot be built on top of TCP).

viktorklang commented 10 years ago

Which means something like SCTP, QUIC, etc?

Cheers, √

tmontgomery commented 10 years ago

A naive algorithm on top of TCP is almost assuredly going to be an order of magnitude (if not more) off where it could be if appropriately optimized for TCP. Could even be worse if the naive solution reintroduces silly window on top of TCP. Which is easy to do with the semantics of reactive streams.

Do you have an example of this?

The link to the interaction of Nagle and Delayed ACKs that I posted is one. Just keep playing with the send sizes in relation to the buffer sizes, as well as larger round-trip times. It can get very bad.
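For reference, the usual implementation-specific workaround alluded to in this thread is disabling Nagle via the standard java.net socket option; the endpoint below is a placeholder:

    import java.io.IOException;
    import java.net.Socket;

    class NoNagle {
        public static void main(String[] args) throws IOException {
            // example.com:9000 is a placeholder endpoint
            try (Socket socket = new Socket("example.com", 9000)) {
                // Disable Nagle's algorithm so small writes are flushed
                // immediately instead of waiting to coalesce, sidestepping
                // the Nagle/Delayed-ACK interaction at the cost of more,
                // smaller segments on the wire.
                socket.setTcpNoDelay(true);
            }
        }
    }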

tmontgomery commented 10 years ago

@drewhk @viktorklang

Which means something like SCTP, QUIC, etc?

Something like it, yes.

drewhk commented 10 years ago

The link to interaction of Nagle and Delayed ACKs that I posted is one. Just keep playing with the send sizes in relation to buffer sizes as well as larger round trip time. It can get very bad.

Ok, but I thought you were talking about onNext/request from the Subscriber side. In fact, this is exactly why I like to isolate the local backpressure schedule from the link flow-control/batching/framing schedule.

We discussed things with @rkuhn over some beers, and he convinced me that substream request signals should actually be propagated through the channel (to avoid the scenario I sketched, where the demultiplexer is blocked because one of the substream output buffers is full). So I think the best option is that the local Publisher representing a substream should bridge the request/onNext schedule of its local consumers and buffer/progress according to its own schedule, one that best fits the underlying transport/channel.

The bridging, for example, prevents a scenario where the local Subscriber operates strictly in a stop-and-wait fashion (always requesting exactly one element after each consumed one): the local bridge can shield the underlying channel from this schedule and work in a proper windowed mode, while still serving the local Subscriber in the way it wants. It still sends its own request signals through the channel, though, so that if its buffer ever becomes full, other substreams can still progress: the sender will not send frames corresponding to the blocked substream, and therefore the demux will not block.
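A minimal sketch of such a bridge (all names, including the Channel interface, are invented, and threading is ignored): the local side may request one element at a time, while the wire side keeps a window of demand outstanding and is topped up in batches.

    import java.util.ArrayDeque;
    import java.util.Queue;

    // Hypothetical bridge between a network channel and a local Subscriber.
    // Local demand and wire demand run on independent schedules.
    final class SubstreamBridge<T> {
        interface Channel { void sendRequestN(int streamId, long n); }

        private final Channel channel;
        private final int streamId;
        private final int window = 64;            // wire-side window
        private final Queue<T> buffer = new ArrayDeque<>();
        private long localDemand;                 // what the local Subscriber asked for
        private long inFlight;                    // requested on the wire, not yet arrived

        SubstreamBridge(Channel channel, int streamId) {
            this.channel = channel;
            this.streamId = streamId;
            inFlight = window;
            channel.sendRequestN(streamId, window); // open a full window immediately
        }

        // Called when a frame for this substream arrives from the demux.
        void onFrame(T element) {
            inFlight--;
            buffer.add(element);
            drain();
            if (inFlight <= window / 2) {         // top up the wire window in batches
                channel.sendRequestN(streamId, window - inFlight);
                inFlight = window;
            }
        }

        // Called when the local Subscriber requests more (possibly one at a time).
        void request(long n) { localDemand += n; drain(); }

        private void drain() {
            while (localDemand > 0 && !buffer.isEmpty()) {
                deliver(buffer.poll());           // onNext to the local Subscriber
                localDemand--;
            }
        }

        private void deliver(T element) { /* subscriber.onNext(element) */ }
    }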

drewhk commented 10 years ago

Which means something like SCTP, QUIC, etc?

Well, SCTP is not as widely available, so that leaves QUIC, which uses UDP. Or we build something on top of UDP, taking the parts of QUIC that are relevant.

Btw, isn't the number of available substreams in SCTP or QUIC limited? It was long ago that I read the specs, but I recall a max of 64 subchannels in SCTP.

tmontgomery commented 10 years ago

@drewhk SCTP is used with WebRTC now, so it's slightly more common; WebRTC uses it over DTLS over UDP. Normal uses of SCTP ran over IP directly (IP protocol number 132, IIRC), which presents some additional barriers to overcome.

The SCTP stream field is 16 bits, but implementations may limit this to fewer than the 64K available. I'm not sure about QUIC's limits offhand. The state management of substreams/channels doesn't come for free; there is always some limit beyond just the field size.

Some messaging systems can support substreams without head-of-line blocking. 0MQ and Ultra Messaging are the immediate ones that come to mind, but others might also.

Also, there is some other work being done in this area that will be public in a few months.

I don't think we will need to do our own transport protocol. We can ride on top of existing ones if we design it well enough.

viktorklang commented 10 years ago

Is this something that will be pursued? (i.e. should we close it, or does someone want to "drive" it?) @reactive-streams/contributors