Multicast Requirement

benjchristensen commented 10 years ago

Currently the spec states "A Publisher can serve multiple subscribers subscribed dynamically at various points in time. In the case of multiple subscribers the Publisher should respect the processing rates of all of its subscribers (possibly allowing for a bounded drift between them)."

I think it is a mistake to complicate the spec by requiring implementations to support multicasting, and therefore the management of subscriptions over time. In fact, I think the spec should expressly stick to unicast (new lifecycle per Subscription).

Multicasting techniques should be layered on top by libraries, not required of the Publisher instances themselves.

For example, each Subscriber could result in a new network connection, open file, etc. This would be a basic implementation.

In Rx this greatly simplifies things and is a good separation of concern. Multicasting can be added and done with different behaviors such as replaying all of history, a portion of history, the last value only, or ignoring history and just starting from now onwards, etc.

In other words, someone providing me a Publisher should not concern themselves with multiple subscriptions, how I want to multicast or other such things. If I subscribe it should start a new lifecycle, emit the data as I request it and stop when I unsubscribe.

This keeps the mental model clear, the Publisher implementations simple and allows libraries a simple contract to interface with.
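
To make the unicast model concrete, here is a minimal sketch (not taken from the linked examples). It assumes the interface and method names proposed later in this thread (signalAdditionalDemand, onCompleted), a hypothetical RangePublisher, and purely synchronous, single-threaded delivery. Each call to subscribe starts its own lifecycle with its own state.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal copies of the types discussed in this thread (names as proposed here,
// not necessarily the final Reactive Streams API).
interface Subscription {
    void signalAdditionalDemand(int n);
    void cancel();
}

interface Subscriber<T> {
    void onSubscribe(Subscription s);
    void onNext(T t);
    void onError(Throwable t);
    void onCompleted();
}

interface Publisher<T> {
    void subscribe(Subscriber<T> s);
}

// Hypothetical unicast publisher: every subscribe() starts an independent
// lifecycle that counts from 0 up to (but excluding) `limit`.
final class RangePublisher implements Publisher<Integer> {
    private final int limit;

    RangePublisher(int limit) { this.limit = limit; }

    @Override
    public void subscribe(Subscriber<Integer> subscriber) {
        subscriber.onSubscribe(new Subscription() {
            private int next = 0;             // per-subscription state, never shared
            private long demand = 0;
            private boolean done = false;
            private boolean emitting = false;

            @Override
            public void signalAdditionalDemand(int n) {
                if (done || n <= 0) return;
                demand += n;
                if (emitting) return;         // re-entrant call: the outer loop picks it up
                emitting = true;
                while (!done && demand > 0 && next < limit) {
                    demand--;
                    subscriber.onNext(next++);
                }
                emitting = false;
                if (!done && next == limit) {
                    done = true;
                    subscriber.onCompleted();
                }
            }

            @Override
            public void cancel() { done = true; }
        });
    }
}

final class UnicastDemo {
    // Subscribes, requests `demand` items once and returns whatever arrived.
    static List<Integer> collect(Publisher<Integer> p, int demand) {
        List<Integer> received = new ArrayList<>();
        p.subscribe(new Subscriber<Integer>() {
            @Override public void onSubscribe(Subscription s) { s.signalAdditionalDemand(demand); }
            @Override public void onNext(Integer t) { received.add(t); }
            @Override public void onError(Throwable t) { throw new IllegalStateException(t); }
            @Override public void onCompleted() { }
        });
        return received;
    }

    public static void main(String[] args) {
        Publisher<Integer> p = new RangePublisher(5);
        System.out.println(collect(p, 3));   // prints [0, 1, 2]
        System.out.println(collect(p, 5));   // prints [0, 1, 2, 3, 4]
    }
}
```

The second subscriber starts again from 0 because nothing is shared between subscriptions; any multicast behaviour would be layered on top by a library.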

benjchristensen commented 10 years ago

I have committed unicast and multicast examples using these types at https://github.com/reactive-streams/reactive-streams/tree/simplest-viable-types/spi/src/examples/java/org/reactivestreams/example

To see them run, execute these classes:

MulticastExample.java => https://github.com/reactive-streams/reactive-streams/blob/simplest-viable-types/spi/src/examples/java/org/reactivestreams/example/multicast/MulticastExample.java

UnicastExample.java => https://github.com/reactive-streams/reactive-streams/blob/simplest-viable-types/spi/src/examples/java/org/reactivestreams/example/unicast/UnicastExample.java

rkuhn commented 10 years ago

@sirthias The contract of Publisher would require that you can always subscribe multiple times (modulo actual failures of the stream, but that is unrelated). Therefore your reasoning about LSP is inverted.

@benjchristensen For the reason above, Akka (or any other implementation) cannot introduce a SingleUsePublisher in a way that makes sense and allows interoperability. (That Rx uses Observable for this purpose only serves to prove my point: you use it where a full-blown Publisher is not appropriate, and it does not interoperate with other Reactive Streams implementations directly.)

Due to the necessity of merging the request streams from multiple Subscriptions, a Publisher is limited in which kinds of concurrent data structures it uses (having only a single producer allows less synchronization overhead compared to a solution which must cope with multiple producers). In order to allow these optimizations, a SingleUsePublisher makes a lot of sense.

Of course your examples work, and of course it is possible to always implement Publisher, but it is unnecessarily limiting on the performance optimization side of things. (I can see how at least in the example you didn’t care, using ArrayBlockingQueue and AtomicInteger.getAndSet instead of a wait-free SPSC queue and lazySet.) I am sure that there are also others who would be interested in being able to optimize things. @normanmaurer @jbrisbin, care to offer an opinion?

The other point which apparently was not presented well enough: if we say that data sources are not required to support multi-cast (the original purpose of this Issue), then we need to include a type that makes this possible. Publisher forces all sources to support multi-cast that do not have the luxury of duplicating themselves, like the already accepted TCP connection on a server socket. This has nothing to do with Akka, it is a matter of matching up wording and intent.

sirthias commented 10 years ago

@rkuhn Ok. That would mean that singleUsePublisher.subscribe(subscriber) is guaranteed to call subscriber.onSubscribe for the very first call and (instead) subscriber.onError for all subsequent ones. Since Publisher on the other hand would be required to accept unlimited numbers of subscribers publisher.subscribe(subscriber) would guarantee that subscriber.onError is never called instead of subscriber.onSubscribe. Then, however, Publisher extends SingleUsePublisher would violate LSP because code relying on subscribe to call onError for the second subscriber would break if I replace singleUsePublisher with publisher, no?

rkuhn commented 10 years ago

@sirthias No, SingleUsePublisher would be free to do whatever it wants after the first subscribe, meaning that a Publisher is a valid SingleUsePublisher as well. (The guarantee you made up was a bit stricter than what I had in mind ;-) )

sirthias commented 10 years ago

@rkuhn Ah, ok. So SingleUsePublisher must support at least one subscription, while Publisher would have to support an unlimited number of subscribers. However, if this is what you have in mind then I agree with @benjchristensen in that adding SingleUsePublisher doesn't appear to add a lot of value and we could also just collapse down to a single Publisher type with semantics "one or more subscribers".

rkuhn commented 10 years ago

@DougLea Concerning the extra round-trip during subscription establishment, you are right in principle, but saving that one-time overhead would have a cost in API duplication (i.e. two ways to do “essentially the same thing”) that I don’t think is properly justified by the benefits. There is also some benefit in only starting to produce elements when the Subscriber has acknowledged that it is ready (since the Subscriber will usually not be the one who invokes subscribe), meaning that we would have to allow a value of 0 which is not legal for subsequent request calls.

About the buffer management: that is completely up to the Subscriber, the Publisher/Subscription only is informed about the currently outstanding demand (which may or may not be related). Hence there is no need to inform the Publisher about this internal choice of the Subscriber.

rkuhn commented 10 years ago

@sirthias Well, so far @benjchristensen has been adamant about not allowing a Publisher to opt out of supporting an unlimited number of Subscriptions; that is what this discussion is all about.

The idea of the SingleUsePublisher is that implementing it does not say “at least one Subscription”, it says “exactly one Subscription”. And using a SingleUsePublisher means “I will only subscribe once”. Supporting more than one Subscription obviously fulfills the implementation contract, and using a Publisher as if it was a SingleUsePublisher also obviously is not problematic.

sirthias commented 10 years ago

My point was that, semantically, we can have

ExactlyOncePublisher extends AtLeastOncePublisher

and/or

AlwaysManyPublisher extends AtLeastOncePublisher

but ExactlyOncePublisher and AlwaysManyPublisher cannot directly be in any kind of sub-typing relationship. If @benjchristensen really deems the AlwaysManyPublisher semantics as crucial then the only option is introducing an AtLeastOncePublisher supertype (with a proper name), which has the option of rejecting subsequent subscriptions with subscriber.onError. I guess that is your proposal. (I was confused by the SingleUsePublisher name).

viktorklang commented 10 years ago

Just my 2c: from what I read, @benjchristensen wants Publisher to be the inverse of Iterable, while others are saying that we need not only the inverse of Iterable but also the inverse of Iterator (i.e. single consume). I think this has merit. But perhaps, instead of trying to shoehorn it into a hierarchy, we could just do the same thing (but inverted) as with Iterable and Iterator: have the Publisher be able to produce something that is consume-once. And don't we already have that? It's a Subscription.

A Subscription can only ever be connected to one Subscriber. A Publisher can create many Subscriptions.

Or?

DougLea commented 10 years ago

I don't understand the need for Publisher subtypes if the subscribe method can always fail with an exception anyway? Although to help people avoid them in common cases, this could be replaced with boolean trySubscribe()

benjchristensen commented 10 years ago

That Rx uses Observable for this purpose only serves to prove my point: you use it where a full-blown Publisher is not appropriate, and it does not interoperate with other Reactive Streams implementations directly.

@rkuhn If the rx.Observable were to implement org.reactivestreams.Publisher and rx.Subscriber implement org.reactivestreams.Subscriber, it would, wouldn't it? I don't see how it serves a point requiring a single use publisher.

I can see how at least in the example you didn’t care, using ArrayBlockingQueue and AtomicInteger.getAndSet instead of a wait-free SPSC queue and lazySet.

I was implementing an example with the least amount of dependencies - the JDK. Nothing I did in the example limits the underlying implementation from using alternate queues, even Disruptor. Is there anything I'm not seeing in those examples that limits more performant implementations?

if we say that data sources are not required to support multi-cast (the original purpose of this Issue), then we need to include a type that makes this possible.

Let's not confuse multi-subscribe with multi-cast. A Publisher can be subscribed to multiple times, but it is free to control each Subscription and make them be unicast or multicast. That is why I started this issue: to decouple subscribing from the requirement to support multicast. It does not mean multicast can't or shouldn't be done though.

As @viktorklang states above, we already have 2 of the 3 types supporting "single use", both the Subscription and Subscriber. It is only the factory type, the Publisher that supports reuse with multi-subscribe.

Publisher forces all sources to support multi-cast that do not have the luxury of duplicating themselves

No, that is the choice of the Publisher, not the underlying source. If an implementation really does want to limit the source to a single Subscriber, then it emits onError when someone attempts to subscribe.

Even in the SingleUsePublisher, it still exposes subscribe which will have to defend against multiple subscriptions and either throw an exception or emit onError. Thus, there is no difference between Publisher and SingleUsePublisher in their defense against multi-subscribe if they have a use case that can't support it. This is I believe what @DougLea is also stating above.
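
A minimal sketch of that refusal, assuming the type names proposed in this thread and a hypothetical OneShotPublisher that fronts a source which cannot be duplicated. The first subscriber is served; any later one receives onError instead of onSubscribe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

interface Subscription { void signalAdditionalDemand(int n); void cancel(); }

interface Subscriber<T> {
    void onSubscribe(Subscription s);
    void onNext(T t);
    void onError(Throwable t);
    void onCompleted();
}

interface Publisher<T> { void subscribe(Subscriber<T> s); }

// Hypothetical publisher for a single-use source (think: one already-accepted
// TCP connection). For brevity it delivers exactly one value.
final class OneShotPublisher<T> implements Publisher<T> {
    private final AtomicBoolean taken = new AtomicBoolean(false);
    private final T value;

    OneShotPublisher(T value) { this.value = value; }

    @Override
    public void subscribe(Subscriber<T> subscriber) {
        if (!taken.compareAndSet(false, true)) {
            // Refuse the subscription: onError is signalled instead of onSubscribe.
            subscriber.onError(new IllegalStateException("only one subscription is supported"));
            return;
        }
        subscriber.onSubscribe(new Subscription() {
            private boolean done = false;

            @Override
            public void signalAdditionalDemand(int n) {
                if (done || n <= 0) return;
                done = true;
                subscriber.onNext(value);
                subscriber.onCompleted();
            }

            @Override
            public void cancel() { done = true; }
        });
    }
}

final class RejectDemo {
    // Subscribes once and returns the sequence of signals that were observed.
    static List<String> events(Publisher<String> p) {
        List<String> log = new ArrayList<>();
        p.subscribe(new Subscriber<String>() {
            @Override public void onSubscribe(Subscription s) { log.add("onSubscribe"); s.signalAdditionalDemand(1); }
            @Override public void onNext(String t) { log.add("onNext(" + t + ")"); }
            @Override public void onError(Throwable t) { log.add("onError"); }
            @Override public void onCompleted() { log.add("onCompleted"); }
        });
        return log;
    }

    public static void main(String[] args) {
        Publisher<String> p = new OneShotPublisher<>("hello");
        System.out.println(events(p));   // prints [onSubscribe, onNext(hello), onCompleted]
        System.out.println(events(p));   // prints [onError]
    }
}
```

The defensive check lives in subscribe either way, which is why a separate single-use type would not remove it.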

have the Publisher be able to produce something that is consume once, and don't we already have that, it's a Subscription?

@viktorklang I agree with this and that's how I see it. The Publisher allows multi-subscribe (but could onError if it wants to enforce single-subscribe) and then Subscription and Subscriber are single-use only.

Since the ability to reject a subscription needs to be further clarified, I'd update the specifications to be something like this (note the 4th bullet point):

A Subscriber can be used once-and-only-once to subscribe to a Publisher.
A Subscription can be used once-and-only-once to represent a subscription by a Subscriber to a Publisher.
The Publisher.subscribe method can be called as many times as wanted as long as it is with a different Subscriber each time. It is up to the Publisher whether underlying streams are shared or not.
A Publisher can refuse subscriptions (calls to subscribe) if it is unable or unwilling to serve them (overwhelmed, fronting a single-use data source, etc.) and can do so by immediately calling Subscriber.onError on whatever Subscriber calls subscribe.
Events sent to a Subscriber can only be sent sequentially (no concurrent notifications).
Once an onComplete or onError is sent, no further events can be sent.
Once a Subscription is cancelled, the Publisher will stop sending events as soon as it can.
A Publisher will never send more onNext events than have been requested via the Subscription.request/signalDemand method.
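
Some of these rules can be checked mechanically. The sketch below is a hypothetical ContractChecker (not part of any proposal here) that wraps a Subscriber and throws if the Publisher sends more onNext events than were requested, or sends anything after a terminal event. It assumes the method names used in this thread and single-threaded delivery, so it does not check the rule about sequential notifications.

```java
interface Subscription { void signalAdditionalDemand(int n); void cancel(); }

interface Subscriber<T> {
    void onSubscribe(Subscription s);
    void onNext(T t);
    void onError(Throwable t);
    void onCompleted();
}

interface Publisher<T> { void subscribe(Subscriber<T> s); }

// Hypothetical wrapper that enforces part of the contract listed above.
final class ContractChecker<T> implements Subscriber<T> {
    private final Subscriber<T> delegate;
    private long outstanding = 0;        // requested but not yet delivered
    private boolean subscribed = false;
    private boolean terminated = false;

    ContractChecker(Subscriber<T> delegate) { this.delegate = delegate; }

    @Override
    public void onSubscribe(Subscription s) {
        if (subscribed) throw new IllegalStateException("Subscriber used more than once");
        subscribed = true;
        delegate.onSubscribe(new Subscription() {
            @Override
            public void signalAdditionalDemand(int n) {
                if (n > 0) outstanding += n;   // record demand before the publisher can react
                s.signalAdditionalDemand(n);
            }

            @Override
            public void cancel() { s.cancel(); }
        });
    }

    @Override
    public void onNext(T t) {
        if (terminated) throw new IllegalStateException("onNext after a terminal event");
        if (outstanding == 0) throw new IllegalStateException("more onNext events than requested");
        outstanding--;
        delegate.onNext(t);
    }

    @Override
    public void onError(Throwable t) { markTerminated(); delegate.onError(t); }

    @Override
    public void onCompleted() { markTerminated(); delegate.onCompleted(); }

    private void markTerminated() {
        if (terminated) throw new IllegalStateException("second terminal event");
        terminated = true;
    }
}

final class ContractDemo {
    // Deliberately broken publisher: ignores demand and always pushes `count` items.
    static Publisher<Integer> pushy(int count) {
        return subscriber -> subscriber.onSubscribe(new Subscription() {
            @Override public void signalAdditionalDemand(int n) {
                for (int i = 0; i < count; i++) subscriber.onNext(i);
                subscriber.onCompleted();
            }
            @Override public void cancel() { }
        });
    }

    // Returns true if the checker caught a violation while requesting `demand` items.
    static boolean violatesContract(Publisher<Integer> p, int demand) {
        Subscriber<Integer> sink = new Subscriber<Integer>() {
            @Override public void onSubscribe(Subscription s) { s.signalAdditionalDemand(demand); }
            @Override public void onNext(Integer t) { }
            @Override public void onError(Throwable t) { }
            @Override public void onCompleted() { }
        };
        try {
            p.subscribe(new ContractChecker<>(sink));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(violatesContract(pushy(3), 3));   // prints false
        System.out.println(violatesContract(pushy(3), 1));   // prints true
    }
}
```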

rkuhn commented 10 years ago

Okay, we seem to converge (with my noted protest against the too amorphous Publisher type), but before making closing remarks, I’ll wait for @jbrisbin and @normanmaurer to chime in.

sirthias commented 10 years ago

For me there is one remaining question with regard to the Publisher semantics: What are the implications for Processor, i.e. a Publisher with its own upstream?

If a Publisher is regarded as a factory for independent unicast streams, shouldn't a Processor then be a factory for its processing logic? And wouldn't this entail that a Processor would have to open another subscription to its upstream whenever it receives an additional subscribe from its downstream side, so it can chain in a fresh incarnation of its transformation logic? How would we be able to avoid the overhead resulting from such multiple "parallel" upstream subscriptions? Sorry if I am not seeing things right here...

rkuhn commented 10 years ago

We will have to revisit the role of Processor in the context of #25 after we concluded this one, let’s not mix everything together.

sirthias commented 10 years ago

Ok, no problem. The question of how "secondary" Publishers (i.e. a Publisher interface for transformation logic attached to an upstream Publisher) are meant to work appears to be tightly related though, no matter whether we actually have a dedicated Processor abstraction for them or not.

rkuhn commented 10 years ago

With the currently pending compromise we are saying that a Publisher is entirely free to do whatever it wants when it receives a subscription request: it could allow fan-out of the very same stream with any semantics it chooses, or it could connect back to the real source a second time, or it could reject the request (or other options which escape my fantasy). This is the lowest common denominator that allows all implementations to work as they want, which is not entirely surprising to me since the production of data streams is outside of the core scope of this specification, the purpose of which is to define how to transfer data across an asynchronous boundary.

sirthias commented 10 years ago

I see. This means that an implementation will probably pick a particular fanout style (like "independent unicast" or "exactly-once" or whatever) and apply it throughout its own universe. On the boundaries between implementations we won't be able to rely on any concrete fan-out semantics though, which will effectively make fan-out unusable at these "gateway" points (if you don't know what implementation is actually backing a certain Publisher instance). IMHO this is not necessarily a bad thing, just something to keep in mind.

benjchristensen commented 10 years ago

which will effectively make fan-out unusable at these "gateway" points

Can you elaborate on a use case where interop is hindered by not knowing? I can't think right now of why a consumer would need to know how a producer functions (multicast vs unicast) and would like to understand.

sirthias commented 10 years ago

@benjchristensen I didn't want to imply that your proposal hinders interop. All I meant was that, when I have a Publisher of uncertain origin I can't know whether I can subscribe multiple times or not. Which is not really a problem because I can (and should) attach my own fan-out logic in these cases, if I need it. So, overall, the proposal enables max flexibility in the realms of the individual implementations while allowing for all the interop that is required. To me this sounds quite like what a common API needs to provide (i.e. I like it).
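
As an illustration of attaching one's own fan-out logic, here is a rough sketch of a hypothetical LockStepFanOut. It subscribes to the unknown Publisher exactly once and only requests from it what every downstream Subscriber has asked for, so the slowest Subscriber sets the pace. It assumes the type names from this thread, single-threaded use, an upstream that does not emit re-entrantly, and it leaves out cancellation; other policies (replay, drop, bounded drift) are equally possible.

```java
import java.util.ArrayList;
import java.util.List;

interface Subscription { void signalAdditionalDemand(int n); void cancel(); }
interface Subscriber<T> {
    void onSubscribe(Subscription s);
    void onNext(T t);
    void onError(Throwable t);
    void onCompleted();
}
interface Publisher<T> { void subscribe(Subscriber<T> s); }

// Hypothetical fan-out stage: one upstream subscription, many downstream subscribers.
final class LockStepFanOut<T> implements Publisher<T>, Subscriber<T> {
    private final List<Subscriber<T>> downstream = new ArrayList<>();
    private final List<long[]> demands = new ArrayList<>();   // one counter per downstream
    private Subscription upstream;

    @Override
    public void subscribe(Subscriber<T> s) {
        if (upstream != null) {   // late subscribers are refused via onError
            s.onError(new IllegalStateException("fan-out is already connected"));
            return;
        }
        long[] demand = new long[1];
        downstream.add(s);
        demands.add(demand);
        s.onSubscribe(new Subscription() {
            @Override public void signalAdditionalDemand(int n) {
                if (n > 0) { demand[0] += n; pull(); }
            }
            @Override public void cancel() { /* omitted in this sketch */ }
        });
    }

    // Opens the single upstream subscription once all downstreams are registered.
    void connect(Publisher<T> source) { source.subscribe(this); }

    // Requests from upstream only what every downstream has asked for.
    private void pull() {
        if (upstream == null || demands.isEmpty()) return;
        long min = Integer.MAX_VALUE;
        for (long[] d : demands) min = Math.min(min, d[0]);
        if (min <= 0) return;
        for (long[] d : demands) d[0] -= min;
        upstream.signalAdditionalDemand((int) min);
    }

    @Override public void onSubscribe(Subscription s) { upstream = s; pull(); }
    @Override public void onNext(T t) { for (Subscriber<T> s : downstream) s.onNext(t); }
    @Override public void onError(Throwable t) { for (Subscriber<T> s : downstream) s.onError(t); }
    @Override public void onCompleted() { for (Subscriber<T> s : downstream) s.onCompleted(); }
}

// Endless counter used as the "Publisher of uncertain origin"; it records how
// often it was subscribed to.
final class CountingPublisher implements Publisher<Integer> {
    int subscriptions = 0;

    @Override
    public void subscribe(Subscriber<Integer> s) {
        subscriptions++;
        s.onSubscribe(new Subscription() {
            private int next = 0;
            @Override public void signalAdditionalDemand(int n) {
                for (int i = 0; i < n; i++) s.onNext(next++);
            }
            @Override public void cancel() { }
        });
    }
}

// Subscriber whose demand is driven from outside, for the demo.
final class Collector implements Subscriber<Integer> {
    final List<Integer> received = new ArrayList<>();
    Subscription subscription;

    @Override public void onSubscribe(Subscription s) { subscription = s; }
    @Override public void onNext(Integer t) { received.add(t); }
    @Override public void onError(Throwable t) { }
    @Override public void onCompleted() { }
}

final class FanOutDemo {
    public static void main(String[] args) {
        CountingPublisher source = new CountingPublisher();
        LockStepFanOut<Integer> fanOut = new LockStepFanOut<>();
        Collector fast = new Collector();
        Collector slow = new Collector();
        fanOut.subscribe(fast);
        fanOut.subscribe(slow);
        fanOut.connect(source);
        fast.subscription.signalAdditionalDemand(5);
        slow.subscription.signalAdditionalDemand(2);
        System.out.println(fast.received);          // prints [0, 1]
        slow.subscription.signalAdditionalDemand(3);
        System.out.println(fast.received);          // prints [0, 1, 2, 3, 4]
        System.out.println(source.subscriptions);   // prints 1
    }
}
```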

benjchristensen commented 10 years ago

@sirthias got it, and yes I agree with what you say. Thanks for clarifying. Whatever we decide here could be with us for a long while if we do it well, so understanding all use cases and keeping this design as simple and flexible as possible is very important.

we are saying that a Publisher is entirely free to do whatever it wants when it receives a subscription request:

@rkuhn Yes, I agree with this statement and think this is exactly what we need in something intended to be used by any and all implementations including (hopefully) use cases we can't think of right now.

the production of data streams is outside of the core scope of this specification, the purpose of which is to define how to transfer data across an asynchronous boundary

I like this focus on scope.

jbrisbin commented 10 years ago

I agree with the behavior @benjchristensen outlined here.

The more I've played with variations on this, the more convinced I am that as far as this and the TCK goes, the simpler we can keep it the better. Part of me wants to see direct support of callbacks (like Consumer) so we can use lambdas and method references natively, but the other part of me sees that library implementations will have to provide indirection to their own callback types anyway, so I suppose it's better to keep it simple and not support that directly here. It reduces a layer to worry about and each implementation will have their own incompatible callbacks anyway (Scala fn vs Groovy closure vs Java lambda, etc...).

mariusae commented 10 years ago

I like this. It seems hard to make it simpler.

In the interest of fully specifying: is a Publisher allowed to send fewer than the requested events even if they are available (e.g. due to buffering concerns)? (“A Publisher will never send more onNext events than have been requested via the Subscription.request/signalDemand method.”)
signalAdditionalDemand implies that the demand is cumulative. Is this the right model? Having it be absolute would allow for more adaptive queuing — e.g. to pause the stream you can call signal(0); more generally you can apply adaptive policies (e.g. a computation may be more expensive than anticipated, so you could adjust demand accordingly.) Of course you could do the same by passing negative demand, but that isn’t specified here, and I’m not sure it’s the right API. On the other hand, making it cumulative does rule out potential disagreements about event count (e.g. subscriber vs. publisher) which could even be caused by race conditions, since control can happen from an outside thread.
Should we rename cancel to close? It’s possible to tie resource lifetimes to subscriptions, so close seems more accurate.

—

Here’s one way I’ve been thinking about this problem:

public interface Source<T> {
    public Handle<T> open();
}

public interface Handle<T> {
    public void read(Reader<T> reader, int n);
    public void close();
}

public interface Reader<T> {
    public void onNext(T t);
    public void onCompleted();
    public void onError(Throwable t);
}

This follows the Unix IO model closely. The basic idea is that you explicitly read a batch at a time. Batches are allowed to be short. A batch of length zero (onCompleted() without an antecedent onNext()) means EOF.

Things like buffering (e.g. the use case where you “just want to keep going”, calling signalAdditionalDemand as you go along) are handled quite naturally by interposing, say, a BufferedSource.

Also, many of the semantic corners around demand management are answered a little more naturally by the interfaces — e.g. the fact that you are allowed only one outstanding read makes things pretty clear.
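
A minimal sketch of how a read loop over these interfaces could look. ListSource and ReadLoopDemo are hypothetical names, and the sketch assumes that read completes synchronously and that each batch is terminated by onCompleted. The loop issues one read at a time and stops at the first empty batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// The three interfaces from the comment above, without the redundant modifiers.
interface Source<T> { Handle<T> open(); }
interface Handle<T> { void read(Reader<T> reader, int n); void close(); }
interface Reader<T> { void onNext(T t); void onCompleted(); void onError(Throwable t); }

// Hypothetical in-memory source; each open() returns a handle with its own position.
final class ListSource<T> implements Source<T> {
    private final List<T> items;

    ListSource(List<T> items) { this.items = new ArrayList<>(items); }

    @Override
    public Handle<T> open() {
        return new Handle<T>() {
            private int position = 0;
            private boolean closed = false;

            @Override
            public void read(Reader<T> reader, int n) {
                if (closed) {
                    reader.onError(new IllegalStateException("handle is closed"));
                    return;
                }
                // A batch may be short: never read past the end of the list.
                int end = position + Math.min(Math.max(n, 0), items.size() - position);
                while (position < end) reader.onNext(items.get(position++));
                reader.onCompleted();   // end of this batch; an empty batch means EOF
            }

            @Override
            public void close() { closed = true; }
        };
    }
}

final class ReadLoopDemo {
    // Reads batches of `batchSize` (must be positive) until EOF and returns the batch sizes.
    static List<Integer> batchSizes(Source<String> source, int batchSize) {
        List<Integer> sizes = new ArrayList<>();
        Handle<String> handle = source.open();
        int[] count = new int[1];
        Reader<String> reader = new Reader<String>() {
            @Override public void onNext(String t) { count[0]++; }
            @Override public void onCompleted() { }
            @Override public void onError(Throwable t) { throw new IllegalStateException(t); }
        };
        do {
            count[0] = 0;
            handle.read(reader, batchSize);   // synchronous in this sketch
            sizes.add(count[0]);
        } while (count[0] > 0);
        handle.close();
        return sizes;
    }

    public static void main(String[] args) {
        Source<String> source = new ListSource<>(Arrays.asList("a", "b", "c", "d", "e"));
        System.out.println(batchSizes(source, 2));   // prints [2, 2, 1, 0]
    }
}
```

The trailing 0 is the zero-length batch that signals EOF; the 1 before it is a short batch.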

benjchristensen commented 10 years ago

Hi @mariusaeriksen, thanks for the great feedback and help on this.

is a Publisher allowed to send fewer than the requested events even if they are available

I would say yes, it can send 0...requested. If it sends fewer and is finished, then it must send onError or onCompleted.

signalAdditionalDemand implies that the demand is cumulative

Agreed it's a bad name. I've been avoiding the name bike shedding thus far so we can agree on functionality and contract and then debate names.

I have been thinking of this method as requesting "n more", so it is cumulative, but the number passed in is not cumulative.

Thus, using request as the name:

request(10)
request(20)
// this would cumulatively mean that up to 30 events can be sent
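
The "n more" bookkeeping can be sketched with a single counter. DemandCounter below is a hypothetical helper, not part of the proposal; rejecting non-positive values is an assumption, since the thread has not settled what they should mean.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper a Publisher could keep per Subscription.
final class DemandCounter {
    private final AtomicLong outstanding = new AtomicLong();

    // "n more": each call adds to whatever was requested before.
    void request(int n) {
        if (n <= 0) throw new IllegalArgumentException("n must be positive");
        outstanding.addAndGet(n);
    }

    // Called by the publisher before each onNext; false means "no demand, do not emit".
    boolean tryConsume() {
        while (true) {
            long current = outstanding.get();
            if (current == 0) return false;
            if (outstanding.compareAndSet(current, current - 1)) return true;
        }
    }

    long outstanding() { return outstanding.get(); }
}

final class DemandDemo {
    public static void main(String[] args) {
        DemandCounter demand = new DemandCounter();
        demand.request(10);
        demand.request(20);
        System.out.println(demand.outstanding());   // prints 30
    }
}
```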

Source/Handle/Reader

I'm open to this naming convention.

the fact that you are allowed only one outstanding read makes things pretty clear

I've been going back and forth on this ... the thing is that because it's async, there is nothing preventing someone from calling read multiple times. If we were to enforce only having one outstanding read, it can be limiting to the ability to start the producer fetching more data as the consumer buffer is draining. For example, would it mean someone can't request more until they have received everything they previously requested?

mariusae commented 10 years ago

I've been going back and forth on this ... the thing is that because it's async, there is nothing preventing someone from calling read multiple times. If we were to enforce only having one outstanding read, it can be limiting to the ability to start the producer fetching more data as the consumer buffer is draining. For example, would it mean someone can't request more until they have received everything they previously requested?

Yes — but you compensate by buffering, and you can treat it as an orthogonal concern, e.g. by implementing a BufferingSource:

Source<T> source = ..;
Source<T> buffered = Source.buffered(source, 1024);
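
One way Source.buffered could be filled in is sketched below. This is an assumption about its behaviour, not an existing API: the wrapper refills an internal queue with one upstream read of up to capacity elements and serves smaller downstream reads from that queue. Everything is synchronous here, so it shows the batching but not real asynchronous prefetching; CountingListSource and BufferedDemo are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

interface Reader<T> { void onNext(T t); void onCompleted(); void onError(Throwable t); }
interface Handle<T> { void read(Reader<T> reader, int n); void close(); }

interface Source<T> {
    Handle<T> open();

    // Hypothetical helper matching the Source.buffered(source, 1024) call above.
    static <T> Source<T> buffered(Source<T> upstream, int capacity) {
        return () -> new Handle<T>() {
            private final Handle<T> inner = upstream.open();
            private final Deque<T> buffer = new ArrayDeque<>();   // note: rejects null elements
            private Throwable failure;

            @Override
            public void read(Reader<T> reader, int n) {
                if (buffer.isEmpty() && failure == null) {
                    // Refill with a single upstream read of up to `capacity` elements.
                    inner.read(new Reader<T>() {
                        @Override public void onNext(T t) { buffer.add(t); }
                        @Override public void onCompleted() { }
                        @Override public void onError(Throwable t) { failure = t; }
                    }, capacity);
                }
                if (buffer.isEmpty() && failure != null) {
                    reader.onError(failure);
                    return;
                }
                for (int i = 0; i < n && !buffer.isEmpty(); i++) reader.onNext(buffer.poll());
                reader.onCompleted();   // an empty batch still means EOF
            }

            @Override
            public void close() { buffer.clear(); inner.close(); }
        };
    }
}

// List-backed source that counts how many reads actually reach it.
final class CountingListSource<T> implements Source<T> {
    private final List<T> items;
    int reads = 0;

    CountingListSource(List<T> items) { this.items = items; }

    @Override
    public Handle<T> open() {
        return new Handle<T>() {
            private int position = 0;

            @Override
            public void read(Reader<T> reader, int n) {
                reads++;
                int end = position + Math.min(Math.max(n, 0), items.size() - position);
                while (position < end) reader.onNext(items.get(position++));
                reader.onCompleted();
            }

            @Override
            public void close() { }
        };
    }
}

final class BufferedDemo {
    // Reads one element at a time until EOF; returns the number of downstream reads.
    static int drainOneByOne(Source<Integer> source) {
        Handle<Integer> handle = source.open();
        int[] batch = new int[1];
        int reads = 0;
        do {
            batch[0] = 0;
            handle.read(new Reader<Integer>() {
                @Override public void onNext(Integer t) { batch[0]++; }
                @Override public void onCompleted() { }
                @Override public void onError(Throwable t) { throw new IllegalStateException(t); }
            }, 1);
            reads++;
        } while (batch[0] > 0);
        handle.close();
        return reads;
    }

    public static void main(String[] args) {
        CountingListSource<Integer> raw = new CountingListSource<>(Arrays.asList(1, 2, 3, 4, 5));
        int downstreamReads = drainOneByOne(Source.buffered(raw, 4));
        System.out.println(downstreamReads + " downstream reads, " + raw.reads + " upstream reads");
        // prints "6 downstream reads, 3 upstream reads"
    }
}
```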

benjchristensen commented 10 years ago

Yes — but you compensate by buffering, and you can treat it as an orthogonal concern, e.g. by implementing a BufferingSource:

Okay, makes sense.

I'm also looking at Dart to ensure we learn from their experience: https://api.dartlang.org/apidocs/channels/stable/dartdoc-viewer/dart:async.Stream

They use 'listen' which seems more appropriate than 'read' in a push model.

What do you think of 'Source.listen(Listener l)'?

benjchristensen commented 10 years ago

I have submitted a pull request that I propose we merge to end this particular thread of discussion: https://github.com/reactive-streams/reactive-streams/pull/37

normanmaurer commented 10 years ago

@benjchristensen damn it ... it seems I deleted your comment (summary) by mistake by clicking on the wrong button :( Can someone just copy and paste it from the emails? Sorry for the mess... It seems I cannot "revert" my change.

normanmaurer commented 10 years ago

Anyway... I really like what @benjchristensen proposed in his summary and I think it makes the contract quite clear. So I would be quite happy to have the proposal merged in.

I also like the naming that @mariusaeriksen proposed here, but as @benjchristensen pointed out, it may be better to think about names after we agree on something.

Thanks again to all of you for the hard work on this as I could not keep up with all of it over the last weeks :(

viktorklang commented 10 years ago

Roland is mid air afaik and will reply as soon as he can.

jbrisbin commented 10 years ago

I'm actually not feeling the warm fuzzies about the alternative naming above. To me it reads very IO-oriented which I don't think fits every use case. Although I caveat that by saying: calling things "sources" and "sinks" does have its advantages and is the standardized terminology used in Spring XD. I'm flexible.

But I definitely vote for merging the simplification changes (and Gradle build? :D) @benjchristensen proposed so we can get to work using it.

tmontgomery commented 10 years ago

Looking over the summary that @benjchristensen made (before it disappeared), it seems to hang together quite well. There are some niggly bits with the terms, but the behavior works, I think.

It would be good to put the summary back up, if possible.

listen is a better term for anything push-related, since it carries the association with listener semantics.

jrudolph commented 10 years ago

Here's a resurrection of @benjchristensen's comment from github's email:

Since no responses over the weekend and this thread is very long and hard to read, I'd like to summarize and ask for folks to weigh in. The proposal is as follows:

Contract

A Subscriber can be used once-and-only-once to subscribe to a Publisher.
A Subscription can be used once-and-only-once to represent a subscription by a Subscriber to a Publisher.
The Publisher.subscribe method can be called as many times as wanted as long as it is with a different Subscriber each time. It is up to the Publisher whether underlying streams are shared or not.
A Publisher can refuse subscriptions (calls to subscribe) if it is unable or unwilling to serve them (overwhelmed, fronting a single-use data source, etc.) and can do so by immediately calling Subscriber.onError on the Subscriber instance calling subscribe.
Events sent to a Subscriber can only be sent sequentially (no concurrent notifications).
Once an onComplete or onError is sent, no further events can be sent.
Once a Subscription is cancelled, the Publisher will stop sending events as soon as it can.
A Publisher will never send more onNext events than have been requested via the Subscription.request/signalDemand method.

Types

The naming of classes and methods is not part of what is being discussed here. That can be argued over after we agree upon behavior :-)

package org.reactivestreams;

public interface Publisher<T> {

    /**
     * Request {@link Subscription} to start streaming data.
     * <p>
     * This is a "factory method" and can be called multiple times, each time starting a new {@link Subscription}.
     * <p>
     * Each {@link Subscription} will work for only a single {@link Subscriber}.
     * <p>
     * A {@link Subscriber} should only subscribe once to a single {@link Publisher}.
     * 
     * @param s
     */
    public void subscribe(Subscriber<T> s);
}

package org.reactivestreams;

/**
 * A {@link Subscription} represents a one-to-one lifecycle of a {@link Subscriber} subscribing to a {@link Publisher}.
 * <p>
 * It can only be used once by a single {@link Subscriber}.
 * <p>
 * It is used to both signal desire for data and cancel demand (and allow resource cleanup).
 *
 */
public interface Subscription {
    /**
     * No events will be sent by a {@link Publisher} until demand is signalled via this method.
     * <p>
     * It can be called however often and whenever needed.
     * <p>
     * Whatever has been signalled can be sent by the {@link Publisher} so only signal demand for what can be safely handled.
     * 
     * @param n
     */
    public void signalAdditionalDemand(int n);

    /**
     * Request the {@link Publisher} to stop sending data and clean up resources.
     * <p>
     * Data may still be sent to meet previously signalled demand after calling cancel as this request is asynchronous.
     */
    public void cancel();
}

package org.reactivestreams;

/**
 * Will receive call to {@link #onSubscribe(Subscription)} once after passing an instance of {@link Subscriber} to {@link Publisher#subscribe(Subscriber)}.
 * <p>
 * No further notifications will be received until {@link Subscription#signalAdditionalDemand(int)} is called.
 * <p>
 * After signaling demand:
 * <ul>
 * <li>One or more invocations of {@link #onNext(Object)} up to the maximum number defined by {@link Subscription#signalAdditionalDemand(int)}</li>
 * <li>Single invocation of {@link #onError(Throwable)} or {@link #onCompleted()} which signals a terminal state after which no further events will be sent.</li>
 * </ul>
 * <p>
 * Demand can be signalled via {@link Subscription#signalAdditionalDemand(int)} whenever the {@link Subscriber} instance is capable of handling more.
 *
 * @param <T>
 */
public interface Subscriber<T> {
    /**
     * Invoked after calling {@link Publisher#subscribe(Subscriber)}.
     * <p>
     * No data will start flowing until {@link Subscription#signalAdditionalDemand(int)} is invoked.
     * <p>
     * It is the responsibility of this {@link Subscriber} instance to call {@link Subscription#signalAdditionalDemand(int)} whenever more data is wanted.
     * <p>
     * The {@link Publisher} will send notifications only in response to {@link Subscription#signalAdditionalDemand(int)}.
     * 
     * @param s
     *            {@link Subscription} that allows requesting data via {@link Subscription#signalAdditionalDemand(int)}
     */
    public void onSubscribe(Subscription s);

    /**
     * Data notification sent by the {@link Publisher} in response to requests to {@link Subscription#signalAdditionalDemand(int)}.
     * 
     * @param t
     */
    public void onNext(T t);

    /**
     * Failed terminal state.
     * <p>
     * No further events will be sent even if {@link Subscription#signalAdditionalDemand(int)} is invoked again.
     * 
     * @param t
     */
    public void onError(Throwable t);

    /**
     * Successful terminal state.
     * <p>
     * No further events will be sent even if {@link Subscription#signalAdditionalDemand(int)} is invoked again.
     */
    public void onCompleted();
}
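
A sketch of a Subscriber written against these proposed types. BatchingSubscriber and RangePublisher are hypothetical, the interfaces are trimmed copies of the ones above, and delivery is assumed to be synchronous and single-threaded. The Subscriber signals demand for a fixed batch and asks for the next batch only once the current one has arrived.

```java
import java.util.ArrayList;
import java.util.List;

// Trimmed copies of the proposed interfaces.
interface Subscription { void signalAdditionalDemand(int n); void cancel(); }
interface Subscriber<T> {
    void onSubscribe(Subscription s);
    void onNext(T t);
    void onError(Throwable t);
    void onCompleted();
}
interface Publisher<T> { void subscribe(Subscriber<T> s); }

// Hypothetical subscriber that keeps at most `batchSize` elements in flight.
final class BatchingSubscriber<T> implements Subscriber<T> {
    final List<T> received = new ArrayList<>();
    int requests = 0;               // how many times demand was signalled
    boolean completed = false;
    private final int batchSize;
    private Subscription subscription;
    private int remainingInBatch;

    BatchingSubscriber(int batchSize) { this.batchSize = batchSize; }

    @Override
    public void onSubscribe(Subscription s) {
        subscription = s;
        requestBatch();             // nothing flows until demand is signalled
    }

    @Override
    public void onNext(T t) {
        received.add(t);
        if (--remainingInBatch == 0) requestBatch();   // batch drained, ask for the next one
    }

    @Override public void onError(Throwable t) { throw new IllegalStateException(t); }
    @Override public void onCompleted() { completed = true; }

    private void requestBatch() {
        remainingInBatch = batchSize;   // set before signalling: delivery may be synchronous
        requests++;
        subscription.signalAdditionalDemand(batchSize);
    }
}

// Hypothetical publisher emitting 0 .. limit-1 per subscription, tolerant of
// signalAdditionalDemand being called from inside onNext.
final class RangePublisher implements Publisher<Integer> {
    private final int limit;

    RangePublisher(int limit) { this.limit = limit; }

    @Override
    public void subscribe(Subscriber<Integer> subscriber) {
        subscriber.onSubscribe(new Subscription() {
            private int next = 0;
            private long demand = 0;
            private boolean done = false;
            private boolean emitting = false;

            @Override
            public void signalAdditionalDemand(int n) {
                if (done || n <= 0) return;
                demand += n;
                if (emitting) return;   // re-entrant call: the outer loop picks it up
                emitting = true;
                while (!done && demand > 0 && next < limit) {
                    demand--;
                    subscriber.onNext(next++);
                }
                emitting = false;
                if (!done && next == limit) {
                    done = true;
                    subscriber.onCompleted();
                }
            }

            @Override
            public void cancel() { done = true; }
        });
    }
}

final class BatchingDemo {
    public static void main(String[] args) {
        BatchingSubscriber<Integer> subscriber = new BatchingSubscriber<>(4);
        new RangePublisher(10).subscribe(subscriber);
        System.out.println(subscriber.received.size() + " items in " + subscriber.requests + " requests");
        // prints "10 items in 3 requests"
    }
}
```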

normanmaurer commented 10 years ago

Thx

On 22.04.2014 at 18:06, Johannes Rudolph notifications@github.com wrote:

Here's a resurrection of @benjchristensen's comment from github's email:

Since no responses over the weekend and this thread is very long and hard to read, I'd like to summarize and ask for folks to weigh in. The proposal is as follows:

Contract

Subscriber can be used once-and-only-once to subscribe to a Publisher. a Subscription can be used once-and-only-once to represent a subscription by a Subscriber to a Publisher. The Publisher.subscribe method can be called as many times as wanted as long as it is with a different Subscriber each time. It is up to the Publisher whether underlying streams are shared or not. A Publisher can refuse subscriptions (calls to subscribe) if it is unable or unwilling to serve them (overwhelmed, fronting a single-use data sources, etc) and can do so by immediately calling Subscriber.onError on the Subscriber instance calling subscribe. Events sent to a Subscriber can only be sent sequentially (no concurrent notifications). Once an onComplete or onError is sent, no further events can be sent. Once a Subscription is cancelled, the Publisher will stop sending events as soon as it can. A Publisher will never send more onNext events than have been requested via the Subscription.request/signalDemand method. Types

Naming of classes and methods are not part of what is being discussed here. That can be argued over after we agree upon behavior :-)

package org.reactivestreams;

public interface Publisher {
/**
 * Request {@link Subscription} to start streaming data.
 * <p>
 * This is a "factory method" and can be called multiple times, each time starting a new {@link Subscription}.
 * <p>
 * Each {@link Subscription} will work for only a single {@link Subscriber}.
 * <p>
 * A {@link Subscriber} should only subscribe once to a single {@link Publisher}.
 * 
 * @param s
 */
public void subscribe(Subscriber<T> s);
} package org.reactivestreams;

/**

A {@link Subscription} represents a one-to-one lifecycle of a {@link Subscriber} subscribing to a {@link Publisher}.

It can only be used once by a single {@link Subscriber}.

It is used to both signal desire for data and cancel demand (and allow resource cleanup). / public interface Subscription { /

No events will be sent by a {@link Publisher} until demand is signalled via this method.

It can be called however often and whenever needed.

Whatever has been signalled can be sent by the {@link Publisher} so only signal demand for what can be safely handled.

@param n */ public void signalAdditionalDemand(int n);

/**

Request the {@link Publisher} to stop sending data and clean up resources.

Data may still be sent to meet previously signalled demand after calling cancel as this request is asynchronous. */ public void cancel(); } package org.reactivestreams;

/**
 * Will receive call to {@link #onSubscribe(Subscription)} once after passing an instance of {@link Subscriber} to {@link Publisher#subscribe(Subscriber)}.
 * <p>
 * No further notifications will be received until {@link Subscription#signalAdditionalDemand(int)} is called.
 * <p>
 * After signaling demand:
 * <ul>
 * <li>One or more invocations of {@link #onNext(Object)} up to the maximum number defined by {@link Subscription#signalAdditionalDemand(int)}</li>
 * <li>Single invocation of {@link #onError(Throwable)} or {@link #onCompleted()} which signals a terminal state after which no further events will be sent.</li>
 * </ul>
 * <p>
 * Demand can be signalled via {@link Subscription#signalAdditionalDemand(int)} whenever the {@link Subscriber} instance is capable of handling more.
 *
 * @param <T>
 */
public interface Subscriber<T> {
    /**
     * Invoked after calling {@link Publisher#subscribe(Subscriber)}.
     * <p>
     * No data will start flowing until {@link Subscription#signalAdditionalDemand(int)} is invoked.
     * <p>
     * It is the responsibility of this {@link Subscriber} instance to call {@link Subscription#signalAdditionalDemand(int)} whenever more data is wanted.
     * <p>
     * The {@link Publisher} will send notifications only in response to {@link Subscription#signalAdditionalDemand(int)}.
     *
     * @param s
     *            {@link Subscription} that allows requesting data via {@link Subscription#signalAdditionalDemand(int)}
     */
    public void onSubscribe(Subscription s);

    /**
     * Data notification sent by the {@link Publisher} in response to requests to {@link Subscription#signalAdditionalDemand(int)}.
     *
     * @param t
     */
    public void onNext(T t);

    /**
     * Failed terminal state.
     * <p>
     * No further events will be sent even if {@link Subscription#signalAdditionalDemand(int)} is invoked again.
     *
     * @param t
     */
    public void onError(Throwable t);

    /**
     * Successful terminal state.
     * <p>
     * No further events will be sent even if {@link Subscription#signalAdditionalDemand(int)} is invoked again.
     */
    public void onCompleted();
}
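To make the proposed types concrete, here is a minimal, single-threaded sketch of a unicast Publisher. The three interfaces are re-declared as nested types only so that the file compiles on its own; `RangePublisher`, `Recorder` and `demo` are hypothetical names that do not appear in the proposal. The sketch assumes demand is signalled from outside `onNext`, so it does not deal with re-entrancy or threading.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only, not the reference implementation.
class UnicastSketch {
    interface Publisher<T> { void subscribe(Subscriber<T> s); }
    interface Subscription { void signalAdditionalDemand(int n); void cancel(); }
    interface Subscriber<T> {
        void onSubscribe(Subscription s);
        void onNext(T t);
        void onError(Throwable t);
        void onCompleted();
    }

    /** Hypothetical publisher: every subscribe call starts a fresh count from 0 to end - 1. */
    static final class RangePublisher implements Publisher<Integer> {
        private final int end;
        RangePublisher(int end) { this.end = end; }

        @Override
        public void subscribe(final Subscriber<Integer> s) {
            // New lifecycle per Subscriber: all state lives in the Subscription.
            s.onSubscribe(new Subscription() {
                private int next = 0;
                private boolean done = false;

                @Override
                public void signalAdditionalDemand(int n) {
                    // Emit at most n items, never more than was requested.
                    for (int i = 0; i < n && !done && next < end; i++) {
                        s.onNext(next++);
                    }
                    if (!done && next == end) {
                        done = true;
                        s.onCompleted();
                    }
                }

                @Override
                public void cancel() { done = true; }
            });
        }
    }

    /** Hypothetical subscriber that records every notification it receives. */
    static final class Recorder implements Subscriber<Integer> {
        final List<String> log = new ArrayList<>();
        Subscription subscription;
        @Override public void onSubscribe(Subscription s) { subscription = s; log.add("subscribed"); }
        @Override public void onNext(Integer t) { log.add("next " + t); }
        @Override public void onError(Throwable t) { log.add("error"); }
        @Override public void onCompleted() { log.add("completed"); }
    }

    static List<String> demo() {
        RangePublisher publisher = new RangePublisher(3);
        Recorder first = new Recorder();
        Recorder second = new Recorder();
        publisher.subscribe(first);
        publisher.subscribe(second);
        first.subscription.signalAdditionalDemand(2);  // first receives two items and then waits
        second.subscription.signalAdditionalDemand(5); // second runs its own stream to completion
        List<String> out = new ArrayList<>(first.log);
        out.add("--");
        out.addAll(second.log);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Each call to `subscribe` starts a separate count, so the second subscriber sees 0 again regardless of how far the first one has progressed.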

benjchristensen commented 10 years ago

Thanks for your feedback @tmontgomery @jbrisbin @mariusaeriksen and @normanmaurer

Definitely there are things to continue debating (naming, etc). If you all agree (including @rkuhn once he's available again), I suggest we merge https://github.com/reactive-streams/reactive-streams/pull/37 and start new issues to discuss the next round of topics that build on top of what we've agreed upon so far.

benjchristensen commented 10 years ago

The pull request (https://github.com/reactive-streams/reactive-streams/pull/37) has progressed as a result of discussion between @rkuhn and me. If any of you have the time, the README is worthy of review and feedback by more than just the two of us.

alexandru commented 10 years ago

"Events sent to a Subscriber can only be sent sequentially (no concurrent notifications).

What does that mean? In an asynchronous context, you cannot guarantee that events won't reach the Subscriber concurrently, unless the Subscriber applies back-pressure by means of subscription.request(1) (e.g. acknowledgement), this being one of the problems that back-pressure is meant to solve. I also thought that this subscription.requestMore(n) as a means of back-pressure was chosen precisely to allow concurrent onNext events, with the subscriber being responsible for synchronization.

"Calls from a Subscriber to Subscription such as Subscription.request(int n) must be dispatched asynchronously (separate thread, event loop, trampoline, etc) so as to not cause a StackOverflow since Subscriber.onNext -> Subscription.request -> Subscriber.onNext can recurse infinitely."

While I understand the need for this, maybe it's the Publisher that should ensure that dispatching the next onNext events following a subscription.request(n) happens asynchronously, because the Subscriber's implementation is more user-facing than the Publisher. Or maybe I don't understand the implications of that, just saying. Either way, this explanation was needed in the description, thanks for adding it.
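One way a Publisher could take on that responsibility without a separate thread is a drain loop guarded by a demand counter: a `request` call that arrives while the loop is already running only records the demand and returns, so the stack never grows. The following is a sketch under assumptions, not what the README prescribes. `CountingSubscription` and `demo` are hypothetical names, the interfaces are local stand-ins that use the `request` naming from the quote above, and the demo runs on a single thread.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only: local stand-ins for the interfaces under discussion.
class ReentrantSafeSketch {
    interface Subscription { void request(int n); void cancel(); }
    interface Subscriber<T> {
        void onSubscribe(Subscription s);
        void onNext(T t);
        void onError(Throwable t);
        void onCompleted();
    }

    /** Hypothetical subscription emitting 0 .. end - 1; request() may be called from inside onNext. */
    static final class CountingSubscription implements Subscription {
        private final Subscriber<Integer> subscriber;
        private final int end;
        private final AtomicLong requested = new AtomicLong();
        private int next = 0;
        private volatile boolean cancelled = false;

        CountingSubscription(Subscriber<Integer> subscriber, int end) {
            this.subscriber = subscriber;
            this.end = end;
        }

        @Override
        public void request(int n) {
            if (n <= 0 || cancelled) return;
            // getAndAdd returns the demand that was outstanding before this call.
            // If it was non-zero, a drain loop is already running (possibly further
            // up this very call stack) and will pick up the new demand.
            if (requested.getAndAdd(n) != 0) return;
            long pending = n;
            while (pending != 0) {
                long emitted = 0;
                while (emitted < pending && next < end && !cancelled) {
                    subscriber.onNext(next++); // may call request() re-entrantly
                    emitted++;
                }
                if (cancelled) return;
                if (next == end) {
                    cancelled = true;
                    subscriber.onCompleted();
                    return;
                }
                // Subtract what was delivered; anything left was requested meanwhile.
                pending = requested.addAndGet(-emitted);
            }
        }

        @Override
        public void cancel() { cancelled = true; }
    }

    /** Runs a subscriber that requests one item at a time from inside onNext. */
    static long[] demo(int total) {
        final long[] stats = new long[2]; // [0] = items received, [1] = 1 once completed
        Subscriber<Integer> oneAtATime = new Subscriber<Integer>() {
            private Subscription subscription;
            @Override public void onSubscribe(Subscription s) { subscription = s; s.request(1); }
            @Override public void onNext(Integer t) { stats[0]++; subscription.request(1); }
            @Override public void onError(Throwable t) { }
            @Override public void onCompleted() { stats[1] = 1; }
        };
        oneAtATime.onSubscribe(new CountingSubscription(oneAtATime, total));
        return stats;
    }

    public static void main(String[] args) {
        long[] stats = demo(1_000_000);
        System.out.println(stats[0] + " items, completed=" + (stats[1] == 1));
    }
}
```

With a naive implementation that emitted directly inside every `request` call, a million nested `onNext -> request -> onNext` frames would be expected to overflow the stack; here the depth stays constant because the nested call returns immediately.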

Subscriber controlled queue bounds

What this section is basically saying is that the Publisher should respect subscription.request(n) and in case the source produces more data than the Subscriber can handle, then the Publisher must decide what to do with it. Is this right?
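If that reading is right, the policy for excess data is a choice each Publisher makes. As an illustration only, the sketch below uses a drop-oldest policy with a fixed-size buffer; nothing in this thread prescribes that policy, and `BoundedBridge`, `offer` and `demo` are hypothetical names. Delivery to the Subscriber is stood in for by appending to a list, and the code is single-threaded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: one possible overflow policy (drop oldest).
class DropOldestSketch {
    /** Hypothetical bridge between a push-style source and demand-driven delivery. */
    static final class BoundedBridge<T> {
        private final ArrayDeque<T> buffer = new ArrayDeque<>();
        private final int capacity;
        private final List<T> delivered = new ArrayList<>(); // stands in for Subscriber.onNext
        private long demand = 0;
        private long dropped = 0;

        BoundedBridge(int capacity) {
            if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
            this.capacity = capacity;
        }

        /** Called by the source whenever it has produced an item. */
        void offer(T item) {
            if (buffer.size() == capacity) {
                buffer.pollFirst(); // buffer full: discard the oldest undelivered item
                dropped++;
            }
            buffer.addLast(item);
            drain();
        }

        /** Called on behalf of the Subscriber, like subscription.request(n). */
        void request(int n) {
            demand += n;
            drain();
        }

        /** Delivers buffered items, but never more than the outstanding demand. */
        private void drain() {
            while (demand > 0 && !buffer.isEmpty()) {
                delivered.add(buffer.pollFirst());
                demand--;
            }
        }

        List<T> delivered() { return delivered; }
        long dropped() { return dropped; }
    }

    static List<Integer> demo() {
        BoundedBridge<Integer> bridge = new BoundedBridge<>(2);
        bridge.request(1);                 // demand for a single item
        for (int i = 1; i <= 5; i++) {
            bridge.offer(i);               // the source outpaces the demand
        }
        bridge.request(3);                 // later demand only sees what survived
        return bridge.delivered();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Other policies mentioned earlier in the thread, such as keeping only the last value, would change only the body of `offer`.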

alexandru commented 10 years ago

Also related to this ... what happens if we have a Subscriber that has an onNext like:

public void onNext(T elem) {
   process(elem);
   dispatchRequestEvent(100);
}

I don't see anything in the contract of the Subscriber that disallows this and things get more complicated if the Publisher is allowed to send onNext events following a subscription.request(n) event synchronously. Granted, we go back to that mention about the Publisher that should send events sequentially, which makes sense in this light. But it would be good for the Subscriber's contract if you could rely on the call to subscription.request(n) to not produce undesired effects ... like stack or buffer overflows, because the Subscriber is more user-facing and maybe the safe handling of the subscription.request(n) event should be the Publisher's responsibility.

TL;DR, I'd like to call subscription.request(n) directly from Subscriber.onNext :-) I.e. code like this makes a lot of sense to me:

public void onNext(T elem) {
   if (isStillValid(elem)) {
      process(elem);
      subscription.request(1); // acknowledgement, next please
   }
   else {
      subscription.cancel(); // no longer interested, stop please
   }
}

benjchristensen commented 10 years ago

The discussion for this is now at https://github.com/reactive-streams/reactive-streams/pull/41 where the README is being revised.

rkuhn commented 10 years ago

@alexandru please refer to #46 for further discussion on the asynchronous semantics of the interfaces.

Since this has been split up into multiple topics, and in the light of @benjchristensen’s latest comment, I therefore close this issue.

reactive-streams / reactive-streams-jvm

Multicast Requirement #19
