reactive-streams / reactive-streams-jvm

Reactive Streams Specification for the JVM
http://www.reactive-streams.org/
MIT No Attribution

Clarification request for Rule 2.7 #405

Closed pavelrappo closed 6 years ago

pavelrappo commented 7 years ago

A Subscriber MUST ensure that all calls on its Subscription take place from the same thread or provide for respective external synchronization.

Given rule 1.3, technically, which thread a subscriber's method is called in is out of this subscriber's control. As far as I understand, Subscription's methods are most often called synchronously from Subscriber's methods.

Thus, either all calls to Subscription should be properly synchronized all the time (which is probably not what we want) or the rule should be rephrased.

(Sorry if it looks like nitpicking.)
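To make the pattern in question concrete, here is a minimal sketch of a Subscriber that calls its Subscription synchronously from inside its own signal methods. It uses the JDK's `java.util.concurrent.Flow` interfaces and `SubmissionPublisher` as a stand-in for the Reactive Streams types; the class name `OneAtATimeSubscriber` and the `run()` helper are hypothetical, chosen only for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: every Subscription call happens inside a signal method,
// so it runs on whichever thread the Publisher used to deliver that signal.
class OneAtATimeSubscriber implements Flow.Subscriber<Integer> {
    final List<Integer> received = new CopyOnWriteArrayList<>();
    final CountDownLatch done = new CountDownLatch(1);
    private Flow.Subscription subscription; // only touched from signal methods

    @Override public void onSubscribe(Flow.Subscription s) {
        subscription = s;
        s.request(1); // runs on the thread that delivered onSubscribe
    }

    @Override public void onNext(Integer item) {
        received.add(item);
        subscription.request(1); // runs on the thread that delivered this onNext
    }

    @Override public void onError(Throwable t) { done.countDown(); }

    @Override public void onComplete() { done.countDown(); }

    // Publishes 1..5 and returns what the subscriber received.
    static List<Integer> run() {
        OneAtATimeSubscriber sub = new OneAtATimeSubscriber();
        try (SubmissionPublisher<Integer> pub = new SubmissionPublisher<>()) {
            pub.subscribe(sub);
            for (int i = 1; i <= 5; i++) pub.submit(i);
        } // close() signals onComplete once buffered items have been delivered
        try {
            if (!sub.done.await(5, TimeUnit.SECONDS)) throw new IllegalStateException("timed out");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sub.received;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The subscriber never picks a thread itself. Its `request` calls are serialized only because the Publisher serializes the signals they are nested in, which is the dependency on rule 1.3 described above.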

rkuhn commented 6 years ago

@thekalinga You are right in that there is nothing forcing the spec to contain rule 2.7, that rule is a design choice (just like all the others). The justification is exactly what you already confirmed, namely that Reactive Streams places the burden of synchronization on the caller of each callback, be that the Subscriber methods or the Subscription methods (cancel is a bit different since it is best effort anyway). With this scheme it becomes practical to implement very lightweight (single-thread) Publishers and Subscribers while still maintaining full compatibility with heavyweight (multi-thread) implementations. The choice is thus more social than technical—just like the origin of the whole standard.

BTW: can you elaborate on the need to implement Subscriber for any non-trivial activity? RxJava, Reactor, Akka Streams et al all contain very powerful operators that allow the recipient of a stream to do quite non-trivial things without ever needing to implement a Subscriber by hand.

thekalinga commented 6 years ago

@rkuhn

Thanks for your response. I agree with you only partially. Let me justify my reasoning (please correct me if I'm wrong in any of my analysis below).

We have rule 1.3, which covers the contract for invocations on Subscriber, and rule 2.7, which covers the contract for invocations on Subscription.

Rule 1.3: onSubscribe, onNext, onError and onComplete signaled to a Subscriber MUST be signaled in a thread-safe manner, and if performed by multiple threads, use external synchronization.

Rule 2.7: A Subscriber MUST ensure that all calls on its Subscription take place from the same thread or provide for respective external synchronization.

My assumption is that the reason rule 1.3 exists is to make life easy for the authors of Subscribers, who are usually not that well versed in multi-threading (people like me) compared to the developers of Publishers & Processors (+ Subscribers), who know what they are doing in a complex multi-threaded environment. With rule 1.3, people like me get a simple guarantee: irrespective of how the upstream is implemented, whatever state changes I made in my Subscriber object by the end of processing the previous downstream signal (from the point of view of the Publisher) will still be seen by the Subscriber at the start of the next downstream signal.

So from my point of view (lay developer) it's a justified development pain (& performance cost due to additional synchronization) for framework developers to implement rule 1.3. They are making my life simple with this.
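As a rough illustration of what "external synchronization" on the Publisher side can look like, the sketch below funnels signals from several producer threads through one lock, so the subscriber can keep plain, unsynchronized state. `SerializedEmitter` and `run()` are hypothetical names. The sketch deliberately ignores demand and never hands out a real Subscription, so it is not a compliant Publisher; it only shows the serialization aspect of rule 1.3.

```java
import java.util.concurrent.Flow;

// Hypothetical helper: serializes onNext/onComplete calls coming from many threads.
// Assumption: demand (request(n)) is ignored here to keep the sketch short.
class SerializedEmitter<T> {
    private final Flow.Subscriber<? super T> downstream;
    private final Object lock = new Object();
    private boolean done; // guarded by lock

    SerializedEmitter(Flow.Subscriber<? super T> downstream) {
        this.downstream = downstream;
    }

    void next(T item) {
        synchronized (lock) { // only one thread signals at a time
            if (!done) downstream.onNext(item);
        }
    }

    void complete() {
        synchronized (lock) {
            if (!done) {
                done = true;
                downstream.onComplete();
            }
        }
    }

    // Four producer threads emit 1000 items each into a subscriber with a plain counter.
    static int run() {
        int[] count = {0}; // deliberately unsynchronized subscriber state
        Flow.Subscriber<Integer> counter = new Flow.Subscriber<>() {
            @Override public void onSubscribe(Flow.Subscription s) { }
            @Override public void onNext(Integer i) { count[0]++; }
            @Override public void onError(Throwable t) { }
            @Override public void onComplete() { }
        };
        SerializedEmitter<Integer> emitter = new SerializedEmitter<>(counter);
        Thread[] producers = new Thread[4];
        for (int p = 0; p < producers.length; p++) {
            producers[p] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) emitter.next(i);
            });
            producers[p].start();
        }
        for (Thread t : producers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        emitter.complete();
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because every `onNext` runs under the same lock, the increments never interleave and each one sees the previous one's result, so `run()` returns 4000 even though the counter itself has no synchronization.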

But rule 2.7 inverts the above responsibility & expects the lay developer to not only worry about the state in his own Subscriber but also about the upstream (Subscription/Publisher) state. So from this point of view, linking Subscriber to Subscription state is not justified (unless there is really sound reasoning, which I have not yet seen)

BTW: can you elaborate on the need to implement Subscriber for any non-trivial activity? RxJava, Reactor, Akka Streams et al all contain very powerful operators that allow the recipient of a stream to do quite non-trivial things without ever needing to implement a Subscriber by hand.

For example, to achieve back pressure in a Subscriber that does a simple background job (let me use Reactor as an example):

I can't use Flux.subscribe(onNext, onError?, onComplete?) as it requests Long.MAX_VALUE. The other alternative is Flux.subscribe(onNext, onError?, onComplete?, onSubscribe?).

But its implementation is a direct pass-through for Subscription calls (AFAIK, rule 2.7 is not taken care of within the framework; I believe this must be for performance reasons).

https://github.com/reactor/reactor-core/blob/master/reactor-core/src/main/java/reactor/core/publisher/Flux.java#L7688-L7696

https://github.com/reactor/reactor-core/blob/master/reactor-core/src/main/java/reactor/core/publisher/LambdaSubscriber.java#L80
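The situation where rule 2.7 actually bites is a Subscriber that hands work to another thread and requests more from there. The following is only a sketch of that case, written against the JDK's `java.util.concurrent.Flow` and `SubmissionPublisher` rather than Reactor; `BackgroundJobSubscriber`, `requestOne()` and `run()` are hypothetical names. All Subscription calls go through one lock, which is one possible reading of "respective external synchronization".

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

// Hypothetical subscriber: processes each item on a worker thread and
// requests the next item only after the background job has finished.
class BackgroundJobSubscriber implements Flow.Subscriber<Integer> {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Object subscriptionLock = new Object();
    final List<Integer> processed = new CopyOnWriteArrayList<>();
    final CountDownLatch done = new CountDownLatch(1);
    private volatile Flow.Subscription subscription;

    // Single entry point for Subscription calls, guarded by one lock.
    private void requestOne() {
        synchronized (subscriptionLock) {
            subscription.request(1);
        }
    }

    @Override public void onSubscribe(Flow.Subscription s) {
        subscription = s;
        requestOne(); // called on the publisher's thread
    }

    @Override public void onNext(Integer item) {
        worker.execute(() -> {
            processed.add(item * 10); // stand-in for the background job
            requestOne();             // called on the worker thread
        });
    }

    @Override public void onError(Throwable t) {
        worker.shutdown();
        done.countDown();
    }

    @Override public void onComplete() {
        worker.execute(done::countDown); // runs after any job still queued
        worker.shutdown();
    }

    // Publishes 1..3 and returns the processed results.
    static List<Integer> run() {
        BackgroundJobSubscriber sub = new BackgroundJobSubscriber();
        try (SubmissionPublisher<Integer> pub = new SubmissionPublisher<>()) {
            pub.subscribe(sub);
            for (int i = 1; i <= 3; i++) pub.submit(i);
        }
        try {
            if (!sub.done.await(5, TimeUnit.SECONDS)) throw new IllegalStateException("timed out");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sub.processed;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With only one item in flight at a time the `request` calls are already sequential in practice, so the lock here mainly documents where the synchronization obligation sits. A subscriber that requested from several workers concurrently would depend on it.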

So if I write my own Subscriber and pass it to Flux.subscribe(Subscriber), Reactor wraps my Subscriber in StrictSubscriber

https://github.com/reactor/reactor-core/blob/master/reactor-core/src/main/java/reactor/core/publisher/Flux.java#L7700

https://github.com/reactor/reactor-core/blob/master/reactor-core/src/main/java/reactor/core/publisher/Operators.java#L1052

If I'm not mistaken, even StrictSubscriber does not automatically take care of rule 2.7 inside the framework (again, I assume for performance reasons)

https://github.com/reactor/reactor-core/blob/5fa95df3f433cf388db65839557cda9dc548b629/reactor-core/src/main/java/reactor/core/publisher/StrictSubscriber.java#L128-L150

So 2.7 forces users like me to worry about extraneous things without fully justifying itself. AFAIK, rule 2.7 exists in its current form only because the upstream Publisher has not added thread-safety guarantees inside its Subscription implementation & needs this guarantee from every Subscriber, i.e. from implementations that are not in its control.

PS: Please note that I have limited understanding of Reactor & I don't know if there are better ways to achieve the same back pressure other than these three methods

rkuhn commented 6 years ago

@thekalinga Thanks for elaborating on your understanding of the roles of Publisher and Subscriber, this is the crucial point. Your desire to question rule 2.7 stems from the view that Subscriber is meant to be implemented by end-users while Publisher (and thus Subscription) is not. This is not correct! Subscriber is intended for the very same audience as Publisher, namely for library authors exclusively.

This basic assumption lies at the bottom of this thread, please confirm that we can close it: Reactive Streams will not be changed to target end users as far as I am aware.

PS: my understanding of Reactor is also limited, in Akka Streams you would perform your extensive background job wrapped in a CompletionStage and wire that into your stream using .mapAsync with parallelism 1.

thekalinga commented 6 years ago

@rkuhn

This is not correct! Subscriber is intended for the very same audience as Publisher, namely for library authors exclusively.

OK. End users like me will struggle with this rule, then.

This basic assumption lies at the bottom of this thread, please confirm that we can close it: Reactive Streams will not be changed to target end users as far as I am aware.

If it's set in stone, it can be closed.

in Akka Streams you would perform your extensive background job wrapped in a CompletionStage and wire that into your stream using .mapAsync with parallelism 1.

I'm completely clueless about Akka. Can you please share pseudocode for Akka? (Maybe it's doable in Reactor in a similar way.)

rkuhn commented 6 years ago

Let’s move the Akka question to Gitter or Discourse.

thekalinga commented 6 years ago

@rkuhn Now that you've mentioned it's only for library developers, can you briefly explain the justification for 1.3 & 2.7 from the point of view of library developers? (My assumptions are obviously incorrect & the rules and their explanations do not say why they should be followed.)

Please note that, for now, I need this information for training people. Maybe in the future, if I get better at working in the multi-threaded world, I might become a contributor to one of the open-source libraries.