reactive-streams / reactive-streams-jvm

Reactive Streams Specification for the JVM
http://www.reactive-streams.org/
MIT No Attribution
4.81k stars 531 forks source link

Second Subscription rule #480

Open olotenko opened 4 years ago

olotenko commented 4 years ago

https://github.com/reactive-streams/reactive-streams-jvm/blob/master/tck/src/main/java/org/reactivestreams/tck/SubscriberWhiteboxVerification.java#L216-L219

This rule and this test make no sense.

Publisher is allowed to observe cancellation "eventually". Publisher is allowed to call onError or onComplete without receiving request. on* methods do not distinguish which Subscripiton it is for. So if the Subscriber receives two onSubscribe, it may still receive various on* that will break the specification. (One Publisher does onComplete another Publisher does onNext)

In other words, there is no defence against non-conformant Publisher, or the systems that permit such a racy subscription to happen.

akarnokd commented 4 years ago

First of all, Rule 1.3 requires serial signaling thus if a Subscriber is reused, the second Publisher is not allowed to signal until the onSubscribe call it issued has returned. Yes, a Publisher can ignore the cancellation and still signal onComplete - there is no defense for that in this setup. Now imagine there was a cancel confirmation happening anyway...

The test is there to verify if the proper cancellation is happening for at least the non-racing scenario and helps detecting certain type of reuse bugs.

Sometimes, you can safely reuse the same Subscriber, for example, in concatMap where the you can reset the internals in onComplete to receive a fresh onSubscribe. For flatMap though, you need individual inner Subscribers.

olotenko commented 4 years ago

I read 1.3 as pertaining to a single Publisher, not all Publisher s past, concurrent, and future. There is no way to enforce anything else, because the state transitions are internalized.

I agree it may be useful to signal cancel. I disagree that all Subscriber must do that extra check. Say, I can prove my inner Subscriber does not escape to the outside world, and is passed to a single Publisher at a time. I see no good reason to require such a Subscriber to fence itself off against non-conformant Publisher s or attempt to detect multiple concurrent invocations of onSubscribe.

akarnokd commented 4 years ago

I read your comments on that flatMap commit and not sure what the problem is.

Is it that the inner subscriber is failing the TCK because of it implements a relaxed ruleset? Just don't test it with the TCK then.

Is it trust issues regarding Flow.Publishers violating the spec? You have to decide to either trust all, distrust all or trust only your own implementations.

All in all the TCK lets you verify conformant behavior in a limited way and I don't think it should be removed just because someone's use case doesn't fit it.

olotenko commented 4 years ago

The inner subscriber can be written without that check. The runtime overhead is low, but the reasoning is harder when there are more things to think about.

The issue is that that rule does not improve Subscriber s life even if it implements it.

akarnokd commented 4 years ago

Then ignore it if you still see value in the other verification methods.

akarnokd commented 4 years ago

Oh, I see why the concern. Some of your project's own code wouldn't pass the TCK.

olotenko commented 4 years ago

There are several things that are done incorrectly in that project. But getting it to a better level requires a sensible set of rules.

akarnokd commented 4 years ago

The rules have been fine, although not optimal, since the release of the spec/api/TCK. Some properties of the spec/TCK can be worked around just as fine.

So what action do you think reactive-streams-jvm should take regarding your issue?

(On a side-note, I'd really like to know, were these free and voluntary contributions or were they written by people paid to be working on the project?)

olotenko commented 4 years ago

Regarding the legal status of the contributions to that other project - you better contact the owners of the project.

I think it would be satisfactory to downgrade the MUST to SHOULD or something. On our side we can pick the tests that we like, or implement the strict check, it is not that expensive, just quirky and not much help.

viktorklang commented 4 years ago

Given that it is illegal by spec to attach the same Subscriber instance twice, that is already a spec violation in itself. Worth remembering is that the spec is between unknown Publisher-Subscriber combinations, if your Publisher implementation can do things differently by checking the runtime type of the Subscriber, then you are free to do as you want, as long as the different behavior cannot be observed when attaching a generic Subscriber.

olotenko commented 4 years ago

It is ok to require that no one expects arbitrary Publisher s and arbitrary Subscriber s to be reusable. Requiring that all Subscriber s disallow reuse (under the pain of being branded as the spec violators), is backwards.

viktorklang commented 4 years ago

@olotenko Publishers are reusable by spec. Regarding Subscribers, if you know that a Subscriber is reusable, and you can clearly tell when it is safe to be reused, then you are of course free to do so. This is the same situation as with Iterators, only when knowing strictly more than simple Iterator is it safe to call next once hasNext has returned false. (for instance, you might have some reset method on it)

olotenko commented 4 years ago

Sure. But you see the point, right? That the spec says "for all", whereas it should say "exists".

rkuhn commented 4 years ago

I can see what you mean when expanding the scope to all uses of these interfaces. The intended scope, as documented in the preamble, is to define behavior for the interaction of Publishers and Subscribers across boundaries, both async and organizational; the main goal is interoperability. This goal is only achieved by saying “all conforming implementations do X”, otherwise one implementor cannot rely on the behavior to be displayed by another implementation when plugged together by a third party.

olotenko commented 4 years ago

The flaw in reasoning is that "all conforming implementations do X" does not mean every "do" applies to every "X".

Eg if you reword the requirement as "all conforming Publisher s must handle the case of Subscriber cancelling the Subscription when more than one onSubscribe is invoked", it becomes moot, because conforming Publisher s don't do more than one onSuscribe. Also, "exists" covers this case perfectly well ("there exist Subscriber s that will cancel Subscription upon receipt of the second onSubscribe" is enough to enforce conformant behaviour in conformant Publisher)

Instead, the spec is defining a required handling of non-conforming Publisher s by all Subscriber s.

viktorklang commented 4 years ago

@olotenko The reason for the symmetric check is to have a better chance of detecting violations on both sides of the fence (Publisher or Subscriber). If we were to rely on everything always doing the right thing, no TCK would ever be needed, because everyone would do what is expected of them, at all time. My experience in the software industry tells me that bugs and misinterpretation is commonplace, and therefor the TCK tries to do its best to avoid both of them. At least that's the rationale behind my reasoning.

olotenko commented 4 years ago

@viktorklang :) Subscription.cancel() is in the future from the invocation of onSubscribe. How far in the future? Can it be after onSubscribe returns? Can I never return from the second onSubscribe? This quickly spins out of the plane of reasonable conformant implementations.

viktorklang commented 4 years ago

@olotenko From the perspective of the Subscriber, it is already cancelled. So I'm not sure how that matters? A generic Subscriber should never receive two onSubscribe calls.

olotenko commented 4 years ago

@viktorklang :) in the context of cancelling the second Subscription? Of course, it matters! That's the whole point of my argument!

Say, normally a Subscriber has only one Subscription. When the Subscription is cancelled, all Subscription s are cancelled (because there is only one).

Abnormally the Subscriber receives the second Subscription. Even if it gets cancelled, not all Subscription s are cancelled. The Subscriber does not enter the same state of "I'll ignore everything now", and there is no telling on behalf of which Subscription any future on* events arrive - so you can't ignore selectively.

There is no defence against misbehaving Publisher or misconstructed pipeline.

Add to that the conforming Publisher is required to signal onError when it receives a subscribe call that it cannot serve. So, it supports only one Subscriber, the broken pipeline subscribe s a second Subscriber, it is required to call onSubscribe by the specification of subscribe, and then call onError.

DougLea commented 4 years ago

Back to the original question. @olotenko points out that the tck test is too strict. The spec implies "eventually", but the check is immediate. This is a classic testing issue; the usual compromise is to wait (sleep) long enough that "eventually" is sure to have happened in the testing context. Here, maybe a few seconds.

olotenko commented 4 years ago

I imply more than that. Not calling request guarantees a conformant Publisher that happens to be in the mix will not call spurious onNext. Calling cancel does not guarantee anything even from a conformant Publisher. Why require calling cancel?

viktorklang commented 4 years ago

@DougLea Adding some sort of polling cycle until a timeout is definitely possible.