3.x: Split the library into two or adding types for symmetry

JakeWharton commented 8 years ago

It may be far too late in the release cycle for this, but in writing an outline for a presentation on RxJava 2 for the last month I've come to think we're erroneously shipping two libraries as one.

RxJava 2 currently has two nearly disjoint pieces:

Flowable and FlowableProcessor are a Reactive Streams Publisher and Processor stream type, respectively, which use RS types for subscribing and RS types to control backpressure and unsubscribe notification.
Observable, Maybe, Single, Completable, and Subject are custom stream types which use custom types for subscribing, do not have backpressure, and use a custom type for unsubscribe notification.

Aside from explicit conversion functions between the two, these types do not interact. Observable will soon be retrofitted to return more-specific types for certain operators. Flowable will also receive (and already has) some of these more-specific return types for certain operators, but unlike their enclosing type they do not support backpressure.

When you look at a high-level overview of the library like I have before for normalizing naming, you can clearly see there is a divide.

This divide seems to be rooted in the fact that there's three things pulling RxJava 2 in different directions:

Backpressure support was added late to RxJava 1, which resulted in not all operators implementing it. The built-in factories made it harder than it should have been to create backpressure-aware observables around non-Rx sources. This caused MissingBackpressureExceptions (MBEs) for a lot of people and a desire for non-backpressure types.
The RS spec is to be implemented natively for backpressure-aware types.
People use and enjoy the four specialized RxJava 1 types: Observable, Single, Completable, and Subject, and want even more: Maybe.

The first two of these are conflicting which is not entirely terrible. If you ignore the third item you'd get four types which we have now: Flowable, Observable, FlowableProcessor, Subject. The third item, however, starts to cause the combinatorial explosion of types.

This is what leads me to believe there are two libraries hiding inside RxJava 2 that, while related, aren't the same:

Flowable, Maybe-like, Single-like, Completable-like, and FlowableProcessor Reactive Streams Publisher types.
Observable, Maybe, Single, Completable, and Subject generic non-backpressure stream types.

As far as I can tell there's three options:

Do nothing. Ship 5 backpressure-free custom types and 2 backpressure-aware RS types in one library with built-in conversion methods across the divide. Live with the fact that the types are asymmetrical and some operators on Flowable do not support backpressure.
Split the library into two. They could either live inside this repository (rxjava-rs and rxjava-nbp) plus an adapter library (rxjava-bridge) for use with to(), or they could be separate and developed/released independently. This doesn't immediately require extra types on the RS side for symmetry. They could be added as needed post-2.0.0.
Add missing types for symmetry between the two inside this one library. This might cause @akarnokd to go crazy because it's non-trivial and few others are qualified and skilled enough to do all the work. Because of the desire to customize the return types of operators it's hard to defer this to post-2.0.0 since it would break compatibility.

I'm curious to hear what others think about this. I'm sorry it didn't dawn on me as a problem sooner.

A final piece of food for thought: if you were implementing RxJava 2 as a brand new library from scratch without the historical context of RxJava 1, what would you want in it?

Stream types that implement the Reactive Streams spec.
Stream types that do not support backpressure.
Customized stream types which model subsets of the event/notification lifecycle.
Operators returning customized stream types with the correct backpressure-aware/free context.

Ideally what we ship in RxJava 2 is exactly and only the answers to that question.

smaldini commented 8 years ago

But wouldn't that be Reactor 3.0.2.RELEASE in the end? Or reactor-core-java6, to be precise?

I don't mind much - We would love to support/host such effort if necessary as we do for RSC. I understand the concern and that was one of the original ideas poured into Reactor 2+ to provide the Publisher-based alternative to Rx (while not being married to Rx everywhere and focusing more on constructs to support Publisher).

It seems quite late in the release plan for such dramatic changes, but since we have the chance to be quite close technically and people-wise, I thought this was an opportunity to discuss it. Besides, we provide prime support for RxJava 1/2 as well, without an extra bridge dependency.

JakeWharton commented 8 years ago

No, as this project supports Java 6 for Android. If Reactor was Java 6 I'd be perfectly happy with RxJava being only NBP types.

edit: Ah, I see you wrote more after the initial email. Frankly it'd be great to not be duplicating effort between the two libraries. The Java 8 requirement of Reactor is at present an insurmountable hurdle for Android developers. That said I think you're right that it would be irresponsible to completely ignore Reactor's contribution when thinking about any changes we might make here.

smaldini commented 8 years ago

Thanks Jake, we just pursue completeness vs. overlap IMO. The Java version is a really annoying thing and I even wonder if we should backport reactor-core 3 to JDK 6, e.g. in https://github.com/reactor/reactor-core-jdk6. You could just let RxJava live with the same constructs as RxJava 1 while being massively improved, eventually removing Flowable & Processor from the scope (not sure about Maybe).

I'm not familiar enough with the use of Rx in the Android world to make much of a case for RS there. On the server side we have a few important use cases, including the so-called microservice architecture and starvation at higher concurrent CPU counts. So it seems to me that Java 6 based Android wouldn't make much use of an RS-based implementation yet, given the hardware it would likely run on and the use case of consumer-only apps.

JakeWharton commented 8 years ago

With regards to the Java 6 backport, it needs to be treated like a separate library (i.e., separate maven coordinates and no conflicting package names). This means that if it were used by a library and a consumer pulled in both the Java 8 version and the Java 6 version on the same classpath it wouldn't break. This was brought up as part of the discussion that led to RxJava 2.x having Java 6 support.

I don't recommend that Reactor backports to Java 6. This seems like it would only result in even more duplicated effort between the two libraries.

Instead, we could add an RS-implementing Mono (or whatever name people want) to RxJava 2. This would allow most narrowing operators on Flowable (like last()) to return backpressure-aware instances since it can stand in for Completable and Maybe and mostly Single.

The lack of symmetry between the backpressure-aware and backpressure-free types would still bother me a ton, but at least there's less of a compromise.

JakeWharton commented 8 years ago

And if we want to reduce cognitive overhead of adding a new type, Completable could be removed in favor of Observable<Void> since that is now a guaranteed zero-element stream with null events being forbidden.

(As far as symmetry is concerned that still leaves out a backpressure-aware Maybe equivalent, though.)

smaldini commented 8 years ago

Adding symmetry at this stage and removing Completable would just go toward having a Flux and a Mono at some point. In that case I don't mind supporting a Java 6 backport with whatever package prefix; it's not that difficult a task and I've started local changes. Without removing Completable, it's slightly more noisy, but it will still be difficult for library implementors (e.g. Spring exposing an RxJava API) to actually support that many types. There is a risk that subtle differences between Observable and Flowable, or Maybe and MaybeFlowable, will remove some user-friendliness in the library by adding that choice.

Eventually it seems the best outcome would just be to not include Flowable in the lib since java 6 android environment will unlikely make a case for RS backpressure.

JakeWharton commented 8 years ago

Dropping Flowable (and the RS spec with it) certainly makes RxJava 2 look more logical in its offerings, but it would be weird to have RxJava participate in defining the RS spec only to abandon it. I'd also be curious what the Netflix stakeholders (current and past) think about this issue, and if they plan to use RxJava 2 or Reactor 3 on the server-side.

All I know is that if I came to RxJava 2 and reactive programming in general for the first time, I would wonder why there was only one RS stream type (ignoring processors) and four backpressure-free types, let alone why operators applied to backpressure-aware types may return backpressure-free types. RxJava and reactive programming is a hard enough mental model to become proficient in, and asymmetry will only add to that. The architecture and API improvements are massive wins in 2.x, but I don't want this asymmetry to become a thing that misses the boat and has to wait until a hypothetical 3.x.

smaldini commented 8 years ago

Keep in mind that one of my arguments was also why a library would include two duplicate APIs. Adding symmetry is just going to reinforce that. Also, what's the limit then in terms of jar size and the other concerns you have regarding Android?

If I may: regardless of it being weird to not have Publisher implementations, a massive majority of the user base you support is on Android. Looking from this perspective, not having Flowable/FlowableProcessor etc. brings to the lib:

No extra reactive-streams.jar dependency
No confusion over the semantics or use case, especially in Android world.
Reduced JAR size / focused API scope
RxJS and Rx.NET don't have such a concept anyway
Will probably be more efficient even than Reactor

It's a difficult decision and we come across many of them in the Spring team too. On the server side not only Reactor and Rx are available, but also plain Akka Streams, Guava, RSC itself etc. In the Spring ecosystem (from data to security) we are using Reactor 3 and adapt to Rx if the type signatures ask for it, using the adapters which support cross-library optimizations and fusion thanks to @akarnokd.

Akka Streams also uses the RS contract only at its boundaries, not at every stage. You could envision a bridge strategy (not providing Flowable but toPublisher extractors, eventually using an optional/provided dependency on RS). This way you could even envision an adapter to Java 9 as well.

Alternatively you maintain the status quo and just polish the lib as it is right now, with no more types or duplication.

davidmoten commented 8 years ago

Eventually it seems the best outcome would just be to not include Flowable in the lib since java 6 android environment will unlikely make a case for RS backpressure.

I'm an infrequent visitor to the Android development experience but I still think there is a case for RS backpressure there. CPU-intensive tasks (that could cause buffer bloat) may be less commonly implemented on Android, but RS backpressure is still useful for limiting IO operations. An example might be zipping two asynchronous streams together that both perform IO operations but one returns much quicker. Without backpressure the quick service is run many more times than needed and its results are buffered. So we get memory space problems and cost the platform battery power making unnecessary network calls over a mobile network. This is your space @JakeWharton, what do you think?

smaldini commented 8 years ago

@davidmoten Unfortunately the RS implementation doesn't come for free either and has a more complex code path. So it translates directly into being less efficient than Observable overall. That's why some people also advocate for no BP handling at all, which sticks with mechanical sympathy principles. Such IO scenarios from a consumer perspective are more akin to scheduling, and I'm pretty sure you would prefer less buffering overhead in that respect, or explicit buffering such as cache()/replay(), rather than risking your data in memory for an undefined period. I'm not sure what would qualify as CPU-intensive tasks on Android (on Java 6 era hardware at least). These phones have at most 2-4 CPUs plus limited memory, and mainly translate the backend into a UI experience in the microservice world. If you had to combine many IO calls together (usually HTTP, so Single-like flows) you would benefit anyway from smooth deferring via the on-demand subscribe calls at the merging point.

akarnokd commented 8 years ago

With the 5 base types RxJava 2.x became bloated - many seem to want their own custom-tailored reactive type/library. Flowable is quite capable of handling all features of the other 4 and I never liked the "let's capture the cardinality in the base type" attitude (this is why I don't like reactor.core.publisher.Mono either). For a small user convenience mostly, I have to work 5 times over.

I'm also leaning towards splitting RxJava 2.x into two - one library for the non-backpressured components and one for the backpressured Flowable. Note however that there are shared components between the two: Disposables, Schedulers, functional interfaces that have to go into one and get referenced from the other or have them as an external library.

Otherwise, the split would add development overhead because most things would still have to be done 5 times over but now at least in 2 projects with double the GitHub time. In addition, what should be the main RxJava library; Observable and co or Flowable and co?

(Other complications involve appealing to ReactiveX for a new project, setting up a separate auto-release, migrating issues, redirecting people, and re-educating people about the split, to name a few.)

LalitMaganti commented 8 years ago

Forgive me if the rest of my post misses something obvious but here we go: So I've been thinking about the vast amount of duplication between the base types as well as the near identical nature of Flowable and Observable apart from the vital difference of backpressure. This got me thinking a bit more about the nature of backpressure and specifically toFlowable/onBackpressureXXX operators.

Backpressure only occurs when a thread switch happens. This is obvious but crucial. If we're on the same thread, data can inherently only be received as fast as it is processed, so there is no need for explicit backpressuring. Keeping this in mind, the question arises: why are we allowed to call these methods anywhere in the chain? It does not seem to make much sense if you think about it. While it can be powerful to switch between Observable/Flowable or apply backpressure wherever we want, it adds cognitive overhead as well.

Therefore, in the spirit of the dramatic changes being proposed in this thread, I ask for consideration of removing the concept of backpressure from everywhere except thread-switch operators (subscribeOn/observeOn/timer etc.); essentially, this applies to any operator which takes a scheduler. An idea might be to introduce the concept of a Scheduler/BackpressureMode pair (a kind of intrinsically backpressured scheduler) which handles the backpressure for the thread switch.

Moreover, on the issue of the combinatorial explosion of operators: @akarnokd would it not be possible to combine the Single/Maybe/Observable observers into one Observer type? That is, the operator's observer would implement all three of these interfaces at the same time. I feel this should cut down on the amount of duplication, as many of the methods can be reused inside these classes.

akarnokd commented 8 years ago

@tilal6991 You mixed up a few concepts.

Backpressure is an architectural property and there are operators that don't really care about backpressure (such as map) and there are others who care (observeOn, zip, flatMap, etc.). For example, if you have range(1, 1_000_000).map(v -> 2 * v) you don't really need backpressure but if you have range(1, 1_000_000).map(v -> 2 * v).observeOn(Schedulers.computation()) the flow control has to reach range so it doesn't generate all 1M elements and fill observeOn's buffer. Going from non-backpressured to backpressured requires handling the overflow and thus switching between the two modes is expensive and/or lossy.
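To make the range/observeOn point concrete, here is a minimal plain-Java sketch of demand-driven emission. It is not RxJava code: `DemandSketch`, `RangeSource`, and `peakBufferSize` are hypothetical names invented for this illustration. The batch size stands in for a bounded buffer such as the one observeOn keeps, and requesting everything at once stands in for a consumer that applies no flow control.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.IntConsumer;

/** Hypothetical sketch (not RxJava): a range source that emits only what was requested. */
final class DemandSketch {

    /** Emits start, start + 1, ... but never more than the requested amount per call. */
    static final class RangeSource {
        private int next;
        private final int end; // exclusive
        private final IntConsumer downstream;

        RangeSource(int start, int count, IntConsumer downstream) {
            this.next = start;
            this.end = start + count;
            this.downstream = downstream;
        }

        /** Rough analogue of Subscription.request(n): produce at most n more items. */
        void request(int n) {
            for (int i = 0; i < n && next < end; i++) {
                downstream.accept(next++);
            }
        }
    }

    /** Returns the largest number of items that ever sat in the consumer's buffer. */
    static int peakBufferSize(int total, int batch) {
        Queue<Integer> buffer = new ArrayDeque<>();
        int[] peak = {0};
        RangeSource source = new RangeSource(1, total, v -> {
            buffer.add(2 * v); // the map(v -> 2 * v) step
            peak[0] = Math.max(peak[0], buffer.size());
        });
        while (true) {
            source.request(batch);       // flow control reaches the source
            if (buffer.isEmpty()) {
                break;                   // nothing was emitted: the range is exhausted
            }
            while (!buffer.isEmpty()) {
                buffer.poll();           // the consumer drains before asking for more
            }
        }
        return peak[0];
    }

    public static void main(String[] args) {
        System.out.println("bounded requests:  " + peakBufferSize(1_000_000, 128));
        System.out.println("unbounded request: " + peakBufferSize(1_000_000, 1_000_000));
    }
}
```

With batched requests the buffer never holds more than one batch; when the whole range is requested up front, all 1M mapped values sit in the buffer at once, which is the situation the flow control is meant to prevent.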

combine Single/Maybe/Observable observers into one Observer type?

People are too fond of onSuccess on one hand, and there are event-overlap problems in the protocol between Single-Maybe-Completable on the other. Besides, Observable is multivalued and is incompatible with the concepts of the previous 3 and can't work inside operators. Combining them all is only reasonable for some very end-consumer with disregard for overhead, such as TestObserver, which already implements all observer types at once.

The ultimate solution comes when Java introduces extension methods similar to Kotlin/C# and you can have as many types and methods in as many libraries as you want, yet still be able to work together on a common set of base interfaces (a la Reactive-Streams).

LalitMaganti commented 8 years ago

Also I think I misused the word "backpressure". Flow control is probably a more appropriate term and you're right that what I proposed would not work in all cases.

In any case, I do feel the current situation of simply copying tens of operators 5 times is not sustainable. Some sort of solution needs to be worked out there :/

zsxwing commented 8 years ago

For splitting into 2 projects, please also take into account other language adapters (Scala, Groovy, Clojure and Kotlin). Do they need to be split as well?

LalitMaganti commented 8 years ago

I have another crazy-sounding idea, which again please ignore if I've missed something obvious: to fix the duplication problem, what if Single, Maybe and Completable inherit from Observable? Each one of them is essentially a specialization of Observable, right? This would allow for deduplication of code between the four, as you could just write the operator for Observable and have the other implementations for free.

JakeWharton commented 8 years ago

I have prototyped this and it requires at least two drastic changes:

making operator methods non-final to allow overriding for usage of covariant return types to specialize in subclasses
renaming of any combination operator name to avoid erasure collisions (e.g., flatMapObservable, flatMapSingle)

The huge upside is that you can treat any Single like an Observable which reduces the need for a lot of the specialized operators while still keeping only-once semantics where needed.
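The two changes listed above can be sketched in plain Java. This is only an illustration under simplifying assumptions: `Obs` and `Sng` are hypothetical stand-ins for Observable and Single that merely hold values in a list. They are neither the prototype mentioned here nor RxJava API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of the inheritance idea; Obs and Sng are stand-ins, not RxJava types. */
final class CovariantSketch {

    /** Stand-in for a 0..N stream, reduced to a list of values to stay short. */
    static class Obs<T> {
        final List<T> items;

        Obs(List<T> items) {
            this.items = items;
        }

        // Non-final so that a subclass can override it and narrow the return type.
        <R> Obs<R> map(Function<? super T, ? extends R> f) {
            List<R> out = new ArrayList<>();
            for (T t : items) {
                out.add(f.apply(t));
            }
            return new Obs<>(out);
        }

        // If this and Sng.flatMapSingle were both named flatMap, they would not compile:
        // after erasure both take a raw Function, so the names have to differ.
        <R> Obs<R> flatMapObservable(Function<? super T, ? extends Obs<R>> f) {
            List<R> out = new ArrayList<>();
            for (T t : items) {
                out.addAll(f.apply(t).items);
            }
            return new Obs<>(out);
        }
    }

    /** Stand-in for an exactly-one-value stream that is also usable as an Obs. */
    static class Sng<T> extends Obs<T> {
        Sng(T item) {
            super(Collections.singletonList(item));
        }

        @Override
        <R> Sng<R> map(Function<? super T, ? extends R> f) { // covariant return type
            return new Sng<>(f.apply(items.get(0)));
        }

        <R> Sng<R> flatMapSingle(Function<? super T, ? extends Sng<R>> f) {
            return f.apply(items.get(0));
        }
    }

    public static void main(String[] args) {
        Sng<Integer> doubled = new Sng<>(21).map(v -> 2 * v); // still statically a Sng
        Obs<Integer> asObs = doubled;                         // accepted wherever an Obs is expected
        System.out.println(doubled.items + " " + asObs.map(v -> v + 1).items);
    }
}
```

Because `map` is overridden, calling it through an `Obs` reference still produces a `Sng` at runtime, which is how the only-once semantics can survive where needed.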

JakeWharton commented 8 years ago

More importantly, that is a completely orthogonal proposal that isn't related to the larger issue here. If you want to propose the use of inheritance, please do so in a separate issue.

LalitMaganti commented 8 years ago

Ah yes, apologies. It started off as a response to @akarnokd's last comment but is getting off topic at this point.

akarnokd commented 8 years ago

Single, Maybe and Completable inherit from Observable; making operator methods non-final to allow overriding for usage of covariant return types to specialize in subclasses

Many operations don't make sense outside the 0..N Observable

sounds like Reactor's Mono without backpressure

yes, you can reuse operator internals from Observable but lose the optimization potential knowing that only one of the onNext, onError or onComplete is ever called

won't work with the current architecture and would still require duplicate reactive base-class shells around: ObservableMap, SingleMap.

non-final methods may lead to megamorphic dispatch

no more onSuccess, has to call onNext followed by onComplete for Single and Maybe.

LalitMaganti commented 8 years ago

Please see https://github.com/ReactiveX/RxJava/issues/4584 for a continuation of discussion of merging of functionality across the NBP types.

imperatorx commented 8 years ago

Just a user suggestion:

All operators of Flowable should return Flowable to stay in a pure flow-controlled world and avoid confusion
Other non-backpressured types (perhaps separated to a different maven artifact) should have static factory methods for converting from Publisher/Flowable

akarnokd commented 8 years ago

All operators of Flowable should return Flowable to stay in a pure flow-controlled world and avoid confusion.

I'm getting similar feedback from RS experts. It's technically simple as the original code is still there. I'm not sure about the stakeholders such as @abersnaze's original use cases but the conversion is possible because the upstream can be consumed in an unbounded fashion and may not need backpressure at all.

abersnaze commented 8 years ago

I don't think of Single/Maybe/Completable as not being with either Flowable or Observable. It all started because there wasn't a way for APIs that return Observables to express how many items it would onNext.

The dual of "All operators of Flowable should return Flowable" is "All operators on a List should return a List". Would it make sense for List.size() to return List<Integer> rather than just an int?

ScottPierce commented 8 years ago

I was about to file a similar issue before I found this thread.

I'm coming at this entirely from the perspective of Android. RxJava has become very popular on the Android platform. I see more and more resumes with RxJava experience listed, and I think this library matters a great deal to the Android community.

When developing on Android, APK size and method count matter. There are many libraries my team and I would like to use, but we simply don't because they are too large or their method count overhead is too high. The benefits of keeping APK size low are fairly obvious. What I'm slightly more concerned with in RxJava 2.0 is its method count. Android apps can have a maximum of 65536 methods before having to use something called Multi-dex (https://developer.android.com/studio/build/multidex.html#mdex-pre-l), which can dramatically impact compile time and app start-up time (especially on pre-5.0 devices).

My concern is that RxJava's method count has slowly grown over time, and RxJava 2.0.0 is another significant jump in method count and overall size:

RxJava Version	Jar Size	Method Count
1.0.0	674 KB	3,339
1.2.1	1,152 KB	5,581
2.0.0-RC3	1,894 KB	8,931

I compiled a very basic Android App with almost no application code. All it does is include the latest appcompat-v7 library (pretty standard for any Android app), and the specified version of RxJava.

RxJava Version	APK Size	Method Count
none	1,367 KB	16,937
1.0.0	1,573 KB	20,276
1.2.1	1,739 KB	22,518
2.0.0-rc3	1,936 KB	25,868

With RxJava 2.0.0 imposing an almost 9k method overhead, I do think that it's gotten to a place where splitting up the library into multiple parts would be beneficial. There are portions of the library I don't think my team would use at all, and being able to pick and choose what parts of the library we include in our app would help the android community to still use RxJava without feeling like there is a tradeoff.
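The overhead figures follow directly from the second table: subtracting the baseline APK's method count from each build gives the per-version overhead, which can then be compared with the 65536 limit. A small sketch of that arithmetic (the class and method names are hypothetical; the numbers are the ones quoted in the tables above):

```java
/** Hypothetical helper that reproduces the method-count arithmetic from the tables. */
final class MethodCountSketch {
    static final int DEX_METHOD_LIMIT = 65_536; // single-dex limit quoted in the comment
    static final int BASELINE = 16_937;         // APK with appcompat-v7 and no RxJava

    /** Methods added by RxJava: APK method count minus the baseline APK's count. */
    static int overhead(int apkMethodCount) {
        return apkMethodCount - BASELINE;
    }

    /** Share of the single-dex limit, in percent. */
    static double shareOfLimit(int methods) {
        return 100.0 * methods / DEX_METHOD_LIMIT;
    }

    public static void main(String[] args) {
        String[] versions = {"1.0.0", "1.2.1", "2.0.0-RC3"};
        int[] apkCounts = {20_276, 22_518, 25_868};
        for (int i = 0; i < versions.length; i++) {
            int added = overhead(apkCounts[i]);
            System.out.printf("%s: +%d methods (%.1f%% of the limit)%n",
                    versions[i], added, shareOfLimit(added));
        }
    }
}
```

The differences come out to 3,339, 5,581 and 8,931 methods, matching the jar method counts in the first table; 8,931 is roughly 13.6% of the limit.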

In a perfect world, my team would probably only include Observable and Single in our applications. If Completable had fewer than 1,000 methods of overhead we'd consider it, but otherwise we'd just get by using Single<Boolean> as we have in the past. We haven't found a use for backpressure yet, so we wouldn't consider Flowable unless we discovered some really handy means of using it.

JakeWharton commented 8 years ago

RxJava ProGuards really well. I'm not concerned about method count as a supporting argument for any of my suggestions in this issue.

ScottPierce commented 8 years ago

I'm not sure I agree with that. People don't want to start using ProGuard just to upgrade to RxJava 2.0. ProGuard is a pain. I think your experience here (https://twitter.com/jakewharton/status/664628805244448769) pretty much sums up my own.

Given that Android is a supported platform, and also that RxJava 2.0 uses up almost 1/7th the available methods, I think method count and app size should be a factor in decisions to split up the library. That's just my 2 cents. :)

JakeWharton commented 8 years ago

API symmetry and correctness trump method count concerns every time, especially for a foundational library like this.

ScottPierce commented 8 years ago

I 100% agree. I don't necessarily think that API symmetry / correctness and solving method count issues on Android are mutually exclusive though.

I do think that perhaps I have interjected a separate topic into your proposal though. I'll watch for the results of this, and bring up method count as a separate issue.

digitalbuddha commented 8 years ago

I agree with Jake: method count should not be factored in when deciding how to proceed. I believe it's reasonable to expect those who care about method count to use ProGuard. It's the same stance that Guava takes. I don't believe that a 3k method increase between versions means that everyone has to start using ProGuard on upgrade, any more than anyone was forced to use it in minor upgrades of 1.x. Besides, the latest Play Services release added 5k methods (for all the Firebase stuff), which IMO made it more necessary for apps to use ProGuard than this does.

akarnokd commented 8 years ago

This is definitely not happening for 2.x so I moved it to 3.x.

If we had extension methods in Java, splitting the library along base types would be possible, but without them the cross-dependencies (like Observable.singleOrError()) require Single in its entirety (returning rich interfaces doesn't work as you'd need to implement all operators locally to avoid the dependency).

It is possible to split into 4 and dedicate Flowable to streaming-only scenarios (i.e., Flowable.single returns Flowable), with Observable/Single/Maybe/Completable living together. The shared Disposable+Function+Scheduler components go into one module and a Flowable-Observable interop module helps with the cross conversion.

akarnokd commented 7 years ago

I've implemented the remaining reactive types in RxJava 2 Extensions.

akarnokd commented 7 years ago

See https://github.com/akarnokd/RxJava3-preview for the split library.

akarnokd commented 5 years ago

I decided not to split the library and not to introduce backpressure-enabled 0..1 types in RxJava. Those who need such types can use Project Reactor's Mono type or the RxJava Extensions Project's types.

ReactiveX / RxJava

3.x: Split the library into two or adding types for symmetry #4564