EventStore / EventStoreDB-Client-Java

Official Asynchronous Java 8+ Client Library for EventStoreDB 20.6+
https://eventstore.com
Apache License 2.0
63 stars 20 forks source link

Implementation of reactive streaming API for read operations. #106

Closed dpasek-senacor closed 2 years ago

dpasek-senacor commented 2 years ago

This would a suggestion on implementing the streaming of read operations based on the reactive streams API supporting full backpressure and out-of-the-box compatibility with reative frameworks like RxJava, Project Reactor and JDK 9 Flow.

This solution consists of the following parts:

The major building block is definetly the ReadSubscription class which acts as the adapter between the gRPC StreamObserver-API and the Publisher / Subscriber contract. This implementation is currently based on thread synchronisation between the gRPC reader thread used for triggering the StreamObserver and the Subscribers consumer thread which can be any thread including the main thread.

oskardudycz commented 2 years ago

I reviewed @dpasek-senacor PR. It's excellent work. Thank you for the enhanced comments guiding about the changes.

I'll start with a disclaimer: I'm haven't worked on a Java project for a few years, so my perspective may be skewed. I have a general concern about the PR. I'm all for enabling the reactive approach and giving the option to enumerate through the long streams. I know that the RX approach is popular in the Java land, but I'm not sure how much that's a standard. I wonder if we could provide an API or a bit of syntactic sugar, making the default scenario so reading the whole stream more accessible? Based on my experience, most people want to read the entire stream and do left-fold to aggregate the state. I know that we could use EventCollectorReadSubscriber, but the user always needs to add a few additional lines. I'm not sure how much that's doable, but maybe we could come up with the API that's not an RX-first and enable accessibility on both scenarios?

Thoughts?

dpasek-senacor commented 2 years ago

Hi @oskardudycz

Thanks for the review and your thoughts about the approach.

I'm all for enabling the reactive approach and giving the option to enumerate through the long streams. I know that the RX approach is popular in the Java land, but I'm not sure how much that's a standard.

The approach using the reactive stream API should be the most standard way currently possible. It is not an RX approach, i.e. not tied to any special implementation of reactive streams but uses the API that allows interoperability with the market standards: RxJava and Project Reactor. It is also API compatible to the new JDK 9 Flow API which is the new official standard for consuming streams with backpressure. As discussed with @YoEight internally we would propose not to use Flow-API directly as this would break JDK8 support. This would be a bit to early as many projects are still tied to JDK 8. In 2 years it would be easy to migrate to JDK9 Flow-API by just replacing imports without changing any code. Last but not least this API is the closest to the API used in other languages, e.g. .net (using IAsyncEnumerable<ResolvedEvent> as return type) or JavaScript (using StreamingRead<AllStreamResolvedEvent>)

I wonder if we could provide an API or a bit of syntactic sugar, making the default scenario so reading the whole stream more accessible? Based on my experience, most people want to read the entire stream and do left-fold to aggregate the state. I know that we could use EventCollectorReadSubscriber, but the user always needs to add a few additional lines. I'm not sure how much that's doable, but maybe we could come up with the API that's not an RX-first and enable accessibility on both scenarios?

I think that the given approach provides us the flexibility and feature set and give us the options to make the live easier for certain use cases:

So yes we should consider adding methods which provide this "read into list" support as currently available, but use the reactive foundation behind it. This would also be beneficial in regard of keeping the next release of the driver backward compatible to the 1.0 release. I'm a huge fan of semantic versioning and following this approach the current change would require to release it as "breaking change", i.e. a major version. Keeping the old method signatures and adding the new ones for streaming would prevent this.

oskardudycz commented 2 years ago

@dpasek-senacor, I agree that for the long streams proposed, API is a great way to deal with them (especially regarding efficient memory usage and backpressure you mentioned).

My main take was that typically, you should try to avoid such cases, where you have really long streams. Of course, sometimes it's unavoidable, so it should be possible to do that effectively. We should try to provide the change non-breaking, so adding streams methods keeps the old one. After that, we could try to gather feedback from the community if they're okay with removing the support for the old ones.

dpasek-senacor commented 2 years ago

@YoEight @oskardudycz I have now updated the access to streams via reactive streams to be an additional option in the API next to the existing classic read methods. Internally it uses the reactive infrastracture. I have also reverted the samples to reference the previous implementation.