pokt-network / poktroll

The official Shannon upgrade implementation of the Pocket Network Protocol implemented using the Cosmos SDK
MIT License

[Off-chain] Implement `merge` observable operator #280

Open h5law opened 8 months ago

h5law commented 8 months ago

Objective

Implement the ability to merge two or more observables into a single observable, so that a single observer can receive events from multiple sources.
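For discussion, here is a minimal sketch of what such an operator could look like over plain Go channels. It is not the poktroll observable package's API: `Merge`, `emit`, `mergedEvents`, and the event strings are all hypothetical names chosen for illustration, and the sketch assumes each source signals completion by closing its channel.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
)

// Merge forwards every value received on any source channel to one output
// channel. The output is closed once all sources are closed or ctx is done.
// Hypothetical helper for illustration; not the poktroll observable API.
func Merge[T any](ctx context.Context, sources ...<-chan T) <-chan T {
	out := make(chan T)
	var wg sync.WaitGroup
	wg.Add(len(sources))

	for _, src := range sources {
		go func(src <-chan T) {
			defer wg.Done()
			for {
				select {
				case <-ctx.Done():
					return
				case v, ok := <-src:
					if !ok {
						return // this source is exhausted
					}
					select {
					case out <- v:
					case <-ctx.Done():
						return
					}
				}
			}
		}(src)
	}

	// Close the merged channel only after every forwarder has returned.
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// emit returns a channel that yields the given values and then closes.
func emit(values ...string) <-chan string {
	ch := make(chan string)
	go func() {
		defer close(ch)
		for _, v := range values {
			ch <- v
		}
	}()
	return ch
}

// mergedEvents merges two illustrative sources and returns everything
// received. The result is sorted because the interleaving across sources
// is not deterministic.
func mergedEvents() []string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	delegations := emit("delegate:1", "delegate:2")
	undelegations := emit("undelegate:1")

	var got []string
	for ev := range Merge(ctx, delegations, undelegations) {
		got = append(got, ev)
	}
	sort.Strings(got)
	return got
}

func main() {
	fmt.Println(mergedEvents())
}
```

Ordering is preserved within each source but not across sources, which matches the usual semantics of a merge operator. A real implementation would also need to decide how errors and unsubscription propagate.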

Origin Document

#239 surfaced the need to subscribe to multiple sources, as a refactor of the overly verbose DelegationClient subscription strategy:

    // TODO_HACK(@h5law): Instead of listening to all events and doing a verbose
    // filter, we should subscribe to both MsgDelegateToGateway and MsgUndelegateFromGateway
    // messages directly, and filter those for the EventRedelegation event types.
    // This would save the delegation client from listening to a lot of unnecessary
    // events, that it filters out.
    // NB: This is not currently possible because the observer pattern does not
    // support multiplexing multiple observables into a single observable, that
    // can supply the EventsReplayClient with both the MsgDelegateToGateway and
    // MsgUndelegateFromGateway events.

Goals

Deliverables

Non-goals / Non-deliverables

General deliverables


Creator: @h5law Co-Owners: @bryanchriswhite

bryanchriswhite commented 8 months ago

"Merge" is what this operator is usually called. See the reactivex docs about it.


Olshansk commented 8 months ago

@bryanchriswhite When we started the observable package, we discussed how it resembles a pubsub library with fan-out and fan-in mechanisms, but we needed something specific.

The growing complexity of the observable package is making me think that it's looking more and more like a pubsub (i.e. multiple sources publish to a shared channel and you subscribe on specific events).

I wanted to ask you if you're confident that we're still on the right path implementing this ourselves rather than swapping it out for another solution (e.g. nats).

h5law commented 8 months ago

@Olshansk I'd say it's more like we are subscribing to different events and publishing them to a single notifee. I know this makes little difference, but I'm in favour of doing a short research spike with @bryanchriswhite before implementing this, to check whether we have outgrown our original ideas for this repo.

bryanchriswhite commented 8 months ago

> @bryanchriswhite When we started the observable package, we discussed how it resembles a pubsub library with fan-out and fan-in mechanisms, but we needed something specific.
>
> The growing complexity of the observable package is making me think that it's looking more and more like a pubsub (i.e. multiple sources publish to a shared channel and you subscribe on specific events).

@Olshansk I firmly disagree with this characterization. I think the similarity you're describing is surface-level only. There is no subscribing to specific events; rather, a consumer subscribes to an entire observable for the purpose of handling asynchronous data streams. Observables are not intrinsically related to message-oriented middleware, which is what I believe you're describing (and of which NATS is an example).

> I wanted to ask you if you're confident that we're still on the right path implementing this ourselves rather than swapping it out for another solution (e.g. nats).

I am still very confident that this is the optimal path with respect to readability, maintainability, and extensibility. While observables are fairly abstract, their power simplifies the implementation of other high-level objects by encapsulating the complexity of handling asynchronous data streams within observables and operators. Additionally, the observer pattern is well understood and widespread across many programming language ecosystems (ReactiveX is one example of a specification with many language implementations). Nothing we build in this library should surprise anyone who is familiar with these concepts.

I asked ChatGPT for help in articulating the differences between observables and pubsub, and these are the bits that seemed especially relevant to this discussion:

Conceptual Basis:

  • Observable Implementation: This is typically a fundamental part of reactive programming. Observables represent a stream of data or events to which observers can subscribe. The observable emits notifications of changes, and subscribers react to these changes. It’s a pattern that focuses on asynchronous data streams and propagation of change.
  • Pub-Sub Libraries (NATS, Gossipsub): These are messaging systems designed for distributed systems communication. They follow the publish-subscribe pattern, where publishers send messages without knowledge of subscribers, and subscribers receive messages without knowledge of publishers. This pattern is often used in event-driven architectures and for decoupling services in a microservices architecture.

Use Cases:

  • Observable Implementation: Ideal for handling real-time updates, UI events, data binding, and other scenarios where reacting to data/event streams is required.
  • Pub-Sub Libraries: Suited for building distributed systems, microservices architectures, and event-driven systems where decoupling of components and scalable communication is essential.

Scenarios Where Observables and Pub-Sub Libraries Might be Mutually Exclusive:

  1. Internal Application State vs. Inter-Service Communication:
     • Observables: Best for managing internal application state, real-time updates within a single application, or handling UI events.
     • Pub-Sub Libraries: More suited for communication between different services or components in a distributed system.
  2. Single-Process Applications:
     • In a scenario where the entire application logic resides within a single process, using a pub-sub library for internal event handling might be overkill. Observables alone would suffice for handling asynchronous events and data streams.

bryanchriswhite commented 8 months ago

I did another round of searching, and these are the only options I could find that seemed potentially production-grade:

I discovered this blog post discussing the differences between using the observer pattern and vanilla channels in Go. The conclusion summarizes it nicely:

Both the classic Observer Pattern and Golang channels provide efficient ways to implement the Observer Pattern. The classic Observer Pattern using interfaces provides more flexibility and a clear separation between the subject and its observers, while using channels provides a lightweight and concurrent way to notify observers of changes in the subject. Ultimately, the choice between the two approaches depends on the specific requirements of the application, and the trade-offs between flexibility, ease of implementation, and concurrency.

In our case we get the benefits of both, as we're implementing observer-pattern interfaces using data structures that lightly wrap Go channels.
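To make that concrete, here is a rough sketch of the general shape: observer-pattern interfaces whose implementation is a thin layer over channels. Every name in it (`Observer`, `Observable`, `ChannelObservable`, `Publish`, `observeTwice`) and the buffer size are assumptions made for illustration and are not taken from the actual package.

```go
package main

import (
	"fmt"
	"sync"
)

// Observer exposes received values as a read-only channel.
type Observer[T any] interface {
	Ch() <-chan T
	Unsubscribe()
}

// Observable hands out a new Observer per subscription.
type Observable[T any] interface {
	Subscribe() Observer[T]
}

// ChannelObservable is a hypothetical implementation that fans each
// published value out to one buffered channel per observer.
type ChannelObservable[T any] struct {
	mu        sync.Mutex
	observers map[*channelObserver[T]]struct{}
}

type channelObserver[T any] struct {
	ch     chan T
	parent *ChannelObservable[T]
}

// Compile-time check that the implementation satisfies the interface.
var _ Observable[int] = (*ChannelObservable[int])(nil)

func NewObservable[T any]() *ChannelObservable[T] {
	return &ChannelObservable[T]{observers: make(map[*channelObserver[T]]struct{})}
}

func (s *ChannelObservable[T]) Subscribe() Observer[T] {
	s.mu.Lock()
	defer s.mu.Unlock()
	obs := &channelObserver[T]{ch: make(chan T, 8), parent: s}
	s.observers[obs] = struct{}{}
	return obs
}

// Publish sends v to every current observer. The send blocks if an
// observer's buffer is full; a production version would need an explicit
// policy for slow consumers.
func (s *ChannelObservable[T]) Publish(v T) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for obs := range s.observers {
		obs.ch <- v
	}
}

func (o *channelObserver[T]) Ch() <-chan T { return o.ch }

// Unsubscribe removes the observer and closes its channel exactly once.
// Values already buffered can still be drained by the consumer.
func (o *channelObserver[T]) Unsubscribe() {
	o.parent.mu.Lock()
	defer o.parent.mu.Unlock()
	if _, ok := o.parent.observers[o]; ok {
		delete(o.parent.observers, o)
		close(o.ch)
	}
}

// observeTwice publishes 1..3 to two observers and returns what each saw.
func observeTwice() (first, second []int) {
	src := NewObservable[int]()
	a, b := src.Subscribe(), src.Subscribe()
	for i := 1; i <= 3; i++ {
		src.Publish(i)
	}
	a.Unsubscribe()
	b.Unsubscribe()
	for v := range a.Ch() {
		first = append(first, v)
	}
	for v := range b.Ch() {
		second = append(second, v)
	}
	return first, second
}

func main() {
	fmt.Println(observeTwice())
}
```

Consumers interact only with the interfaces, while the channel stays an implementation detail, which is where the flexibility of the interface approach and the concurrency of channels combine.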

Olshansk commented 8 months ago

@bryanchriswhite I really appreciate you continuing to push and providing all of the following:

My personal summary/takeaway is:

If anything, this gives me more confidence that (eventually) breaking out the Observer package into a separate library will be a real public good.