lablup / callosum

An RPC Transport Library for asyncio
MIT License

Refactor streaming transports and channels #7

Open achimnol opened 4 years ago

achimnol commented 4 years ago

Now channels and transports are coupled together.

Currently we have the following modules:

One of the problems is that Redis' STREAM APIs are too complicated for our use cases. We just need two communication patterns, as in Backend.AI v19.09's event bus: fan-out broadcast and shared pipeline queue (without grouping and acks; topic-based subscription routing can simply be done in Python instead of Redis). Let's leave detailed grouped message streaming as future work. Also, the current proposal does not take "ack" into account, but we could add it later with the Redis STREAM API.
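To illustrate what "topic-based subscription routing in Python" could look like, here is a minimal sketch. `TopicRouter`, `subscribe`, and `dispatch` are hypothetical names made up for this example and are not part of Callosum; the shell-style wildcard matching is also just an assumption about how topics might be matched.

```python
from collections import defaultdict
from fnmatch import fnmatchcase
from typing import Any, Callable, Dict, List

Handler = Callable[[str, Any], None]


class TopicRouter:
    """Hypothetical in-process router; NOT part of Callosum's API."""

    def __init__(self) -> None:
        # Maps a topic pattern to the handlers subscribed to it.
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, pattern: str, handler: Handler) -> None:
        # Patterns use shell-style wildcards, e.g. "kernel.*" (an assumption).
        self._handlers[pattern].append(handler)

    def dispatch(self, topic: str, payload: Any) -> int:
        # Call every handler whose pattern matches the topic.
        # Returns the number of handlers that were invoked.
        delivered = 0
        for pattern, handlers in self._handlers.items():
            if fnmatchcase(topic, pattern):
                for handler in handlers:
                    handler(topic, payload)
                    delivered += 1
        return delivered
```

With something like this, the transport only has to deliver `(topic, payload)` pairs, and the receiving side decides which handlers run.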

This should be refactored into:

And the stream module should be compatible with:

We also need to let the "lower" modules provide a factory function that returns the recommended combination of transports for different types of channels.
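A rough sketch of such a factory function, assuming a "lower" module exposes a single entry point. Every name here (`ChannelKind`, `TransportPair`, `recommended_transports`, and the transport class names inside the table) is a placeholder invented for illustration, not an existing Callosum identifier.

```python
from dataclasses import dataclass
from enum import Enum


class ChannelKind(Enum):
    # Hypothetical channel types, following the patterns discussed above.
    RPC = "rpc"
    BROADCAST = "broadcast"
    PIPELINE = "pipeline"


@dataclass(frozen=True)
class TransportPair:
    # Names of the transport classes for the two sides of a channel.
    binder: str
    connector: str


# Placeholder table that a lower module (e.g. a Redis one) might define.
_RECOMMENDED = {
    ChannelKind.RPC: TransportPair("RPCBinder", "RPCConnector"),
    ChannelKind.BROADCAST: TransportPair("PubSubBinder", "PubSubConnector"),
    ChannelKind.PIPELINE: TransportPair("QueueBinder", "QueueConnector"),
}


def recommended_transports(kind: ChannelKind) -> TransportPair:
    """Return the transport combination this lower module recommends."""
    try:
        return _RECOMMENDED[kind]
    except KeyError:
        raise ValueError(f"unsupported channel kind: {kind!r}") from None
```

The channel layer would then call `recommended_transports(ChannelKind.BROADCAST)` instead of hard-coding which transports go with which channel.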

achimnol commented 4 years ago

@temirrr If you want to escape from your OS assignments, you may help me here. :wink:

temirrr commented 4 years ago

@achimnol ok, I see. I am currently quite busy, but I will look into it. If I start working on it, is there any particular task you would want me to help with or is it up to me?

achimnol commented 4 years ago

@temirrr Since the event bus in Backend.AI v19.09 is implemented on its own (without Callosum) due to the product release schedule, this part has no strict deadline for now. Feel free to take your time, and you may choose which part to work on first.

achimnol commented 4 years ago

The key design of the new event bus in Backend.AI v19.09 was to have both the fan-out broadcasting pattern and the shared pipeline for a single event. For instance, since user-facing event streaming APIs may be processed by an arbitrary set of manager instances (nodes and processes) as we deploy and horizontally scale the manager instances for HA, the container lifecycle events must be broadcast to all manager instances (fan-out broadcast). At the same time, a subset of the container lifecycle events that trigger database updates must be delivered to and handled by only one arbitrary manager instance (shared pipeline queue).
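The two delivery semantics can be sketched with plain `asyncio` queues. This is an in-memory stand-in only, not the Backend.AI or Callosum implementation; `InMemoryEventBus`, its methods, and the manager IDs are hypothetical names for illustration.

```python
import asyncio
from typing import Dict, List, Optional, Tuple


class InMemoryEventBus:
    """Hypothetical stand-in for a Redis-backed event bus."""

    def __init__(self) -> None:
        self._inboxes: Dict[str, asyncio.Queue] = {}   # one per manager
        self._shared: asyncio.Queue = asyncio.Queue()  # one for all managers

    def register(self, manager_id: str) -> None:
        self._inboxes[manager_id] = asyncio.Queue()

    async def broadcast(self, event: str) -> None:
        # Fan-out: every registered manager gets its own copy.
        for inbox in self._inboxes.values():
            await inbox.put(event)

    async def enqueue(self, event: str) -> None:
        # Shared pipeline: a single copy that managers compete for.
        await self._shared.put(event)

    async def next_broadcast(self, manager_id: str) -> str:
        return await self._inboxes[manager_id].get()

    async def next_shared(self) -> str:
        return await self._shared.get()


async def demo() -> Tuple[Dict[str, str], List[str]]:
    bus = InMemoryEventBus()
    managers = ["m1", "m2", "m3"]
    for m in managers:
        bus.register(m)

    # A single lifecycle event goes through both patterns.
    await bus.broadcast("kernel_terminated")
    await bus.enqueue("kernel_terminated")

    # Every manager sees the broadcast copy.
    seen = {m: await bus.next_broadcast(m) for m in managers}

    async def try_handle(manager_id: str) -> Optional[str]:
        # Each manager tries to take the shared event; the rest time out.
        try:
            await asyncio.wait_for(bus.next_shared(), timeout=0.05)
        except asyncio.TimeoutError:
            return None
        return manager_id

    results = await asyncio.gather(*(try_handle(m) for m in managers))
    winners = [m for m in results if m is not None]
    return seen, winners
```

Running `asyncio.run(demo())` gives every manager the broadcast copy, while only one manager obtains the shared-queue copy, which is the property the database-updating handlers rely on.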

In the previous implementation of Redis-based communication channels in Callosum, we considered only the latter case, and the design got complicated because we were going to share the same transports with RPC and stuck to Redis' STREAM API. Let's simplify the design by explicitly separating the connection/connector/binder/transports for specific communication patterns (aka channels in Callosum).

achimnol commented 4 years ago

Another technical reason to implement Backend.AI's event bus without Callosum is that its communication pattern is not bipartitioned: both the managers and agents generate events, while those events are processed only by managers. Using Redis, it could be possible to perform non-bipartitioned communication with Callosum, but I did not have enough time to test and polish the Redis lower transports in Callosum.