Open axsaucedo opened 3 years ago
I think both options can probably be combined.
From my point of view, `asyncio` support is a must. Even without streaming, it's probably needed to optimise pipelines (where I imagine most of the time would be spent in IO, waiting for the model's answer). For the streaming scenario, `asyncio` can be leveraged to optimise the waiting time for Kafka to come back.
We could think about making it optional though. That is, let people define either `async` or "classic" methods, and let Tempo decide how to call a custom function (i.e. with `await` or synchronously). This is similar to how FastAPI handles this, without forcing the user to write fully `async` code.
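To make the "let Tempo decide" idea concrete, here is a minimal sketch of such a dispatcher. It is not Tempo's actual API: `call_user_function`, `classic_predict` and `async_predict` are hypothetical names used only for illustration. The sketch assumes Python 3.9+ (for `asyncio.to_thread`) and pushes blocking functions onto a worker thread, which is broadly how FastAPI treats plain `def` endpoints.

```python
import asyncio
import inspect
from typing import Any, Callable, List


async def call_user_function(func: Callable[..., Any], *args: Any) -> Any:
    """Hypothetical dispatcher: await coroutine functions, offload classic ones."""
    if inspect.iscoroutinefunction(func):
        # The user wrote `async def`, so it can be awaited directly.
        return await func(*args)
    # The user wrote a classic function: run it in a worker thread so a
    # blocking call does not stall the event loop.
    return await asyncio.to_thread(func, *args)


def classic_predict(x: int) -> int:
    # Stands in for user code that blocks (e.g. a synchronous HTTP client).
    return x * 2


async def async_predict(x: int) -> int:
    # Stands in for a non-blocking call to a remote model.
    await asyncio.sleep(0)
    return x + 1


async def main() -> List[int]:
    return [
        await call_user_function(classic_predict, 3),
        await call_user_function(async_predict, 3),
    ]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With something along these lines, users would not need to change existing synchronous code, and could opt in to `async def` only where it pays off.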
Currently Tempo has clients that talk to remote models. To introduce support for stream processing, for example through Kafka, we would need to extend the current interfaces to support asynchronous processing.
Models would be defined the same way
Option 1: AsyncIO
Each instance sent to a pipeline is processed as a single sequential flow: every model in the pipeline is called in turn, using AsyncIO to release control while waiting, and the next step only starts once the respective model has "returned" its output to the `output_topic`.
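A rough sketch of how Option 1 could look is shown below. Everything in it is an assumption made for illustration: `model_a` and `model_b` stand in for remote models, `asyncio.sleep` stands in for waiting on the model's output, and `run_pipeline` / `process_batch` are hypothetical helpers, not Tempo functions.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence

Model = Callable[[float], Awaitable[float]]


async def model_a(x: float) -> float:
    # Stand-in for a remote model; the sleep mimics waiting for its output.
    await asyncio.sleep(0.01)
    return x + 1


async def model_b(x: float) -> float:
    await asyncio.sleep(0.01)
    return x * 10


async def run_pipeline(x: float, models: Sequence[Model]) -> float:
    # Within one instance the steps are strictly sequential: each model
    # must return before the next one is called.
    for model in models:
        x = await model(x)
    return x


async def process_batch(inputs: Sequence[float]) -> List[float]:
    # Across instances, awaiting releases control, so several pipelines
    # can be waiting on their models at the same time.
    return await asyncio.gather(
        *(run_pipeline(x, [model_a, model_b]) for x in inputs)
    )


if __name__ == "__main__":
    print(asyncio.run(process_batch([1.0, 2.0])))
```

The pipeline code still reads like ordinary sequential Python, which is the main appeal of this option; the concurrency only comes from running many instances side by side.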
Advantages
Disadvantages
Option 2: Fully Asynchronous Streams
The second option follows the traditional, fully independent stream processing approach, where each data instance is processed one step at a time as it moves through the streaming pipeline, with every step handled independently.
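For comparison, Option 2 might be sketched as follows. This is only an illustration under stated assumptions: `asyncio.Queue` objects stand in for Kafka topics, and `stage`, `add_one`, `times_ten`, `run_stream` and the `STOP` sentinel are hypothetical names, not part of Tempo or any Kafka client.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence

STOP = None  # Sentinel marking the end of the stream (illustration only).


async def stage(
    func: Callable[[float], Awaitable[float]],
    in_topic: asyncio.Queue,
    out_topic: asyncio.Queue,
) -> None:
    # Each stage only knows its input and output topic; it has no view
    # of the pipeline as a whole.
    while True:
        item = await in_topic.get()
        if item is STOP:
            await out_topic.put(STOP)  # Propagate shutdown downstream.
            return
        await out_topic.put(await func(item))


async def add_one(x: float) -> float:
    await asyncio.sleep(0.01)  # Stand-in for a remote model call.
    return x + 1


async def times_ten(x: float) -> float:
    await asyncio.sleep(0.01)
    return x * 10


async def run_stream(inputs: Sequence[float]) -> List[float]:
    # In-memory queues play the role of Kafka topics in this sketch.
    topic_in: asyncio.Queue = asyncio.Queue()
    topic_mid: asyncio.Queue = asyncio.Queue()
    topic_out: asyncio.Queue = asyncio.Queue()
    workers = [
        asyncio.create_task(stage(add_one, topic_in, topic_mid)),
        asyncio.create_task(stage(times_ten, topic_mid, topic_out)),
    ]
    for x in inputs:
        await topic_in.put(x)
    await topic_in.put(STOP)

    results: List[float] = []
    while (item := await topic_out.get()) is not STOP:
        results.append(item)
    await asyncio.gather(*workers)
    return results


if __name__ == "__main__":
    print(asyncio.run(run_stream([1.0, 2.0])))
```

Here no single coroutine follows an instance from start to finish. Each stage consumes and produces independently, so the second stage can be working on one instance while the first stage has already moved on to the next.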
Advantages
Disadvantages