jina-ai / jina

☁️ Build multimodal AI applications with cloud-native stack
https://docs.jina.ai
Apache License 2.0
20.63k stars, 2.21k forks

Bi-directional Streaming #6060

Closed NarekA closed 2 months ago

NarekA commented 9 months ago

Describe the feature

As I understand it, the current API only supports response streaming. Is there a way to support bi-directional streaming? I imagine it would look like the current streaming API, except the input would be a generator. This would be useful for applications such as chat-bots.

Your proposal

# then define the Executor
class MyExecutor(Executor):

    @requests(on='/hello')
    async def task(self, docs: Generator[MyDocument], **kwargs) -> MyDocument:
        for doc in docs:
            yield MyDocument(text=f'{doc.text} output')

JoanFM commented 9 months ago

The Client already behaves as a bidirectional stream. I do not think there is much need for this feature.

NarekA commented 9 months ago

@JoanFM 2 questions:

  1. Is this true for the stream_doc method only or the post method too?
  2. In the example above, the server can save context. Example:

    # then define the Executor
    class MyExecutor(Executor):

        @requests(on='/hello')
        async def task(self, docs: Generator[MyDocument], **kwargs) -> MyDocument:
            conversation = ""
            for doc in docs:
                conversation += f"\n\n{doc.text}"
                yield MyDocument(text=conversation)

The benefit of this is not having to send the previous messages in a long conversation and not having to reload state. Is there a way to do that in the current API? Maybe using executor state?
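The idea of keeping context on the server for the lifetime of a single stream can be sketched without Jina at all. The following is a minimal illustration in plain `asyncio`, not Jina's API; `Message`, `chat_handler`, `client_messages`, and `collect_replies` are hypothetical names chosen for this example.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass
class Message:
    text: str


async def chat_handler(messages: AsyncIterator[Message]) -> AsyncIterator[Message]:
    """Yield one reply per incoming message, keeping context in a local variable."""
    conversation = ""  # lives only as long as this stream is open
    async for msg in messages:
        conversation += f"\n\n{msg.text}"
        yield Message(text=conversation)


async def client_messages() -> AsyncIterator[Message]:
    """Stand-in for a client that sends messages one at a time."""
    for text in ("hello", "how are you?"):
        yield Message(text=text)
        await asyncio.sleep(0)  # give control back to the event loop between sends


async def collect_replies() -> List[str]:
    """Drive the handler with the client stream and gather the reply texts."""
    return [reply.text async for reply in chat_handler(client_messages())]


replies = asyncio.run(collect_replies())
```

Each reply contains the whole conversation so far, even though the client only ever sends the newest message. The context disappears when the stream ends, which is why this approach depends on a connection that stays open.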

NarekA commented 9 months ago

I'm reading through the code, and as I understand it, ~all endpoints~ can be streaming if gRPC or WebSocket is enabled. Is this correct? The docs imply that streaming endpoints need to have a single document as the input.

Edit: I think I was wrong about it being "all endpoints"

JoanFM commented 9 months ago

Okay, so your point is that you need to keep the context in the stack, making sure it arrives at the same replica?

NarekA commented 9 months ago

Yes, similar to the example in the gRPC docs for bi-directional streaming:

def RouteChat(self, request_iterator, context):
    prev_notes = []
    for new_note in request_iterator:
        for prev_note in prev_notes:
            if prev_note.location == new_note.location:
                yield prev_note
        prev_notes.append(new_note)

JoanFM commented 9 months ago

I think it may be possible with gRPC, but I am not sure whether HTTP offers a way to receive a stream as input.

JoanFM commented 9 months ago

Otherwise, to get this state in, you could make use of a Stateful Executor, but that may feel like overkill.
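One way to read this alternative is that the server keeps per-conversation state between separate requests, keyed by an identifier the client sends each time. The sketch below is plain Python and does not use Jina's Stateful Executor API; `ConversationStore` and `session_id` are hypothetical names. It only works if every request for a given session reaches the same instance, which is the replica concern raised above.

```python
from collections import defaultdict
from typing import DefaultDict, List


class ConversationStore:
    """Keeps the message history of each session in memory."""

    def __init__(self) -> None:
        self._history: DefaultDict[str, List[str]] = defaultdict(list)

    def handle(self, session_id: str, text: str) -> str:
        """Record the new message and return the full conversation for that session."""
        self._history[session_id].append(text)
        return "\n\n".join(self._history[session_id])


store = ConversationStore()
store.handle("alice", "hello")
store.handle("bob", "hi")  # a second session does not mix with the first
reply = store.handle("alice", "how are you?")
```

Compared with holding context inside an open stream, this keeps each request independent, at the cost of managing the state's lifetime and routing explicitly.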

NarekA commented 9 months ago

We could have the endpoint behave as a normal endpoint (the same as if the generator were a DocList) when the server uses HTTP or WebSocket.

Also, it would be useful to know how to stream bi-directionally with the current configuration (i.e., does it only work with streaming endpoints, or do normal endpoints keep the connection open too?).

JoanFM commented 9 months ago

I will look at it in more detail and come back

JoanFM commented 9 months ago

I have been thinking about this, and it seems interesting.

Here are my thoughts:

All of this is nice, but I need to confirm that the same behavior is achieved in the Kubernetes world, where the Jina gateway's job as load balancer is taken over by Linkerd.

jina-bot commented 6 months ago

@jina-ai/product This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days

jina-bot commented 2 months ago

@jina-ai/product This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days