jina-ai / jina

☁️ Build multimodal AI applications with cloud-native stack
https://docs.jina.ai
Apache License 2.0

Streaming Endpoint Issues with Gateways #6020

Closed chenxiaolong-coder closed 1 year ago

chenxiaolong-coder commented 1 year ago

Describe the bug

I found that the gateway caches the data until it reaches a certain size before sending it to the client, which can cause the typewriter effect to fail.

Describe how you solve it

Set the include_gateway parameter to False, like this:

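from jina import Deployment
# XunFeiLLM is the reporter's own Executor class; its definition is not shown in this issue.
dep = Deployment(uses=XunFeiLLM, timeout_ready=-1, protocol='http', port=80, cors=True,
                 reload=True, timeout_send=-1, include_gateway=False)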

Environment

[Screenshot (WeCom): 企业微信截图_16912955319141]

With the include_gateway parameter set to True, the behavior is as follows:

[Screenshot (WeCom): 企业微信截图_1691295660590]
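
A rough way to tell the two behaviors apart without screenshots is to timestamp each chunk as the client receives it. The sketch below is illustrative only: `arrival_stats` is a hypothetical helper, not part of Jina, and the timestamps in the usage lines are made up rather than measured.

```python
from typing import List, Tuple


def arrival_stats(timestamps: List[float], start: float = 0.0) -> Tuple[float, float]:
    """Return (time to first chunk, largest gap between consecutive chunks).

    `timestamps` are client-side receive times in seconds, in arrival order;
    `start` is the time at which the request was sent.
    """
    if not timestamps:
        raise ValueError("no chunks were received")
    first = timestamps[0] - start
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return first, max(gaps, default=0.0)


# Made-up numbers: a streamed response arrives steadily, while a buffered
# response arrives late and all at once.
streamed = arrival_stats([0.5, 1.0, 1.5, 2.0])
buffered = arrival_stats([2.0, 2.0, 2.0, 2.0])
```

A long time to first chunk followed by near-zero gaps suggests that something between the Executor and the client is holding the data back.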

JoanFM commented 1 year ago

Hello,

Can you explain what the typewriter effect is?

In this case, the gateway is just a load balancer before all the replicas of the Deployment, so no caching or logic is applied there.

What is the expected behavior here?

JoanFM commented 1 year ago

If you do not need the gateway because you do not want replication, you can disable the load balancer by setting include_gateway=False.

chenxiaolong-coder commented 1 year ago

Looking at the two screenshots above, the expectation is the behavior shown in the first one: the data is presented to the client in real time as a stream, rather than being cached up to a certain size and only then sent to the client.
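
The difference between the two delivery modes can be sketched with plain Python generators. This is not Jina's gateway code, and it makes no claim about what the gateway actually does internally. `relay_streaming`, `relay_buffered` and the `min_size` threshold are hypothetical names used only to illustrate the behavior described in this comment.

```python
from typing import Iterable, Iterator, List


def relay_streaming(chunks: Iterable[str]) -> Iterator[str]:
    """Forward every chunk to the client as soon as it arrives."""
    for chunk in chunks:
        yield chunk


def relay_buffered(chunks: Iterable[str], min_size: int) -> Iterator[str]:
    """Hold chunks back until at least `min_size` characters have accumulated."""
    pending: List[str] = []
    pending_size = 0
    for chunk in chunks:
        pending.append(chunk)
        pending_size += len(chunk)
        if pending_size >= min_size:
            yield "".join(pending)
            pending, pending_size = [], 0
    if pending:  # flush whatever is left once the upstream closes
        yield "".join(pending)


tokens = ["Hel", "lo", ", ", "wor", "ld", "!"]
streamed = list(relay_streaming(tokens))             # six small pieces
buffered = list(relay_buffered(tokens, min_size=8))  # two larger pieces
```

Both relays deliver the same text in the end. Only the streaming one hands the client each piece as it is produced, which is what the typewriter effect depends on.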

JoanFM commented 1 year ago

Okay, I will take a look once I am back from holiday.

Part of the team is out and we will look at it ASAP.

For now, if you do not need replication, you can disable the gateway by passing include_gateway=False.

JoanFM commented 1 year ago

Hey @chenxiaolong-coder,

this has been solved in the latest patch release.

https://github.com/jina-ai/jina/releases/tag/v3.20.1

Please give it a try.
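
Before retesting, it may help to confirm that the installed version is at least the release linked above. The sketch below is a minimal check: `parse_version` and `has_streaming_fix` are hypothetical helpers, and it assumes a plain dotted numeric version string and that releases after 3.20.1 keep the fix.

```python
from typing import Tuple

FIXED_IN = (3, 20, 1)  # the patch release linked above


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '3.20.1' into (3, 20, 1); assumes a plain dotted numeric version."""
    return tuple(int(part) for part in version.split("."))


def has_streaming_fix(version: str) -> bool:
    """Compare numerically, so that '3.9.5' is correctly older than '3.20.1'."""
    return parse_version(version) >= FIXED_IN
```

The version string could be taken from `jina.__version__` in the environment where the Deployment runs.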