Streaming Endpoint Issues with Gateways

chenxiaolong-coder commented 1 year ago

Describe the bug

I found that the gateway caches the data up to a certain size before sending it to the client, which can cause the typewriter effect to fail.

Describe how you solve it

Set the parameter include_gateway to False, like this:

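from jina import Deployment
# XunFeiLLM is the reporter's own Executor class; its definition is not shown in this issue.
dep = Deployment(uses=XunFeiLLM, timeout_ready=-1, protocol='http', port=80, cors=True,
                 reload=True, timeout_send=-1, include_gateway=False)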

Environment

jina 3.20.0
docarray 0.36.0
jcloud 0.2.12
jina-hubble-sdk 0.39.0
jina-proto 0.1.27
protobuf 4.23.4
proto-backend upb
grpcio 1.47.5
pyyaml 6.0.1
python 3.10.0
platform Windows
platform-release 10
platform-version 10.0.22621
architecture AMD64
processor Intel64 Family 6 Model 151 Stepping 2, GenuineIntel
uid 123870401841410
session-id 0e41f950-3500-11ee-b0dc-70a8d34cc902
uptime 2023-08-07T16:54:44.628100
ci-vendor (unset)
internal False
JINA_DEFAULT_HOST (unset)
JINA_DEFAULT_TIMEOUT_CTRL (unset)
JINA_DEPLOYMENT_NAME (unset)
JINA_DISABLE_UVLOOP (unset)
JINA_EARLY_STOP (unset)
JINA_FULL_CLI (unset)
JINA_GATEWAY_IMAGE (unset)
JINA_GRPC_RECV_BYTES (unset)
JINA_GRPC_SEND_BYTES (unset)
JINA_HUB_NO_IMAGE_REBUILD (unset)
JINA_LOG_CONFIG (unset)
JINA_LOG_LEVEL (unset)
JINA_LOG_NO_COLOR (unset)
JINA_MP_START_METHOD (unset)
JINA_OPTOUT_TELEMETRY (unset)
JINA_RANDOM_PORT_MAX (unset)
JINA_RANDOM_PORT_MIN (unset)
JINA_LOCKS_ROOT (unset)
JINA_K8S_ACCESS_MODES (unset)
JINA_K8S_STORAGE_CLASS_NAME (unset)
JINA_K8S_STORAGE_CAPACITY (unset)
JINA_STREAMER_ARGS (unset)

Screenshots

A Wireshark packet capture with the parameter include_gateway set to False is as follows:

[Image: WeCom screenshot 16912955319141]

With the parameter include_gateway set to True, the capture is as follows:

[Image: WeCom screenshot 1691295660590]
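For readers who cannot see the captures, the same comparison can be approximated from the client side by recording when each chunk of the response arrives. The following is a minimal sketch, not part of the original report: `record_arrivals` is a hypothetical helper, and it works on any iterable of byte chunks rather than on a specific Jina client API.

```python
import time
from typing import Callable, Iterable, List, Tuple


def record_arrivals(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[float, int]]:
    """Return (seconds since start, chunk size) for every chunk received.

    Many small chunks spread over time suggest real streaming; a few large
    chunks arriving in bursts suggest that something in between is buffering.
    """
    start = clock()
    arrivals = []
    for chunk in chunks:
        # The timestamp is taken when the chunk is handed to the caller.
        arrivals.append((clock() - start, len(chunk)))
    return arrivals
```

One way to use it, assuming the `requests` library and a placeholder URL for the streaming endpoint, is `record_arrivals(requests.get(url, stream=True).iter_content(chunk_size=None))`, run once with include_gateway=True and once with include_gateway=False.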

JoanFM commented 1 year ago

Hello,

Can you explain what the typewriter effect is?

In this case, the gateway is just a load balancer before all the replicas of the Deployment, so no caching or logic is applied there.

What is the expected behavior here?

JoanFM commented 1 year ago

If you do not need replication, you can disable the load balancer by setting include_gateway=False.

chenxiaolong-coder commented 1 year ago

Looking at the two screenshots above, the expected behavior is the one in the first screenshot: the data is delivered to the client in real time as a stream, rather than being cached up to a certain size and then sent to the client.
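The difference between the two behaviors can be illustrated with a small pure-Python simulation. This is only a sketch of the symptom described here, not Jina's gateway code: `token_source`, `passthrough`, `buffering_proxy`, and the 16-byte threshold are all made up for illustration.

```python
from typing import Iterable, Iterator


def token_source() -> Iterator[bytes]:
    """Stand-in for a streaming endpoint that emits one small piece at a time."""
    for word in ["The ", "quick ", "brown ", "fox ", "jumps"]:
        yield word.encode()


def passthrough(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Expected behavior: forward every chunk as soon as it is produced."""
    for chunk in chunks:
        yield chunk


def buffering_proxy(chunks: Iterable[bytes], threshold: int) -> Iterator[bytes]:
    """Reported behavior: hold data until at least `threshold` bytes have accumulated."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        if len(buffer) >= threshold:
            yield bytes(buffer)
            buffer.clear()
    if buffer:
        # Flush whatever is left when the source finishes.
        yield bytes(buffer)


streamed = list(passthrough(token_source()))
buffered = list(buffering_proxy(token_source(), threshold=16))

print(len(streamed), "chunks when streamed")
print(len(buffered), "chunks when buffered")
```

Both variants deliver the same bytes in total, but the client only sees a typewriter effect when the pieces arrive individually.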

JoanFM commented 1 year ago

Okay, I will take a look once I am back from holiday.

Part of the team is out and we will look at it ASAP.

For now, if you do not need replication, you can disable the gateway by setting include_gateway=False.

JoanFM commented 1 year ago

Hey @chenxiaolong-coder,

This has been solved in the latest patch release:

https://github.com/jina-ai/jina/releases/tag/v3.20.1

Please give it a try.

jina-ai / jina

Streaming Endpoint Issues with Gateways #6020