apache / apisix

The Cloud-Native API Gateway

bug: The HTTP stream issue in a reverse proxy scenario #9804

Open Yullin opened 1 year ago

Yullin commented 1 year ago

Current Behavior

Scenario: the company is working on a large language model service, similar to a chatbot like ChatGPT. The development team has set up the service and accesses it using the following code:

import json
import requests

prompt = "xxxxxxxxx"
url = "http://10.0.0.3:5001/api/stream/chat"
data = {
    "prompt": prompt,
    "max_new_tokens": 1024,  
    "max_context_tokens": 5120,
    "temperature": 0,  
    "history": [],  
    "new_session": False,
}

chunks = requests.post(url=url, json=data, stream=True)
answer = ""
for chunk in chunks.iter_content(chunk_size=None):
    if chunk:
        try:
            chunk = json.loads(chunk.decode('utf-8'))
            print(chunk)
        except Exception as e:
            print("----------------error---------------")
            print(chunk)
            print(e)
            print("----------------error---------------")
            continue

Problem: in normal circumstances (accessing the development service directly), the JSON objects are printed out one by one. But after going through the OpenResty reverse proxy, the first few JSON objects are still printed one by one, while the last two JSON objects arrive concatenated as a single string ("{"text":"...."}{"text":"...."}"), causing json.loads to fail.
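
(For reference, a client-side sketch that tolerates such concatenated chunks, assuming each chunk contains one or more complete JSON objects back to back; the helper name is made up for illustration, and it uses json.JSONDecoder.raw_decode in place of a single json.loads call.)

    import json

    def iter_json_objects(raw: bytes):
        # Yield every JSON object from a chunk that may contain several
        # concatenated objects, e.g. b'{"text":"a"}{"text":"b"}'.
        decoder = json.JSONDecoder()
        text = raw.decode("utf-8")
        idx = 0
        while idx < len(text):
            # Skip any whitespace between objects.
            while idx < len(text) and text[idx].isspace():
                idx += 1
            if idx >= len(text):
                break
            obj, idx = decoder.raw_decode(text, idx)
            yield obj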

Configuration:

 "value": {
                    "host": "t.example.com",
                    "methods": [
                        "GET",
                        "POST",
                        "PUT",
                        "DELETE",
                        "HEAD",
                        "OPTIONS",
                        "PATCH"
                    ],
                    "plugins": {
                    },
                    "plugins_check": "other",
                    "priority": 1,
                    "status": 1,
                    "timeout": {
                        "connect": 60,
                        "read": 60,
                        "send": 60
                    },
                    "upstream": {
                        "hash_on": "vars",
                        "name": "nodes",
                        "nodes": {
                            "10.0.0.3:5001": 1
                        },
                        "pass_host": "pass",
                        "scheme": "http",
                        "type": "roundrobin"
                    },
                    "uri": "/*"
                }

I have tested with HAProxy and it does not have this issue.

Expected Behavior

The JSON data should be printed out one by one. I have also posted this issue to OpenResty (https://github.com/openresty/openresty/issues/914); maybe people here can fix it.

Error Logs

No response

Steps to Reproduce

  1. Set up a large model as a service (like ChatGPT) or any other Server-Sent Events (SSE) streaming service
  2. Run apisix via the docker image
  3. Create a route with the Admin API, setting the large model service as the upstream (see the sketch after the code below)
  4. Access with the following code:
    
    import json
    import requests

    prompt = "xxxxxxxxx"
    url = "http://{apisix_ip_addr}/api/stream/chat"
    data = {
        "prompt": prompt,
        "max_new_tokens": 1024,
        "max_context_tokens": 5120,
        "temperature": 0,
        "history": [],
        "new_session": False,
    }

    chunks = requests.post(url=url, json=data, stream=True)
    answer = ""
    for chunk in chunks.iter_content(chunk_size=None):
        if chunk:
            try:
                chunk = json.loads(chunk.decode('utf-8'))
                print(chunk)
            except Exception as e:
                print("----------------error---------------")
                print(chunk)
                print(e)
                print("----------------error---------------")
                continue
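
(For step 3, a minimal sketch of creating such a route through the Admin API with Python requests; the Admin API address, port, and key below are placeholders and must match your deployment. The Admin API listens on port 9080 by default in APISIX 2.x and 9180 in 3.x.)

    import requests

    # Placeholders: point these at your own Admin API endpoint and key.
    admin_url = "http://{apisix_ip_addr}:9080/apisix/admin/routes/1"
    admin_key = "your-admin-key"

    route = {
        "uri": "/*",
        "host": "t.example.com",
        "upstream": {
            "type": "roundrobin",
            "scheme": "http",
            "pass_host": "pass",
            "nodes": {"10.0.0.3:5001": 1},
        },
    }

    # Create or update route 1 with the streaming service as its upstream.
    resp = requests.put(admin_url, json=route, headers={"X-API-KEY": admin_key})
    print(resp.status_code, resp.text)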



Environment

- APISIX version (run `apisix version`): 2.15.3
- Operating system (run `uname -a`):
- OpenResty / Nginx version (run `openresty -V` or `nginx -V`): 
- etcd version, if relevant (run `curl http://127.0.0.1:9090/v1/server_info`):
- APISIX Dashboard version, if relevant:
- Plugin runner version, for issues related to plugin runners:
- LuaRocks version, for installation issues (run `luarocks --version`):
Yullin commented 1 year ago

Proposal: support the following OpenResty config in the proxy-control plugin:

        proxy_buffering off;
        proxy_http_version 1.1;
        proxy_set_header Connection "";

With these settings, HTTP streaming would work in the chatbot scenario.
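
(Until something like this is supported per route, one possible workaround sketch is to put the directives into a nginx_config snippet in conf/config.yaml. This is an assumption: check your release's config-default.yaml to confirm the snippet option exists, note that it applies server-wide rather than per route, and be aware that nginx inheritance rules may cause APISIX's own location-level proxy_set_header directives to override the Connection header set here.)

    # conf/config.yaml -- assumed snippet option; verify it exists in your
    # version's config-default.yaml. This is server-wide, not per-route.
    nginx_config:
      http_server_configuration_snippet: |
        proxy_buffering off;
        proxy_http_version 1.1;
        proxy_set_header Connection "";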

Yullin commented 1 year ago

Sorry, I don't know of an easier way. But I've worked around this problem as follows: after the backend service outputs the last chunk, sleep 5 ms, and then close the connection. So now we just need support for the proposal I mentioned.
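
(A minimal sketch of that workaround, assuming the upstream streams its chunks from a Python generator; the function name is made up for illustration.)

    import time

    def stream_chunks(chunks):
        for chunk in chunks:
            yield chunk
        # Workaround described above: pause briefly after the last chunk so
        # the proxy flushes it before the upstream connection is closed.
        time.sleep(0.005)  # 5 ms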

qwzhou89 commented 1 year ago

proxy_buffering off; proxy_http_version 1.1; proxy_set_header Connection "";

Is there any way to dynamically change these settings on a particular route?

Revolyssup commented 1 year ago

@qwzhou89 Currently this feature is not supported in the open source version.

kayx23 commented 10 months ago

enterprise: https://docs.api7.ai/hub/proxy-buffering

cdmikechen commented 10 months ago

enterprise: https://docs.api7.ai/hub/proxy-buffering

When will open source APISIX be adapted for SSE?

kayx23 commented 10 months ago

At the moment, no plan that I'm aware of.

xshadowlegendx commented 2 months ago

You can return the X-Accel-Buffering: no response header from your upstream to disable the proxy_buffering option in NGINX; this behavior is documented in the NGINX docs.
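
(For example, a minimal sketch of a streaming upstream that sets this header, assuming a Flask service; the framework and route are assumptions, not necessarily what the issue's upstream actually uses.)

    import json
    import time

    from flask import Flask, Response, stream_with_context

    app = Flask(__name__)

    @app.route("/api/stream/chat", methods=["POST"])
    def chat():
        def generate():
            # Emit a few JSON chunks with a small delay, like a chatbot stream.
            for i in range(3):
                yield json.dumps({"text": f"chunk {i}"})
                time.sleep(0.1)

        # X-Accel-Buffering: no asks an NGINX/OpenResty-based proxy in front
        # of this service not to buffer the response, so chunks are flushed
        # to the client as they are produced.
        return Response(
            stream_with_context(generate()),
            mimetype="application/json",
            headers={"X-Accel-Buffering": "no"},
        )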