infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0

[Bug]: ragflow can't deal with length reason abort #1207

Open CamusGao opened 1 week ago

CamusGao commented 1 week ago

Is there an existing issue for the same bug?

Branch name

main

Commit ID

83803a7

Other environment information

xinference as model provider.

Actual behavior

xinference's response was

HTTP/1.1 200 OK
date: Tue, 18 Jun 2024 10:39:03 GMT
server: uvicorn
cache-control: no-cache
connection: keep-alive
x-accel-buffering: no
content-type: text/event-stream; charset=utf-8
transfer-encoding: chunked

f0
data: {"id": "chatfb6a7232-2d5e-11ef-9558-7e465d7eb888", "model": "qwen1.5-chat", "created": 1718707144, "object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}, "finish_reason": null}]}

df
data: {"id": "chatfb6a7232-2d5e-11ef-9558-7e465d7eb888", "model": "qwen1.5-chat", "created": 1718707144, "object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": ""}, "finish_reason": "length"}]}

10
data: [DONE]

0

and ragflow's response was

data:{"retcode": 0, "retmsg": "", "data": {"answer": "", "reference": {"total": 7, "chunks": [...some chunks]}}}

data:{"retcode": 0, "retmsg": "", "data": true}

RAGFlow finished without giving any message, even though the model provider returned an aborted response (finish_reason "length").

Expected behavior

RAGFlow tells the user that the content is too long.
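
As a rough illustration of the expected behavior, here is a minimal sketch of reading the streamed chunks shown above and noticing the "length" finish reason. It is not RAGFlow's actual code: the function names `read_chat_stream` and `user_facing_answer` and the notice text are hypothetical. It only assumes the `data:` line format seen in the xinference response.

```python
import json
from typing import Iterable, Optional, Tuple


def read_chat_stream(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Collect the answer text and the last non-null finish_reason from SSE lines."""
    parts = []
    finish_reason = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and chunk-size markers such as "f0"
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
            if choice.get("finish_reason") is not None:
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason


def user_facing_answer(answer: str, finish_reason: Optional[str]) -> str:
    """Append a notice (hypothetical wording) when the model stopped on length."""
    if finish_reason == "length":
        notice = "The model stopped because the content is too long (finish_reason: length)."
        return f"{answer}\n{notice}" if answer else notice
    return answer
```

With the two chunks from the response above, `read_chat_stream` would return an empty answer together with `"length"`, so the front end could show the notice instead of an empty reply.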

Steps to reproduce

see Actual behavior

Additional information

No response

shichunshan commented 1 week ago

Version 0.0.7 works fine. It seems that there is an issue with the max_tokens logic.

CamusGao commented 1 week ago

> Version 0.0.7 works fine. It seems that there is an issue with the max_tokens logic.

I disabled the max_tokens option in the dialog settings.

But I think max_tokens should only be used as a check before actually calling the model. It should not replace handling of the different responses returned by the model service.
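
To make the distinction concrete, here is a minimal sketch of keeping the two checks separate: a max_tokens check before the call, and a finish_reason check on the response afterwards. Everything in it is hypothetical and not taken from RAGFlow: the `chat` function, its parameters, the `count_tokens` and `call_model` callables, and the error wording are all assumptions for illustration.

```python
from typing import Callable, Dict, Optional, Tuple


def chat(
    prompt: str,
    max_tokens: int,
    context_window: int,
    count_tokens: Callable[[str], int],
    call_model: Callable[[str, int], Tuple[str, Optional[str]]],
) -> Dict[str, Optional[str]]:
    """Run a pre-call budget check, then still inspect the model's finish_reason."""
    # Check 1 (before the call): does the prompt leave room for max_tokens of output?
    used = count_tokens(prompt)
    if used + max_tokens > context_window:
        return {
            "answer": "",
            "error": f"Prompt uses {used} tokens; with max_tokens={max_tokens} "
                     f"it exceeds the context window of {context_window}.",
        }

    # Check 2 (after the call): the provider may still abort on length.
    answer, finish_reason = call_model(prompt, max_tokens)
    if finish_reason == "length":
        return {"answer": answer, "error": "The model stopped early: length limit reached."}
    return {"answer": answer, "error": None}
```

The point of the sketch is that passing the first check does not skip the second one, so a "length" response from the provider is still reported.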

KevinHuSh commented 1 week ago

RAGFlow says the content is too long because xinference tells it so: {"index": 0, "delta": {"content": ""}, "finish_reason": "length"}

CamusGao commented 1 week ago

> RAGFlow says the content is too long because xinference tells it so: {"index": 0, "delta": {"content": ""}, "finish_reason": "length"}

RAGFlow doesn't pass that on to the front end. It stops responding, with no content and no reason given. RAGFlow says nothing.