langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Text is not streamed out character by character but emitted in large chunks #7847

Closed · eric0095 closed this issue 2 months ago

eric0095 commented 2 months ago

Self Checks

Dify version

0.7.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

I've been using Dify for a while and have uploaded several hundred documents to the knowledge base; the problem only appeared recently. Whenever text is output, it arrives in bursts: for a 100-character answer it always comes 20 or 30 characters at a time, and the stuttering is very noticeable. The hardware is a Tencent Cloud 4C16G instance with 7 Mbps of bandwidth, and utilization is low. What I have ruled out:

  1. It should not be a problem with streaming itself, because with streaming mode turned off the entire answer comes out all at once, as expected.
  2. It is not an LLM problem: I tested both the DeepSeek and SiliconFlow (硅基流动) APIs, and the behavior was identical.
  3. It is not a network problem: I tested a local server (same version) and the output always streamed character by character; even with upload/download bandwidth throttled to the minimum, it still slowly streamed one character at a time.
  4. A hardware problem is unlikely, since the machine runs only this one web service.
  5. A knowledge base problem seemed very likely, because the issue appeared after hundreds of documents were uploaded. But I also tried a workflow that skips the knowledge base and lets the LLM answer directly, and it still stalls for a long time and then dumps a burst of text, which is very strange.
  6. An environment problem is unlikely: I copied over the standard env.environment file and only changed the URL and the nginx port.

That leaves only a software problem to suspect, and I hope you can help resolve it. I really like Dify as a product and want to keep using it.

✔️ Expected Behavior

When chatting, the bot streams its reply one character at a time.

❌ Actual Behavior

When chatting, the bot stalls for a long time and then replies in large chunks of text.

dosubot[bot] commented 2 months ago

Hello @eric0095! 👋 I'm here to help you with any bugs, questions, or contributions related to the repository. Let's work together to solve the issue you're experiencing.

I found two similar discussions that might be relevant to your issue:

  1. Why did version 0.6.14's output mode stop streaming and switch to one-shot output? (open)
  2. How to make the conversational application created using dify use streaming output? (open)

The chunked output you're seeing is likely inherent to the design of Dify's streaming mode, which returns text in chunks rather than character by character. According to the documentation:

### ChunkChatCompletionResponse
Returns the stream chunks output by the App; `Content-Type` is `text/event-stream`.
Each streaming chunk starts with `data:`, separated by two newline characters `\n\n`, as shown below:
<CodeGroup>
```streaming {{ title: 'Response' }}
data: {"event": "message", "task_id": "900bbd43-dc0b-4383-a372-aa6e6c414227", "id": "663c5084-a254-4040-8ad3-51f2a3c1a77c", "answer": "Hi", "created_at": 1705398420}\n\n

The structure of the streaming chunks varies depending on the event:

This indicates that the streaming mode is designed to return text in chunks rather than character by character [1].
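
To make the chunking behavior concrete, here is a minimal Python sketch of a client consuming this event stream. The `/chat-messages` route, the Bearer-token header, and the `response_mode` field follow the API documentation quoted above; the base URL, API key, and user id are placeholders:

```python
# Minimal sketch of consuming Dify's streaming chat endpoint over SSE.
# BASE_URL and API_KEY are placeholders for your own deployment.
import json
import requests

BASE_URL = "http://localhost/v1"  # self-hosted API base (placeholder)
API_KEY = "app-..."               # app API key (placeholder)

resp = requests.post(
    f"{BASE_URL}/chat-messages",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "inputs": {},
        "query": "Hello",
        "response_mode": "streaming",  # request text/event-stream chunks
        "user": "demo-user",
    },
    stream=True,  # keep the connection open and read chunks as they arrive
)

# Each SSE event is a "data: {...}" line; events are separated by blank lines.
for line in resp.iter_lines(decode_unicode=True):
    if not line or not line.startswith("data: "):
        continue
    event = json.loads(line[len("data: "):])
    if event.get("event") == "message":
        print(event["answer"], end="", flush=True)
```

Each iteration yields whatever the server, and any proxy in front of it, flushed at once; when an intermediary buffers the stream, the client sees exactly the 20-30 character bursts described in this issue even though the upstream model emits tokens steadily.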

Additionally, you mentioned that you are using the standard env.environment file. Several settings in that file, chiefly connection and timeout parameters, can affect streaming behavior: they ensure that the server can handle long-lived streaming connections without timing out and can manage multiple connections efficiently [2][3][4][5].

For diagnosing performance issues, Dify provides specific logs to observe and annotate the performance of applications. The Logs Console records all interactions, including user input and AI output, which can help in identifying performance bottlenecks. Additionally, integrating with external LLMOps tools like LangSmith and Langfuse can offer comprehensive tracking and deep evaluation capabilities [6][7][8].

I hope this helps in resolving the issue you're facing.

To continue talking to Dosu, mention @dosu.

crazywoola commented 2 months ago

I confirm that I am using English to submit this report (I have read and agree to https://github.com/langgenius/dify/issues/1542). This is not a software issue; it is an Nginx configuration issue. Take a look at the documentation.
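
For anyone who lands here with the same symptom: Nginx buffers proxied responses by default, so SSE chunks from the API accumulate and are flushed in bursts, which matches the stuttering, chunked output described above. Below is a minimal sketch of a location block that disables buffering for the streaming endpoint; the path and upstream address are illustrative, not Dify's shipped configuration:

```nginx
# Illustrative only: adjust the path and upstream to match your deployment.
location /v1/chat-messages {
    proxy_pass http://api:5001;      # hypothetical Dify API upstream

    proxy_http_version 1.1;          # needed for long-lived streaming responses
    proxy_set_header Connection "";  # reuse the upstream connection

    proxy_buffering off;             # flush each SSE chunk to the client immediately
    proxy_cache off;                 # never cache event streams
    gzip off;                        # gzip would re-buffer the stream
    proxy_read_timeout 3600s;        # allow slow, long-running generations
}
```

`proxy_buffering off` is the key directive; the rest guards against the related timeout and compression pitfalls that also surface with server-sent events.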