chatchat-space / Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent application built with Langchain and language models such as ChatGLM, Qwen, and Llama
Apache License 2.0

[BUG] Adding a file to the knowledge base brings the API service down for an unknown reason #3807

Closed fivegg closed 6 months ago

fivegg commented 6 months ago

I uploaded a file, std_1457556.docx, and got the following output:

2024-04-19 09:20:30,375 - main.py[line:158] - WARNING: Because the aspect ratio of the current image exceeds the limit (min_height or width_height_ratio), the program will skip the detection step.
RapidOCRDocLoader block index: 3193: 80%|███████████████████████ | 3193/4006 [01:03<04:28, 3.03it/s]
2024-04-19 09:21:02,261 - main.py[line:158] - WARNING: Because the aspect ratio of the current image exceeds the limit (min_height or width_height_ratio), the program will skip the detection step.
RapidOCRDocLoader block index: 3199: 80%|███████████████████████▏ | 3199/4006 [01:20<04:37, 2.90it/s]
2024-04-19 09:21:21,482 - utils.py[line:95] - ERROR: RemoteProtocolError: error when post /knowledge_base/update_docs: Server disconnected without sending a response.
2024-04-19 09:21:21,702 - utils.py[line:95] - ERROR: ConnectError: error when post /knowledge_base/update_docs: [Errno 111] Connection refused

Investigation showed that the API service had gone down, which is why requests to the /knowledge_base/update_docs endpoint failed. The logs, however, contain nothing related to the crash.

dosubot[bot] commented 6 months ago

The issue you're encountering with the API service going down during the /knowledge_base/update_docs call could stem from several potential causes. Here's a more focused analysis based on the symptoms and logs you've shared:

  1. Service Disconnection and Connection Refusal: The errors RemoteProtocolError: Server disconnected without sending a response and ConnectError: [Errno 111] Connection refused suggest an abrupt termination of the service or a failure in maintaining a stable connection. This could be due to underlying infrastructure issues, such as network instability, resource constraints (e.g., memory or disk space limitations), or an unhandled exception within the service causing it to crash.

  2. File Processing and Aspect Ratio Warnings: The warnings about the aspect ratio exceeding limits indicate that the document processing is encountering files it's not configured to handle optimally. While these warnings themselves don't directly cause service downtime, they highlight potential inefficiencies or misconfigurations in how documents are processed, which could contribute to system stress.

  3. Potential Causes Not Directly Evident in Logs: The absence of direct error messages related to the service shutdown in the logs suggests that the root cause might not be straightforward. It could be related to uncaught exceptions during document processing, issues with the document's content causing unexpected behavior, or even security mechanisms triggering under certain conditions.

To address this issue, consider the following steps:

Without more specific details on the RapidOCRDocLoader and its role in the process, it's challenging to provide a more targeted diagnosis. If you can share additional context or logs related to this component, it might shed further light on the issue.
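One way to gather that additional context is to log the process's peak memory while the blocks are being processed, so a steady climb would show up before the process disappears. The following is only a minimal sketch: `process_blocks`, `handle`, and `report_every` are hypothetical names for illustration and are not part of Langchain-Chatchat or RapidOCRDocLoader. It relies on the standard-library `resource` module, which is available on Unix only.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def process_blocks(blocks, handle, report_every=100):
    """Hypothetical loop: call handle() on each block, reporting memory periodically."""
    for index, block in enumerate(blocks):
        handle(block)
        if index % report_every == 0:
            # flush=True so the last report is visible even if the process is killed.
            print(f"block {index}: peak RSS {peak_rss_mib():.1f} MiB", flush=True)
```

If the reported peak keeps growing toward the machine's available memory as the block index approaches the point of failure, memory exhaustion becomes a plausible explanation. If it stays flat, the cause likely lies elsewhere.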

fivegg commented 6 months ago

No. I ran server.knowledge_base.utils on its own and called kb_file.file2docs directly. When it reached 3199/4006, the terminal printed "Killed":

2024-04-19 12:40:11,607 - main.py[line:158] - WARNING: Because the aspect ratio of the current image exceeds the limit (min_height or width_height_ratio), the program will skip the detection step.
RapidOCRDocLoader block index: 3199: 80%|███████████████████████▏ | 3199/4006 [01:20<04:54, 2.74it/s]
Killed
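A bare `Killed` printed by the shell means the process received SIGKILL. Because SIGKILL cannot be caught, the Python process never gets a chance to log anything, which would be consistent with the empty logs. On Linux this is often the kernel's out-of-memory killer, although nothing in this thread confirms that. The sketch below shows one way to detect such a death from Python by running the work in a child process and inspecting its return code. `run_isolated` is a hypothetical helper, not part of Langchain-Chatchat, and the interpretation of negative return codes applies to POSIX systems.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0) -> str:
    """Run a Python snippet in a child process and describe how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        return f"killed by {signal.Signals(-proc.returncode).name}"
    return f"exited with status {proc.returncode}"
```

Running the document-loading step this way would let the parent report "killed by SIGKILL" instead of going down with the child. That is a design choice for diagnosis and isolation, not a fix for whatever consumes the resources.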