fivegg closed this issue 6 months ago
The issue you're encountering with the API service going down during the /knowledge_base/update_docs call could stem from several potential causes. Here's a more focused analysis based on the symptoms and logs you've shared:
Service Disconnection and Connection Refusal: The errors RemoteProtocolError: Server disconnected without sending a response and ConnectError: [Errno 111] Connection refused suggest an abrupt termination of the service or a failure in maintaining a stable connection. This could be due to underlying infrastructure issues, such as network instability, resource constraints (e.g., memory or disk space limitations), or an unhandled exception within the service causing it to crash.
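One way to tell a resource-related kill apart from an ordinary crash is to run the document loading in a child process and inspect how it ended. The sketch below is only an illustration under assumptions: describe_exit and run_isolated are hypothetical helper names, not part of the project, and the snippet passed to run_isolated would have to import and call the loader in whatever way your setup requires. On POSIX systems a negative return code -N means the child was terminated by signal N, and SIGKILL is what the Linux out-of-memory killer sends, although other senders are possible.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Translate a subprocess return code into a human-readable cause."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a negative return code -N means the child was
        # terminated by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        # SIGKILL is commonly (not always) sent by the kernel OOM killer.
        hint = " (possibly the kernel OOM killer)" if name == "SIGKILL" else ""
        return f"terminated by {name}{hint}"
    return f"exited with error code {returncode}"


def run_isolated(code: str) -> str:
    """Run a Python snippet in a child process and report how it ended.

    Because the work happens in a separate process, a kill or crash there
    does not take down the parent (for example, an API server).
    """
    result = subprocess.run([sys.executable, "-c", code])
    return describe_exit(result.returncode)
```

If the report says SIGKILL, the kernel log (for example via dmesg) is a reasonable next place to look for an out-of-memory entry.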
File Processing and Aspect Ratio Warnings: The warnings about the aspect ratio exceeding limits indicate that the document processing is encountering files it's not configured to handle optimally. While these warnings themselves don't directly cause service downtime, they highlight potential inefficiencies or misconfigurations in how documents are processed, which could contribute to system stress.
Potential Causes Not Directly Evident in Logs: The absence of direct error messages related to the service shutdown in the logs suggests that the root cause might not be straightforward. It could be related to uncaught exceptions during document processing, issues with the document's content causing unexpected behavior, or even security mechanisms triggering under certain conditions.
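When the logs show nothing at the moment of failure, it can help to log memory use and any exception around each processing step. The following is a minimal sketch, not the project's actual code: process_blocks, peak_rss_mib, and the handle callback are hypothetical names standing in for whatever loop processes the document blocks. It relies on the standard resource module, which is available on Unix only.

```python
import logging
import resource
import sys

logger = logging.getLogger("mem_probe")


def peak_rss_mib() -> float:
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def process_blocks(blocks, handle, log_every=100):
    """Apply handle to each block, logging peak memory and any exception."""
    results = []
    for index, block in enumerate(blocks):
        try:
            results.append(handle(block))
        except Exception:
            # Record the failing block and memory state, then re-raise so
            # the failure is not silently swallowed.
            logger.exception(
                "block %d failed; peak RSS %.1f MiB", index, peak_rss_mib()
            )
            raise
        if index % log_every == 0:
            logger.info("block %d done; peak RSS %.1f MiB", index, peak_rss_mib())
    return results
```

A steadily climbing peak RSS just before the process disappears would point toward memory exhaustion, while a logged traceback would point toward an unhandled exception. A process killed with SIGKILL cannot log anything at the moment it dies, so only the periodic messages written before the kill would be available.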
To address this issue, consider the following steps:
Without more specific details on the RapidOCRDocLoader and its role in the process, it's challenging to provide a more targeted diagnosis. If you can share additional context or logs related to this component, it might shed further light on the issue.
No. I ran server.knowledge_base.utils on its own and called kb_file.file2docs directly. The terminal printed "Killed" when it reached 3199/4006.
2024-04-19 12:40:11,607 - main.py[line:158] - WARNING: Because the aspect ratio of the current image exceeds the limit (min_height or width_height_ratio), the program will skip the detection step.
RapidOCRDocLoader block index: 3199: 80%|███████████████████████▏ | 3199/4006 [01:20<04:54, 2.74it/s]Killed
I uploaded a file, std_1457556.docx, and got the following output:
2024-04-19 09:20:30,375 - main.py[line:158] - WARNING: Because the aspect ratio of the current image exceeds the limit (min_height or width_height_ratio), the program will skip the detection step.
RapidOCRDocLoader block index: 3193: 80%|███████████████████████ | 3193/4006 [01:03<04:28, 3.03it/s]
2024-04-19 09:21:02,261 - main.py[line:158] - WARNING: Because the aspect ratio of the current image exceeds the limit (min_height or width_height_ratio), the program will skip the detection step.
RapidOCRDocLoader block index: 3199: 80%|███████████████████████▏ | 3199/4006 [01:20<04:37, 2.90it/s]
2024-04-19 09:21:21,482 - utils.py[line:95] - ERROR: RemoteProtocolError: error when post /knowledge_base/update_docs: Server disconnected without sending a response.
2024-04-19 09:21:21,702 - utils.py[line:95] - ERROR: ConnectError: error when post /knowledge_base/update_docs: [Errno 111] Connection refused.
On investigation, I found that the API service had gone down, which caused the request to the /knowledge_base/update_docs endpoint to fail. However, the logs show nothing related to the shutdown.