labring / FastGPT

FastGPT is a knowledge-based platform built on LLMs. It offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you develop and deploy complex question-answering systems without extensive setup or configuration.
https://tryfastgpt.ai

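When batch-importing files into a knowledge base, some files stall between 70% and 99% progress, although MongoDB shows the files were actually uploaded #2148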

Closed nixiaodi closed 2 months ago

nixiaodi commented 2 months ago

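Routine checks

Your version

Problem description, log screenshots

When importing PDF files into a knowledge base, importing many files at once (for example, 50) leaves some files with the progress bar stuck between 70% and 99%, and it never moves again.

Steps to reproduce

1. Open the knowledge base and click the text dataset.
2. Import about 50 PDF files at once, each roughly 1-10 MB.

Expected result

The issue is fixed.

Related screenshots

[image: 63754f2b7d847b5f021e31e176ffb2d]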

xiaohuohu commented 2 months ago

Bumping this; I ran into the same problem.

ws02589111 commented 2 months ago

Same problem here, and the backend service keeps printing the following:

[Warn] 2024-07-25 04:26:44 Slow operation 8890ms {"query":null,"op":"save","duration":8890}
[Warn] 2024-07-25 04:26:44 Slow operation 8889ms {"query":null,"op":"save","duration":8889}
[Warn] 2024-07-25 04:26:44 Slow operation 8892ms {"query":null,"op":"save","duration":8892}
[Warn] 2024-07-25 04:26:44 Slow operation 8871ms {"query":null,"op":"save","duration":8871}
[Warn] 2024-07-25 04:26:44 Slow operation 8869ms {"query":null,"op":"save","duration":8869}
[Warn] 2024-07-25 04:26:44 Slow operation 8866ms {"query":null,"op":"save","duration":8866}
[Warn] 2024-07-25 04:26:44 Slow operation 8868ms {"query":null,"op":"save","duration":8868}
[Warn] 2024-07-25 04:26:44 Slow operation 8860ms {"query":null,"op":"save","duration":8860}
[Warn] 2024-07-25 04:26:44 Slow operation 8892ms {"query":null,"op":"save","duration":8892}
[Warn] 2024-07-25 04:26:44 Slow operation 8875ms {"query":null,"op":"save","duration":8875}
[Warn] 2024-07-25 04:26:44 Slow operation 8893ms {"query":null,"op":"save","duration":8893}
[Warn] 2024-07-25 04:26:45 Slow operation 8865ms {"query":null,"op":"save","duration":8865}
[Warn] 2024-07-25 04:26:45 Slow operation 8863ms {"query":null,"op":"save","duration":8863}
[Warn] 2024-07-25 04:26:45 Slow operation 8884ms {"query":null,"op":"save","duration":8884}
[Warn] 2024-07-25 04:26:45 Slow operation 8884ms {"query":null,"op":"save","duration":8884}
[Warn] 2024-07-25 04:26:45 Slow operation 8886ms {"query":null,"op":"save","duration":8886}
[Warn] 2024-07-25 04:26:45 Slow operation 8887ms {"query":null,"op":"save","duration":8887}
[Warn] 2024-07-25 04:26:45 Slow operation 8884ms {"query":null,"op":"save","duration":8884}
[Warn] 2024-07-25 04:26:45 Slow operation 8885ms {"query":null,"op":"save","duration":8885}
[Warn] 2024-07-25 04:26:45 Slow operation 8889ms {"query":null,"op":"save","duration":8889}
[Warn] 2024-07-25 04:26:45 Slow operation 8889ms {"query":null,"op":"save","duration":8889}
[Warn] 2024-07-25 04:26:45 Slow operation 8893ms {"query":null,"op":"save","duration":8893}
[Warn] 2024-07-25 04:26:45 Slow operation 8887ms {"query":null,"op":"save","duration":8887}
[Warn] 2024-07-25 04:26:45 Slow operation 8892ms {"query":null,"op":"save","duration":8892}
[Warn] 2024-07-25 04:26:45 Slow operation 8893ms {"query":null,"op":"save","duration":8893}

After force-restarting the service, I found that the files had in fact finished importing. [image]
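
A warning stream like the one above is easier to reason about once it is tallied per operation. The following is a minimal Python sketch, not part of FastGPT: the function name `summarize_slow_ops` and the regular expression are assumptions based only on the log lines quoted in this thread. It accepts both the JSON form and the unquoted object form that appears in the `[Error]` line later in the thread.

```python
import re
from collections import defaultdict

# Matches lines such as:
# [Warn] 2024-07-25 04:26:44 Slow operation 8890ms {"query":null,"op":"save","duration":8890}
# The "op" key may be quoted (JSON) or unquoted (printed JS object).
LINE_RE = re.compile(
    r"\[(?P<level>\w+)\] (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"Slow operation (?P<ms>\d+)ms .*?\"?op\"?:\s*['\"](?P<op>\w+)['\"]"
)


def summarize_slow_ops(lines):
    """Group slow-operation durations (in ms) by operation name."""
    durations = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # silently skip lines that are not slow-operation entries
            durations[match.group("op")].append(int(match.group("ms")))
    return {
        op: {
            "count": len(ms),
            "min_ms": min(ms),
            "max_ms": max(ms),
            "mean_ms": sum(ms) / len(ms),
        }
        for op, ms in durations.items()
    }


# Three lines copied from the log above.
sample = [
    '[Warn] 2024-07-25 04:26:44 Slow operation 8890ms {"query":null,"op":"save","duration":8890}',
    '[Warn] 2024-07-25 04:26:44 Slow operation 8889ms {"query":null,"op":"save","duration":8889}',
    '[Warn] 2024-07-25 04:26:44 Slow operation 8860ms {"query":null,"op":"save","duration":8860}',
]
summary = summarize_slow_ops(sample)
print(summary)
```

Run over a full log file, a summary like this would show whether `save` is the only slow operation and how tightly the durations cluster.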

c121914yu commented 2 months ago

(Replying to ws02589111's report and log above.)

Time to upgrade your database.

ws02589111 commented 2 months ago

(Replying to c121914yu's suggestion to upgrade the database.)

This instance was deployed only recently, and it contains nothing but the data just imported in the screenshot.

aopstudio commented 2 months ago

Same problem.

Essence9999 commented 2 months ago

Same problem, with the same Slow operation warnings.

duskcouple commented 2 months ago

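I'm seeing the same behavior. Earlier versions were fine; the stuck-file problem appeared in 4.8.7. I have ruled out the files themselves: on every re-upload, a random set of files gets stuck.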

c121914yu commented 2 months ago
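[image]

103 files import without trouble here, even on a lightweight 2-core/4 GB test machine: 200 MB in total, in about 1 minute. That said, the files are all processed concurrently, so if your deployment environment imposes IO or memory limits, it will probably fall over.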
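
The remark about concurrency suggests one way to experiment from the client side: cap how many uploads are in flight at once. This is a hypothetical sketch, not FastGPT's upload code; `upload_all`, `upload_one`, and `max_concurrency` are illustrative names, and the demo uses a fake uploader instead of a real HTTP call.

```python
import asyncio


async def upload_all(files, upload_one, max_concurrency=3):
    """Run upload_one(file) for every file, at most max_concurrency at a time."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(f):
        # Each task waits here until one of the max_concurrency slots is free.
        async with sem:
            return await upload_one(f)

    # gather returns results in the same order as the input files.
    return await asyncio.gather(*(guarded(f) for f in files))


async def _demo():
    in_flight = 0
    peak = 0

    async def fake_upload(name):
        # Stand-in for a real upload; records the peak number of parallel calls.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return name

    names = [f"file_{i}.pdf" for i in range(10)]
    results = await upload_all(names, fake_upload, max_concurrency=3)
    return names, results, peak


names, results, peak = asyncio.run(_demo())
print(f"uploaded {len(results)} files, peak concurrency {peak}")
```

If a batch that stalls at full concurrency completes with a lower cap, that would point toward resource limits rather than the files themselves.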

c121914yu commented 2 months ago

For the files stuck at 90%, check whether they made it into the database and whether the frontend received the corresponding file IDs.

c121914yu commented 2 months ago

You can use the temporary image from the PR above to test whether it works correctly: ghcr.io/labring/fastgpt-pr:2191 (based on 4.8.8)

Essence9999 commented 2 months ago

I hit this when batch-uploading Word documents, each about 4 MB.

duskcouple commented 2 months ago

(Replying to Essence9999.)

Right. I batch-uploaded a set of docx files, about 13 of them, the largest 80 MB. I haven't tried PDFs.

Essence9999 commented 2 months ago

(Replying to duskcouple.)

Yes, PDFs are probably fine. My guess is that the text parsed out of a PDF is relatively small, while the text in a docx is larger. Does your FastGPT show errors like these?

[Error] 2024-07-26 08:37:15 Slow operation 977ms { message: { query: null, op: 'aggregate', duration: 977 }, stack: undefined }
[Warn] 2024-07-26 08:37:16 Slow operation 2138ms {"query":null,"op":"aggregate","duration":2138}
[Warn] 2024-07-26 08:37:18 Slow operation 4102ms {"query":null,"op":"aggregate","duration":4102}
[Warn] 2024-07-26 08:37:20 Slow operation 5370ms {"query":null,"op":"aggregate","duration":5370}
[Warn] 2024-07-26 08:37:21 Slow operation 6834ms {"query":null,"op":"aggregate","duration":6834}
[Warn] 2024-07-26 08:37:21 Slow operation 7171ms {"query":null,"op":"aggregate","duration":7171}
[Warn] 2024-07-26 08:37:24 Slow operation 9711ms {"query":null,"op":"aggregate","duration":9711}
[Warn] 2024-07-26 08:37:24 Slow operation 9719ms {"query":null,"op":"aggregate","duration":9719}

duskcouple commented 2 months ago

(Replying to Essence9999's question about the Slow operation errors.)

I haven't had time to collect logs yet. During a meeting this morning, a user from one of our business departments called to report that a batch upload was stuck and they could not move on to the next step. We use the commercial edition. No users reported problems on earlier versions; this only appeared after we upgraded to the latest version a few days ago. Later I'll get the user's files, test it myself, and look at the logs. I had the user test by elimination: remove the files whose progress was stuck and upload again. The problem still occurred, which rules out the files themselves.

ws02589111 commented 2 months ago

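For me it happens when uploading a single Markdown file of 400+ KB, with very high probability. After deleting that file and uploading it again, the browser console reports an error, and I have to refresh the frontend before it works: [image]

[image]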

Essence9999 commented 2 months ago

(Replying to duskcouple.)

Yes, I've seen this recently while populating a knowledge base. Large files cause the import to stall, and backend CPU usage climbs. If you wait a while, the backend does build the index and store the data, but the frontend never reflects it.

Essence9999 commented 2 months ago

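(Replying to ws02589111's Markdown report.)

I haven't run into it with Markdown. MD files can't be dragged in directly (that raises an error), but uploading through the click-to-upload button should work, especially for a file of only a few hundred KB. It is more common with Word files of several MB.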

duskcouple commented 2 months ago

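I just tested this. When batch-uploading about 10 docx files, the failure reproduces 100% of the time as long as at least one file is larger than 10 MB. There are no errors in the logs. Oddly, if the batch contains several files over 10 MB, the one that gets stuck is not necessarily the largest: it may be the second largest, while the largest reaches 100%. Batch-uploading small files under 10 MB works fine; I tried several times and never saw the failure. This is an enterprise deployment on a GPU server with decent specs, not a hobbyist virtual host. [images: 2, 3, 4, 1]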
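
Since the failure was only observed when a batch contained a file over 10 MB, one way to narrow things down is to separate such files before uploading and test the two groups independently. This is a minimal sketch under that assumption; `split_by_size` is an illustrative helper, not a FastGPT function, and the 10 MiB default is taken from the observation above rather than from any documented limit.

```python
def split_by_size(files, threshold_bytes=10 * 1024 * 1024):
    """Split (name, size_bytes) pairs into names at or below and above the threshold."""
    small, large = [], []
    for name, size in files:
        # Files strictly larger than the threshold go into the "large" group.
        (large if size > threshold_bytes else small).append(name)
    return small, large


# Hypothetical batch; the sizes echo the ones mentioned in this thread.
batch = [
    ("a.docx", 4 * 1024 * 1024),
    ("b.docx", 80 * 1024 * 1024),
    ("c.docx", 10 * 1024 * 1024),
]
small, large = split_by_size(batch)
print("upload together:", small)
print("upload separately:", large)
```

Uploading the `small` group alone and then each `large` file on its own would help confirm whether file size is the actual trigger.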