infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0
23.08k stars 2.26k forks

[Bug]: several bugs on RAGFlow #1017

Open liutaocode opened 5 months ago

liutaocode commented 5 months ago

Is there an existing issue for the same feature request?

Is your feature request related to a problem?

No

Describe the feature you'd like

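1. PDF parsing is extremely slow. Switching to a 3090 GPU sped it up, but a single paper still takes 1-2 minutes to parse, so there is still room for optimization.
2. With pure vector search, some keywords cannot be recalled. I am sure the paper contains them, and when I open the corresponding document its chunks have been parsed, but the search still does not return them. (This is the most serious problem; recall needs to improve.)
3. When the LLM answers from the RAG results, many questions go unanswered. For example, if you ask for the difference between A and B, it says it does not know, but if you ask what A is or what B is separately, it can explain each one. The question probably needs to be decomposed.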
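To make the decomposition idea in item 3 concrete, here is a minimal sketch in Python. It is not RAGFlow code: `decompose_question`, `retrieve_with_decomposition`, the regex patterns, and the `retrieve` callable are all hypothetical names made up for illustration. It assumes a comparison question can be spotted with simple patterns, and it retrieves for each sub-question plus the original question before merging the chunks.

```python
import re
from typing import Callable, List

# Hypothetical patterns for comparison-style questions (an assumption, not RAGFlow's logic).
_COMPARISON_PATTERNS = [
    re.compile(r"difference between (?P<a>.+?) and (?P<b>.+?)\??$", re.IGNORECASE),
    re.compile(r"^(?:compare )?(?P<a>.+?) (?:vs\.?|versus) (?P<b>.+?)\??$", re.IGNORECASE),
]


def decompose_question(question: str) -> List[str]:
    """Split "A vs B" style questions into one sub-question per entity.

    The original question is kept as the last item so that chunks which
    compare A and B directly can still be retrieved.
    """
    q = question.strip()
    for pattern in _COMPARISON_PATTERNS:
        match = pattern.search(q)
        if match:
            a = match.group("a").strip()
            b = match.group("b").strip()
            return [f"What is {a}?", f"What is {b}?", q]
    # Not a comparison question: retrieve with the question unchanged.
    return [q]


def retrieve_with_decomposition(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    top_k: int = 3,
) -> List[str]:
    """Call `retrieve` once per sub-question and merge chunks without duplicates.

    `retrieve(query, top_k)` stands in for whatever search backend is in use.
    """
    seen = set()
    merged: List[str] = []
    for sub_question in decompose_question(question):
        for chunk in retrieve(sub_question, top_k):
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged
```

The merged chunks would then be passed to the LLM together with the original question. A regex only covers the simplest phrasings; asking the LLM itself to produce the sub-questions is a common alternative.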

Describe implementation you've considered

No response

Documentation, adoption, use case

No response

Additional information

No response

KevinHuSh commented 5 months ago

Regarding item 2: could you share a sample file and the keywords you used for retrieval?
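For a reproduction of item 2, one possible approach is to list the chunks that contain the keyword verbatim but are missing from the search results. The Python sketch below is only an illustration: `keyword_recall_gaps` is a hypothetical helper, and it assumes the parsed chunks have been exported as a mapping from chunk ID to text and that the IDs returned by the search are known.

```python
from typing import Dict, Iterable, List


def keyword_recall_gaps(
    keyword: str,
    chunks: Dict[str, str],
    retrieved_ids: Iterable[str],
) -> List[str]:
    """Return IDs of chunks that contain `keyword` but were not retrieved.

    `chunks` maps chunk ID to chunk text (assumed export format);
    `retrieved_ids` are the chunk IDs the search actually returned.
    Matching is a case-insensitive substring test.
    """
    needle = keyword.casefold()
    retrieved = set(retrieved_ids)
    return [
        chunk_id
        for chunk_id, text in chunks.items()
        if needle in text.casefold() and chunk_id not in retrieved
    ]
```

A non-empty result, together with the sample file and the keyword, shows exactly which chunks the search failed to recall.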