labring / FastGPT

FastGPT is a knowledge-based platform built on LLMs. It offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you develop and deploy complex question-answering systems without extensive setup or configuration.
https://tryfastgpt.ai
Other
17.24k stars 4.61k forks

PDF text extraction error #621

Open dq7532183 opened 9 months ago

dq7532183 commented 9 months ago

Routine checks

Your version

Problem description: Text is extracted incorrectly from these PDF files: 班级管理中小学生良好行为习惯的培养策略_陈芳.pdf, 如何培养低年级学生课堂注意力_李淑英.pdf

Related screenshots: image

c121914yu commented 9 months ago

Bot detected the issue body's language is not English, translate it automatically.

Routine inspection

Your version

Problem description: PDF file text extraction error. Files: Class management strategies for cultivating good behavioral habits of primary and secondary school students_Chen Fang.pdf; How to cultivate the attention of lower grade students in class_Li Shuying.pdf

Related screenshots: image

c121914yu commented 9 months ago

Two-column text cannot be extracted at the moment.

dq7532183 commented 9 months ago

image

Most PDFs with two-column text extract successfully, such as this one. Some do not, and I don't know why.

4~6年级小学生冲突解决策略的发展特点及相关影响因素_李伟.pdf

c121914yu commented 9 months ago

Noting this for now; I'll look into it later.