opendatalab
/
PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
https://pdf-extract-kit.readthedocs.io/zh-cn/latest/index.html
GNU Affero General Public License v3.0
5.27k
stars
356
forks
issues
Question: when formulas and text in an image like this are not recognized, can parameters be adjusted to make recognition more accurate?
#118
StrGlee
opened
1 month ago
2
size mismatch for model
#117
H-0906
closed
1 month ago
5
Model and pre-trained model parameters do not match!!!
#116
MtrsChJG
closed
1 month ago
3
Font path fix
#115
jorgeolothar
opened
1 month ago
1
Refactoring of `pdf_extract.py` script
#114
AdevGarcia
opened
1 month ago
0
Problem with PDF recognition results in the online demo
#113
X17exe
closed
1 month ago
0
Body text extraction and table text extraction
#112
kendrickliu
opened
1 month ago
1
Table text LaTeX and paragraph order
#111
kendrickliu
opened
1 month ago
1
feat: add num-workers parameter
#110
jorgeolothar
closed
1 month ago
1
iopath version
#109
vadim0x60
opened
1 month ago
3
Download and storage location of paddleocr models for offline deployment
#108
724852499
closed
1 month ago
2
Table recognition weights
#107
Note-Liu
closed
2 months ago
0
Inference Accelerated PDF batch parsing
#106
veya2ztn
opened
2 months ago
1
Has any optimization been done for flowcharts, organizational charts, swimlane diagrams, and similar figures?
#105
shaoerkuai
opened
2 months ago
1
FileNotFoundError: [Errno 2] No such file or directory: 'models/MFD/weights.pt'
#104
jcui2001
closed
1 month ago
4
Unable to enable mps on the M3 Pro chip
#103
Zi-Gao
opened
2 months ago
0
feat: add layoutlmv3 export onnx
#102
Joker1212
closed
1 month ago
0
CUDA Version: 11.4, is there a way to resolve this?
#101
chenzebiaohub
opened
2 months ago
0
Fix the display issues in the Chinese document.
#100
peilongchencc
opened
2 months ago
0
Charisma
#98
ichcharisma
closed
2 months ago
0
Is there a web demo available to try?
#97
KungFuPandaPro
closed
2 months ago
1
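The reading order in the parsing results for two-column documents is wrong, and some content is missing. Can the reading order be improved?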
#96
Maple0709
opened
2 months ago
4
Execution fails with an Illegal instruction (core dumped) error
#95
724852499
opened
2 months ago
3
Error when execution reaches paddle ocr in a CPU environment
#94
Schumpeterx
closed
2 months ago
2
Error during execution
#93
lmolhw5252
opened
2 months ago
1
fix(ocr): Solve the issue of missing some lines and spans due to adhesion during OCR
#92
myhloli
closed
2 months ago
0
fix(ocr): Solve the issue of missing some lines and spans due to adhesion during OCR
#91
myhloli
closed
2 months ago
0
_pickle.UnpicklingError: invalid load key, 'v'. error
#90
724852499
opened
2 months ago
13
feat: add batch-size parameter and garbage collection
#89
jorgeolothar
closed
2 months ago
1
fix & refactor & docs:update ocr logic and installation guides
#88
myhloli
closed
2 months ago
0
How to output text in a human-readable order
#87
SidneyRey
closed
2 months ago
1
docs: update installation guides and requirements
#86
myhloli
closed
2 months ago
0
This error occurs: AttributeError: 'CustomMBartDecoder' object has no attribute 'embed_scale'
#85
SidneyRey
closed
2 months ago
1
Possible to get dataset?
#84
ajkdrag
closed
2 months ago
1
fix(extract_pdf): prevent excessive scaling of large images
#83
myhloli
closed
2 months ago
0
refactor(pdf_extract): remove hardcoded paste values in crop_img function
#82
myhloli
closed
2 months ago
0
Layout detection model inference takes a long time
#81
liujiachang
opened
2 months ago
7
refactor ocr and table recognition logic
#80
myhloli
closed
2 months ago
0
Table parsing takes a very long time, and the md file contains garbled characters
#79
TQC10
opened
2 months ago
1
paddlepaddle error
#77
shenjunyu57
closed
2 months ago
1
cudnn and cuda versions
#76
shenjunyu57
closed
2 months ago
1
About StructEqTable for table recognition
#75
cookieswolf
closed
2 months ago
3
No stack trace in paddle, may be caused by external reasons
#74
Pioneer-Weirdo
opened
3 months ago
1
Is there training code for the layout part, or is Microsoft's official code used?
#73
lawliet1777
closed
2 months ago
3
What advantages does LayoutLMv3-SFT have over paddleocr's layout model?
#72
ConleyKong
closed
3 months ago
1
How to convert the layout model pth to onnx
#71
qrsssh
opened
3 months ago
18
Refine Table Recognition Tutorial
#70
wangbinDL
closed
3 months ago
0
Running the table recognition DEMO reports the following error
#69
kevinwei1975
opened
3 months ago
2
Update Tutorial on Table Recognition
#68
sky-fly97
closed
3 months ago
0
Merge the latest changes from `main` into `dev`
#67
wangbinDL
closed
3 months ago
0