opendatalab/PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
Apache License 2.0
4.57k stars
302 forks
issues
Font path fix
#115
jorgeolothar
opened
7 hours ago
0
Refactoring of `pdf_extract.py` script
#114
AdevGarcia
opened
1 day ago
0
Issue with PDF recognition results in the online demo
#113
X17exe
closed
1 day ago
0
Body text extraction and table text extraction
#112
kendrickliu
opened
1 day ago
0
Table text LaTeX and paragraph order
#111
kendrickliu
opened
1 day ago
0
feat: add num-workers parameter
#110
jorgeolothar
opened
2 days ago
0
iopath version
#109
vadim0x60
opened
2 days ago
3
Download and storage location of PaddleOCR models for offline deployment
#108
724852499
closed
6 hours ago
2
Table recognition weights
#107
Note-Liu
closed
2 days ago
0
Inference Accelerated PDF batch parsing
#106
veya2ztn
opened
2 days ago
0
Has any optimization been done for flowcharts, organizational charts, swimlane diagrams, etc.?
#105
shaoerkuai
opened
2 days ago
0
FileNotFoundError: [Errno 2] No such file or directory: 'models/MFD/weights.pt'
#104
jcui2001
opened
3 days ago
1
Unable to enable MPS on the M3 Pro chip
#103
Zi-Gao
opened
3 days ago
0
feat: add layoutlmv3 export onnx
#102
Joker1212
opened
4 days ago
0
CUDA Version: 11.4: is there a solution?
#101
chenzebiaohub
opened
5 days ago
0
Fix the display issues in the Chinese document.
#100
peilongchencc
opened
1 week ago
0
Charisma
#98
ichcharisma
closed
1 week ago
0
Is there a web demo available to try?
#97
KungFuPandaPro
closed
1 week ago
1
Reading order is wrong in the parsing results for two-column documents, and some content is missing. Can the reading order be improved?
#96
Maple0709
opened
2 weeks ago
4
Execution fails with an Illegal instruction (core dumped) error
#95
724852499
opened
2 weeks ago
3
Error when execution reaches PaddleOCR in a CPU environment
#94
Schumpeterx
closed
1 week ago
2
Error during execution
#93
lmolhw5252
opened
2 weeks ago
1
fix(ocr): Solve the issue of missing some lines and spans due to adhesion during OCR
#92
myhloli
closed
1 week ago
0
fix(ocr): Solve the issue of missing some lines and spans due to adhesion during OCR
#91
myhloli
closed
2 weeks ago
0
_pickle.UnpicklingError: invalid load key, 'v'. error
#90
724852499
opened
3 weeks ago
13
feat: add batch-size parameter and garbage collection
#89
jorgeolothar
closed
1 week ago
1
fix & refactor & docs:update ocr logic and installation guides
#88
myhloli
closed
3 weeks ago
0
How to output text in a human-readable order
#87
SidneyRey
closed
2 weeks ago
1
docs: update installation guides and requirements
#86
myhloli
closed
3 weeks ago
0
Getting this error: AttributeError: 'CustomMBartDecoder' object has no attribute 'embed_scale'
#85
SidneyRey
closed
3 weeks ago
1
Possible to get the dataset?
#84
ajkdrag
closed
2 weeks ago
1
fix(extract_pdf): prevent excessive scaling of large images
#83
myhloli
closed
3 weeks ago
0
refactor(pdf_extract): remove hardcoded paste values in crop_img function
#82
myhloli
closed
3 weeks ago
0
Layout detection model inference takes a long time
#81
liujiachang
opened
4 weeks ago
7
refactor ocr and table recognition logic
#80
myhloli
closed
4 weeks ago
0
Table parsing takes a very long time, and the md file contains garbled characters
#79
TQC10
opened
4 weeks ago
1
paddlepaddle error
#77
shenjunyu57
closed
2 weeks ago
1
cuDNN and CUDA versions
#76
shenjunyu57
closed
2 weeks ago
1
About StructEqTable for table recognition
#75
cookieswolf
closed
2 weeks ago
3
No stack trace in paddle, may be caused by external reasons
#74
Pioneer-Weirdo
opened
1 month ago
1
Is there training code for the layout part, or is Microsoft's official code used?
#73
lawliet1777
closed
4 weeks ago
3
What advantages does LayoutLMv3-SFT have over PaddleOCR's layout model?
#72
ConleyKong
closed
1 month ago
1
How to convert the layout model's pth file to ONNX
#71
qrsssh
opened
1 month ago
18
Refine Table Recognition Tutorial
#70
wangbinDL
closed
1 month ago
0
Running the table recognition demo reports the following error
#69
kevinwei1975
opened
1 month ago
2
Update Tutorial on Table Recognition
#68
sky-fly97
closed
1 month ago
0
Merge the latest changes from `main` into `dev`
#67
wangbinDL
closed
1 month ago
0
Why not create an official Docker image?
#66
HANCHIE
closed
1 month ago
1
The layoutlmv3-base-chinese weights re-downloaded from Hugging Face are unusable; the naming is completely different....
#65
sheng49
closed
1 month ago
1
requests.exceptions.SSLError: HTTPSConnectionPool(host='api.github.com', port=443): Max retries exceeded with url: /repos/ultralytics/assets/releases/tags/v8.2.0 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)')))
#64
shuangdengmen
closed
2 weeks ago
1