🚀 Exciting update! We have created a demo for our paper on Hugging Face Spaces, showcasing the capabilities of our DocTr. Check it out here!
🔥 Good news! Our new work, DocTr++: Deep Unrestricted Document Image Rectification, is out. It can rectify various distorted document images in the wild.
🔥 Good news! Our new work, DocScanner: Robust Document Image Rectification with Progressive Learning, achieves state-of-the-art performance on the DocUNet Benchmark dataset. Its code is available in the Repo.
🔥 Good news! A comprehensive list of Awesome Document Image Rectification methods is available.
Geometric Representation Learning for Document Image Rectification
ECCV 2022
Any questions or discussions are welcome!
Put the pretrained model in $ROOT/model_pretrained/.
Put the distorted images in $ROOT/distorted/.
Run the following script to rectify them and output the rectified images in $ROOT/rec/:
python inference.py
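
Before running the script, it can help to confirm that the expected directories are in place. The following is a minimal sketch, assuming `root` is the repository root ($ROOT) and that inputs are PNG or JPEG files. The helper `check_layout` and the suffix list are hypothetical and not part of this repo.

```python
from pathlib import Path

# Directory names come from the paths mentioned in this README.
EXPECTED_DIRS = ("model_pretrained", "distorted")
# Assumed input formats; adjust to whatever your images actually use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def check_layout(root):
    """Return the sorted input images under root/distorted.

    Raises FileNotFoundError if an expected directory is missing.
    """
    root = Path(root)
    missing = [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    if missing:
        raise FileNotFoundError(f"missing under {root}: {', '.join(missing)}")
    # Create the output directory in case the inference script does not.
    (root / "rec").mkdir(exist_ok=True)
    return sorted(
        p for p in (root / "distorted").iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
```

Calling `check_layout(".")` from the repository root returns the list of images that would be rectified, which makes an empty or misplaced input folder easy to spot.
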
The MATLAB evaluation code for MS-SSIM and LD is provided at $ROOT/ssim_ld_eval_DocUNet.m and $ROOT/ssim_ld_eval_DIR300.m for the DocUNet and DIR300 Benchmark, respectively.
The index of the DocUNet Benchmark images used for our OCR evaluation is provided in $ROOT/ocr_img_DocUNet.txt (Setting 1, following DocTr). Please refer to DewarpNet for the index of the 25 documents (50 images) of the DocUNet Benchmark used for their OCR evaluation (Setting 2). We provide the OCR evaluation code at $ROOT/OCR_eval_DocUNet.py and $ROOT/OCR_eval_DIR300.py for the DocUNet and DIR300 Benchmark, respectively. The version of pytesseract is 0.3.8, and the version of Tesseract on Windows is 5.0.1.20220118.
Note that the calculated performance differs slightly across operating systems.

Benchmark Dataset | Method | MS-SSIM | LD | ED (Setting 1) | CER (Setting 1) | ED (Setting 2) | CER (Setting 2) |
---|---|---|---|---|---|---|---|
DocUNet | DocGeoNet | 0.5040 | 7.71 | 379.00 | 0.1509 | 713.94 | 0.1821 |

Benchmark Dataset | Method | MS-SSIM | LD | ED | CER |
---|---|---|---|---|---|
DIR300 | DocGeoNet | 0.6380 | 6.40 | 664.96 | 0.2189 |
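
The ED and CER columns compare the text recognized from a rectified image against the text recognized from its reference image. The following is a minimal sketch of the two metrics, assuming ED is the Levenshtein distance and CER is ED divided by the length of the reference text, which is a common definition. The function names are hypothetical, and the scripts in this repo may differ in details such as text normalization.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning one string into the other."""
    # previous[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,          # delete a reference character
                current[j - 1] + 1,       # insert a hypothesis character
                previous[j - 1] + cost,   # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by the reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference text must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, `edit_distance("kitten", "sitting")` is 3, and `character_error_rate("abcd", "abxd")` is 0.25. Lower values are better for both metrics.
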
If you find this code useful for your research, please use the following BibTeX entry.
@inproceedings{feng2022docgeonet,
title={Geometric Representation Learning for Document Image Rectification},
author={Feng, Hao and Zhou, Wengang and Deng, Jiajun and Wang, Yuechen and Li, Houqiang},
booktitle={Proceedings of the European Conference on Computer Vision},
year={2022}
}
@inproceedings{feng2021doctr,
title={DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction},
author={Feng, Hao and Wang, Yuechen and Zhou, Wengang and Deng, Jiajun and Li, Houqiang},
booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
pages={273--281},
year={2021}
}
@article{feng2021docscanner,
title={DocScanner: Robust Document Image Rectification with Progressive Learning},
author={Feng, Hao and Zhou, Wengang and Deng, Jiajun and Tian, Qi and Li, Houqiang},
journal={arXiv preprint arXiv:2110.14968},
year={2021}
}
The code is largely based on DocUNet and DewarpNet. Thanks for their wonderful work.
For commercial usage, please contact haof@mail.ustc.edu.cn.