fh2019ustc / DocScanner

The official repo for “DocScanner: Robust Document Image Rectification with Progressive Learning”.

🔥 2024.4.28: Good news! The code and pre-trained model of DocScanner are now released!

🚀 Good news! The online demo for DocScanner is now live, allowing for easy image upload and correction.

🔥 Good news! Our new work, DocTr++: Deep Unrestricted Document Image Rectification, is now out; it can rectify various distorted document images in the wild.

🔥 Good news! A comprehensive list of Awesome Document Image Rectification methods is available.

DocScanner

This is a PyTorch/GPU re-implementation of the paper DocScanner: Robust Document Image Rectification with Progressive Learning.


🚀 Demo (Link)

Note: The model version used in the demo corresponds to "DocScanner-L" as described in the paper.

  1. Upload the distorted document image to be rectified in the left box.
  2. Click the "Submit" button.
  3. The rectified image will be displayed in the right box.

Examples

image image

Training

Inference

  1. Put the pre-trained DocScanner-L model in $ROOT/model_pretrained/.
  2. Put the distorted images in $ROOT/distorted/.
  3. Run the script; the rectified images are saved in $ROOT/rectified/ by default.
    python inference.py
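
Before running the script, it can help to confirm that the directory layout from the steps above is in place. The following is a minimal sketch, not part of the repository: the helper names `check_layout` and `run_inference` and the set of accepted image extensions are assumptions made for illustration, and `inference.py` is simply invoked with no arguments, as in the command above.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: which file extensions count as input images is not stated
# in the README; adjust this set to match your data.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def check_layout(root):
    """Return a list of human-readable problems with the $ROOT layout."""
    root = Path(root)
    problems = []

    # Step 1: the pre-trained model is expected in $ROOT/model_pretrained/.
    model_dir = root / "model_pretrained"
    if not model_dir.is_dir() or not any(model_dir.iterdir()):
        problems.append("no pre-trained model found in model_pretrained/")

    # Step 2: the distorted input images are expected in $ROOT/distorted/.
    distorted_dir = root / "distorted"
    images = []
    if distorted_dir.is_dir():
        images = [p for p in distorted_dir.iterdir()
                  if p.suffix.lower() in IMAGE_SUFFIXES]
    if not images:
        problems.append("no input images found in distorted/")

    # Step 3: the script itself must be present in $ROOT.
    if not (root / "inference.py").is_file():
        problems.append("inference.py not found")

    return problems


def run_inference(root="."):
    """Hypothetical wrapper: validate the layout, then run inference.py."""
    problems = check_layout(root)
    if problems:
        raise RuntimeError("; ".join(problems))
    # Equivalent to running `python inference.py` from inside $ROOT.
    subprocess.run([sys.executable, "inference.py"], cwd=root, check=True)
```

Calling `run_inference("/path/to/DocScanner")` raises an error listing what is missing, or otherwise launches the script with its default settings.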

Evaluation

Method        MS-SSIM  LD    Li-D  ED (Setting 1)  CER (Setting 1)  ED (Setting 2)  CER (Setting 2)  Params (M)
DocScanner-T  0.5123   7.92  2.04  501.82          0.1823           809.46          0.2068           2.6
DocScanner-B  0.5134   7.62  1.88  434.11          0.1652           671.48          0.1789           5.2
DocScanner-L  0.5178   7.45  1.86  390.43          0.1486           632.34          0.1648           8.5
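
The ED and CER columns are OCR-based metrics. As a rough illustration of how they are commonly defined, the sketch below computes the Levenshtein edit distance between a reference string and a recognized string, and the character error rate as that distance divided by the reference length. This assumes the standard definitions; the OCR engine, the reference text, and the exact evaluation protocol behind the table are not described here, and the function names are chosen for this example only.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed reference prefix
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference text must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, `character_error_rate("abcd", "abxd")` gives 0.25, since one substitution is needed across four reference characters. Lower is better for both metrics.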

Citation

Please cite the related works in your publications if they help your research:

@inproceedings{feng2021doctr,
  title={DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction},
  author={Feng, Hao and Wang, Yuechen and Zhou, Wengang and Deng, Jiajun and Li, Houqiang},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  pages={273--281},
  year={2021}
}
@inproceedings{feng2022docgeonet,
  title={Geometric Representation Learning for Document Image Rectification},
  author={Feng, Hao and Zhou, Wengang and Deng, Jiajun and Wang, Yuechen and Li, Houqiang},
  booktitle={Proceedings of the European Conference on Computer Vision},
  year={2022}
}
@article{feng2021docscanner,
  title={DocScanner: robust document image rectification with progressive learning},
  author={Feng, Hao and Zhou, Wengang and Deng, Jiajun and Tian, Qi and Li, Houqiang},
  journal={arXiv preprint arXiv:2110.14968},
  year={2021}
}

Acknowledgement

The code is largely based on DocUNet and DewarpNet. Thanks to the authors for their wonderful work.

Contact

For commercial usage, please contact Professor Wengang Zhou (zhwg@ustc.edu.cn) and Hao Feng (haof@mail.ustc.edu.cn).