Canjie-Luo / MORAN_v2

MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition
MIT License

Python 2.7
Python 3.6

MORAN is a network with a rectification mechanism for general scene text recognition. The paper was accepted by Pattern Recognition (2019); it is on arXiv, and the final version is now available.

Here is a brief introduction in Chinese.

Recent Update

Improvements of MORAN v2:

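| Version                         | IIIT5K | SVT  | IC03 | IC13 | SVT-P | CUTE80 | IC15 (1811) | IC15 (2077) |
|---------------------------------|--------|------|------|------|-------|--------|-------------|-------------|
| MORAN v1 (curriculum training)* | 91.2   | 88.3 | 95.0 | 92.4 | 76.1  | 77.4   | 74.7        | 68.8        |
| MORAN v2 (one-stage training)   | 93.4   | 88.3 | 94.2 | 93.2 | 79.7  | 81.9   | 77.8        | 73.9        |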

*The results of v1 were reported in our paper. If this project is helpful for your research, please cite our Pattern Recognition paper.
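
To make the comparison in the table easier to read, the per-benchmark change from v1 to v2 can be computed directly from the numbers above. This is only a small arithmetic sketch; the variable and function names are illustrative and are not part of the MORAN code base.

```python
# Scores copied from the results table above (v1 from the paper, v2 from this repo).
BENCHMARKS = ["IIIT5K", "SVT", "IC03", "IC13", "SVT-P", "CUTE80",
              "IC15 (1811)", "IC15 (2077)"]
V1 = [91.2, 88.3, 95.0, 92.4, 76.1, 77.4, 74.7, 68.8]
V2 = [93.4, 88.3, 94.2, 93.2, 79.7, 81.9, 77.8, 73.9]


def score_deltas(old, new):
    """Return new - old for each benchmark, rounded to one decimal place."""
    return [round(n - o, 1) for o, n in zip(old, new)]


deltas = dict(zip(BENCHMARKS, score_deltas(V1, V2)))

if __name__ == "__main__":
    for name, delta in deltas.items():
        # Print a signed difference, e.g. "+2.2" or "-0.8".
        print("%-12s %+.1f" % (name, delta))
```

On these numbers, v2 matches v1 on SVT, is 0.8 lower on IC03, and is higher on the remaining six benchmarks.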

Requirements

(Contributions to MORAN are welcome.)

We recommend using Anaconda to manage your libraries.

Alternatively, install the libraries with pip. (The torch version installed this way may differ from the Anaconda version. Please check carefully and fix any warnings that appear during training if necessary.)

    pip install -r requirements.txt

Data Preparation

Please convert your own dataset to LMDB format using the tool (run it in Python 2.7) provided by @Baoguang Shi.

You can also download the training (NIPS 2014, CVPR 2016) and testing datasets prepared by us.

The raw images of the testing datasets can be found here.
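
For readers who want to see what an LMDB conversion roughly involves, here is a minimal Python 3 sketch. It is not the tool mentioned above. The key layout (`image-%09d`, `label-%09d`, `num-samples`, with 1-based indices) is an assumption based on the convention commonly used by CRNN-style dataset tools, so please verify it against the actual tool before relying on it. The function names and the `map_size` default are hypothetical, and writing requires the third-party `lmdb` package.

```python
def build_lmdb_cache(samples):
    """Build the key/value pairs for an LMDB text-recognition dataset.

    samples: iterable of (image_bytes, label_str) pairs, where image_bytes
    is the raw encoded image file content (e.g. JPEG or PNG bytes).
    Key layout is an ASSUMPTION (CRNN-style convention), 1-based indices.
    """
    cache = {}
    count = 0
    for image_bytes, label in samples:
        count += 1
        cache[("image-%09d" % count).encode()] = image_bytes
        cache[("label-%09d" % count).encode()] = label.encode("utf-8")
    # Total number of samples, stored as an ASCII string.
    cache[b"num-samples"] = str(count).encode()
    return cache


def write_lmdb(output_path, cache, map_size=1 << 30):
    """Write the cache to an LMDB environment at output_path."""
    import lmdb  # third-party package: pip install lmdb

    env = lmdb.open(output_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in cache.items():
            txn.put(key, value)
    env.close()
```

A typical use would be to read each image file in binary mode, pair it with its label string, pass the pairs to `build_lmdb_cache`, and then call `write_lmdb` with the output folder.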

Training and Testing

Set the dataset folder paths in train_MORAN.sh:

    --train_nips path_to_dataset \
    --train_cvpr path_to_dataset \
    --valroot path_to_dataset \

Then start training (decrease the learning rate manually as needed for your task):

    sh train_MORAN.sh
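
How the learning rate is configured in this repo is not described here, so the following is only a generic sketch of what manually decreasing a learning rate can look like for a PyTorch-style optimizer. It relies on the `param_groups` list (each entry a dict with an `"lr"` key) that PyTorch optimizers expose. The function name and the stand-in optimizer class are hypothetical and exist only so the example runs without PyTorch installed.

```python
def scale_learning_rate(optimizer, factor):
    """Multiply the learning rate of every parameter group by `factor`.

    Works with any object exposing `param_groups` as a list of dicts
    with an "lr" key, as PyTorch optimizers do. Returns the new rates.
    """
    new_rates = []
    for group in optimizer.param_groups:
        group["lr"] = group["lr"] * factor
        new_rates.append(group["lr"])
    return new_rates


class StandInOptimizer:
    """Hypothetical stand-in that mimics the `param_groups` attribute."""

    def __init__(self, lr):
        self.param_groups = [{"lr": lr}]


if __name__ == "__main__":
    optimizer = StandInOptimizer(lr=1.0)
    # Reduce the learning rate by a factor of 10, e.g. when accuracy plateaus.
    print(scale_learning_rate(optimizer, 0.1))
```

With a real PyTorch optimizer, the same function can be called between epochs once validation accuracy stops improving.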

Demo

Download the model parameter file demo.pth.

Place it in the repository root folder, then run demo.py to see more visualizations.

    python demo.py

Citation

@article{cluo2019moran,
  author    = {Canjie Luo and Lianwen Jin and Zenghui Sun},
  title     = {MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition},
  journal   = {Pattern Recognition}, 
  volume    = {90}, 
  pages     = {109--118},
  year      = {2019},
  publisher = {Elsevier}
}

Acknowledgment

This repo is built on @Jieru Mei's crnn.pytorch and @marvis' ocr_attention. Thanks for their contributions.

Attention

This project is free for academic research purposes only.