A synthetic benchmark database for scene text removal has been released by the Deep Learning and Vision Computing Lab of South China University of Technology. The database can be downloaded through the following links:
Google Drive: (link: https://drive.google.com/open?id=1l_yJm1vWV7TF7vDcaVa7FqZLfW7ASYeo) (size: 6.3 GB).
In addition, to enlarge the real data, we collected 1,000 images from the English-only subset of ICDAR 2017 MLT; the background (label) images were generated by manually erasing the text. The database can be downloaded through the following links:
The training set of the synthetic database consists of 8,000 images and the test set contains 800 images; all training and test samples are resized to 512 × 512. The code for generating the synthetic dataset, along with more synthetic text images, is described in "Ankush Gupta, Andrea Vedaldi, Andrew Zisserman. Synthetic Data for Text Localisation in Natural Images. CVPR 2016" and can be found at https://github.com/ankush-me/SynthText. All real scene text images are also resized to 512 × 512.
For more details, please refer to our AAAI 2019 paper. arXiv: http://arxiv.org/abs/1812.00723
git clone https://github.com/HCIILAB/Scene-Text-Removal
You can refer to the example provided in the repository for how to organize the data.
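For illustration, one possible layout (hypothetical — follow the example shipped with the repository for the exact structure) keeps the input text images separate from their text-erased label images:

```shell
# Hypothetical directory layout: scene text images and their erased-background
# labels, split into training and test sets. Mirror the repository's example
# layout in practice.
mkdir -p data/train/images data/train/labels
mkdir -p data/test/images data/test/labels
```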
To train our model, you may need to change the path of the dataset or the parameters of the network, etc. Then run the following code:
python train.py \
--trainset_path=[path of the training dataset] \
--checkpoint=[path to save the model] \
--gpu=[GPU id to use] \
--lr=[learning rate] \
--n_epoch=[number of epochs]
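As a concrete illustration, a training run might look like the following; all paths and hyperparameter values here are hypothetical placeholders, not the settings used in the paper:

```shell
# Hypothetical example values — substitute your own dataset path,
# checkpoint directory, GPU id, learning rate, and epoch count.
python train.py \
  --trainset_path=./data/train \
  --checkpoint=./checkpoints \
  --gpu=0 \
  --lr=0.0002 \
  --n_epoch=500
```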
To output the generated results for given inputs, you can use test.py. Please run the following code:
python test.py \
--test_image=[path of the test images] \
--model=[model checkpoint to test] \
--vis=[whether to visualize images] \
--result=[path to save the output images]
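For example, an inference run might look like this; the paths and checkpoint filename are hypothetical placeholders:

```shell
# Hypothetical example values — point --model at a checkpoint saved by train.py.
python test.py \
  --test_image=./data/test/images \
  --model=./checkpoints/model_final.params \
  --vis=True \
  --result=./results
```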
To evaluate the model performance over a dataset, you can find the evaluation metrics in PythonCode.zip on this website.
Please download the ImageNet pretrained vgg16 model (PASSWORD: 8tof) and put it under
root/.mxnet/models/
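For instance, assuming MXNet's default model cache location under the home directory, the pretrained weights can be placed like this (the parameter filename is illustrative):

```shell
# Create MXNet's default model cache directory if it does not exist yet.
mkdir -p ~/.mxnet/models
# Then copy the downloaded vgg16 parameter file into it, e.g.:
# cp vgg16-0000.params ~/.mxnet/models/
```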
Please consider citing our paper when you use our database:
@article{zhang2019EnsNet,
  title   = {EnsNet: Ensconce Text in the Wild},
  author  = {Shuaitao Zhang and Yuliang Liu and Lianwen Jin and Yaoxiong Huang and Songxuan Lai},
  journal = {AAAI},
  year    = {2019}
}
Suggestions and opinions on this dataset (both positive and negative) are greatly welcomed. Please contact the authors by sending an email to eestzhang@mail.scut.edu.cn.
The synthetic database can only be used for non-commercial research purposes.
For commercial purpose usage, please contact Dr. Lianwen Jin: lianwen.jin@gmail.com.
Copyright 2018, Deep Learning and Vision Computing Lab, South China University of Technology. http://www.dlvc-lab.net