lanfeng4659 / STR-TDSL


Scene Text Retrieval via Joint Text Detection and Similarity Learning (CVPR2021)

This is the code for "Scene Text Retrieval via Joint Text Detection and Similarity Learning". For more details, please refer to our CVPR2021 paper.


This repo is built on maskrcnn-benchmark and follows the same license.

Chinese Street View Text Retrieval Dataset (CSVTR)

CSVTR consists of 23 pre-defined Chinese query words and 1667 Chinese scene text images collected from the Google image search engine. Each image is annotated with the query word, among the 23, that it corresponds to.

CSVTR can be downloaded from Baidu Disk (asjw) or Google Drive.

Trained models

The trained models can be downloaded from Baidu Disk (legq). (These models support English only.)

Evaluation

1. Prepare datasets

An example of the path of test images: ./datasets/IIIT_STR_V1.0/imgDatabase/img_000846.jpg
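Before running the evaluation, it can help to confirm that the test images sit where the example path suggests. The following is a minimal sketch and not part of this repository: the helper name `list_test_images` and the `*.jpg` filter are assumptions taken from the example path above, not values read from the repo's configuration.

```python
from pathlib import Path


def list_test_images(root, pattern="*.jpg"):
    """Return the sorted image paths found directly under `root`.

    `root` and `pattern` are illustrative assumptions based on the
    example path in this README (hypothetical helper, not repo code).
    """
    root = Path(root)
    if not root.is_dir():
        # Fail early with a clear message instead of silently finding nothing.
        raise FileNotFoundError(f"dataset directory not found: {root}")
    return sorted(root.glob(pattern))
```

For example, `list_test_images("./datasets/IIIT_STR_V1.0/imgDatabase")` should return a non-empty list if the dataset was unpacked as shown above.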

2. Evaluate

Run sh tools/test.sh

Training code (ToDo)

Other datasets

CTR can be downloaded from Baidu Disk (e860).

MLT-5k: This dataset is a subset of MLT2017 (or MLT2019). Please refer to the code for extracting this subset. Place the original dataset in the directory datasets/MLT2019, for example:

img path: ./datasets/MLT2019/train_images/tr_img_10000.jpg

gt path: ./datasets/MLT2019/train_gts/tr_img_10000.txt
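The two example paths above suggest that each image and its ground-truth file share a file stem and differ only in directory and extension. A minimal sketch of that pairing, assuming this layout holds for every training image (the helper `gt_path_for` is hypothetical and not part of this repository):

```python
from pathlib import Path


def gt_path_for(img_path, root="./datasets/MLT2019"):
    """Map an MLT2019 training image path to its ground-truth path.

    Assumes the layout shown in this README: images in `train_images/`,
    annotations in `train_gts/`, both sharing the same file stem.
    """
    stem = Path(img_path).stem  # e.g. "tr_img_10000"
    return Path(root) / "train_gts" / f"{stem}.txt"
```

Calling `gt_path_for("./datasets/MLT2019/train_images/tr_img_10000.jpg")` yields the path of `tr_img_10000.txt` under `train_gts`, matching the gt path listed above.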

Citing the related works

Please cite the related works in your publications if they help your research:

@InProceedings{Wang_2021_CVPR,  
  author    = {Wang, Hao and Bai, Xiang and Yang, Mingkun and Zhu, Shenggao and Wang, Jing and Liu, Wenyu},  
  title     = {Scene Text Retrieval via Joint Text Detection and Similarity Learning},  
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},  
  month     = {June},  
  year      = {2021},  
  pages     = {4558-4567}  
}