Recommended: TextBoxes++ is an extension of TextBoxes that supports oriented scene text detection. The recognition part is also included in TextBoxes++.
This paper presents an end-to-end trainable fast scene text detector, named TextBoxes, which detects scene text with both high accuracy and efficiency in a single network forward pass, involving no post-processing except for a standard non-maximum suppression. For more details, please refer to our paper.
Please cite TextBoxes in your publications if it helps your research:
@inproceedings{LiaoSBWL17,
author = {Minghui Liao and
Baoguang Shi and
Xiang Bai and
Xinggang Wang and
Wenyu Liu},
title = {TextBoxes: {A} Fast Text Detector with a Single Deep Neural Network},
booktitle = {AAAI},
year = {2017}
}
Get the code. We will refer to the directory that you cloned Caffe into as $CAFFE_ROOT.
git clone https://github.com/MhLiao/TextBoxes.git
cd TextBoxes
make -j8
make py
A reference XML annotation file looks like the following:
<?xml version="1.0" encoding="utf-8"?>
<annotation>
  <object>
    <name>text</name>
    <bndbox>
      <xmin>158</xmin>
      <ymin>128</ymin>
      <xmax>411</xmax>
      <ymax>181</ymax>
    </bndbox>
  </object>
  <object>
    <name>text</name>
    <bndbox>
      <xmin>443</xmin>
      <ymin>128</ymin>
      <xmax>501</xmax>
      <ymax>169</ymax>
    </bndbox>
  </object>
  <folder></folder>
  <filename>100.jpg</filename>
  <size>
    <width>640</width>
    <height>480</height>
    <depth>3</depth>
  </size>
</annotation>
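To check that your own annotation files follow this layout, a minimal Python sketch along the following lines may help. It is not part of the TextBoxes code base: the function name `read_text_boxes` and the inline `SAMPLE` string are hypothetical, and the sketch assumes every `<object>` contains a `<bndbox>` with integer `xmin`, `ymin`, `xmax`, and `ymax` values, as in the reference file above.

```python
import xml.etree.ElementTree as ET


def read_text_boxes(xml_text):
    """Parse an annotation string (hypothetical helper, not a TextBoxes API).

    Returns (filename, (width, height), boxes), where boxes is a list of
    (name, (xmin, ymin, xmax, ymax)) tuples, one per <object> element.
    """
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")

    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))

    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        # Read the four corner coordinates in a fixed order.
        coords = tuple(
            int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"), coords))
    return filename, (width, height), boxes


# A shortened sample in the same layout as the reference file.
SAMPLE = """
<annotation>
  <object>
    <name>text</name>
    <bndbox>
      <xmin>158</xmin><ymin>128</ymin><xmax>411</xmax><ymax>181</ymax>
    </bndbox>
  </object>
  <filename>100.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
</annotation>
"""

if __name__ == "__main__":
    name, (w, h), boxes = read_text_boxes(SAMPLE)
    print(name, w, h)
    for label, (xmin, ymin, xmax, ymax) in boxes:
        # Flag boxes that are empty or fall outside the image bounds.
        ok = 0 <= xmin < xmax <= w and 0 <= ymin < ymax <= h
        print(label, (xmin, ymin, xmax, ymax), "ok" if ok else "out of bounds")
```

To run the same check on a file on disk, read its contents first and pass them to `read_text_boxes`.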
Please let me know if you encounter any issues.