FOTS: Fast Oriented Text Spotting with a Unified Network

https://arxiv.org/pdf/1801.01671.pdf

Incidental scene text spotting is considered one of the most difficult and valuable challenges in the document analysis community. Most existing methods treat text detection and recognition as separate tasks. In this work, we propose a unified end-to-end trainable Fast Oriented Text Spotting (FOTS) network for simultaneous detection and recognition, sharing computation and visual information between the two complementary tasks. Specifically, RoIRotate is introduced to share convolutional features between detection and recognition. Thanks to this convolution-sharing strategy, FOTS adds little computational overhead compared to a baseline text detection network, and the joint training method learns more generic features, so our method performs better than two-stage methods. Experiments on the ICDAR 2015, ICDAR 2017 MLT, and ICDAR 2013 datasets demonstrate that the proposed method significantly outperforms state-of-the-art methods. This further allows us to develop the first real-time oriented text spotting system, which surpasses all previous state-of-the-art results by more than 5% on the ICDAR 2015 text spotting task while running at 22.6 fps.
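
The abstract names RoIRotate but does not describe its mechanics. As a rough illustration only, the sketch below assumes it amounts to resampling each oriented text box from the shared feature map into an axis-aligned, fixed-height patch using bilinear interpolation, so that a recognition branch can read it left to right. The function name `roi_rotate`, its parameters, and the default output height of 8 are hypothetical choices for this example, not the authors' code.

```python
import numpy as np

def roi_rotate(features, center, size, angle, out_h=8):
    """Resample one oriented box from a (C, H, W) feature map.

    center = (cx, cy) and size = (w, h) are in feature-map pixels.
    angle is in radians, measured from the +x axis toward the +y axis.
    Returns an axis-aligned patch of shape (C, out_h, out_w).
    """
    C, H, W = features.shape
    cx, cy = center
    w, h = size
    # Fixed output height; width follows the box's aspect ratio.
    out_w = max(1, int(round(out_h * w / h)))

    # Sample positions in box-local coordinates, centred on the box.
    us = ((np.arange(out_w) + 0.5) / out_w - 0.5) * w
    vs = ((np.arange(out_h) + 0.5) / out_h - 0.5) * h
    u, v = np.meshgrid(us, vs)  # each of shape (out_h, out_w)

    # Rotate and translate the local grid into feature-map coordinates.
    cos, sin = np.cos(angle), np.sin(angle)
    x = cx + u * cos - v * sin
    y = cy + u * sin + v * cos

    # Samples that fall outside the map are clamped to the border.
    x = np.clip(x, 0, W - 1)
    y = np.clip(y, 0, H - 1)

    # Bilinear interpolation between the four neighbouring pixels.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    fx = x - x0
    fy = y - y0
    top = features[:, y0, x0] * (1 - fx) + features[:, y0, x1] * fx
    bottom = features[:, y1, x0] * (1 - fx) + features[:, y1, x1] * fx
    return top * (1 - fy) + bottom * fy

# Example: a 24x6 box tilted by 15 degrees on a 32-channel feature map.
feats = np.random.rand(32, 40, 60)
patch = roi_rotate(feats, center=(30.0, 20.0), size=(24.0, 6.0),
                   angle=np.deg2rad(15))
print(patch.shape)  # (32, 8, 32)
```

Because the patch is cut from features the detection branch has already computed, recognition in a design like this does not need to re-run a backbone on each cropped word image, which is consistent with the low overhead the abstract claims.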
