fengxinjie / Transformer-OCR

MIT License

You should carefully read the SVTP images. #5

Open delveintodetail opened 4 years ago

delveintodetail commented 4 years ago

SVTP has 645 images. If you read each one carefully, you will find around 20 that humans cannot recognize. How can this algorithm reach 98.6%?

When you see such a surprising improvement over the SOTA, you should carefully check your code.
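
For reference, the arithmetic behind this concern can be sketched as follows. This is only a rough illustration: the count of 20 unreadable images is the estimate given above, not a measured value, and the function names are made up for this example.

```python
# Figures taken from the comment above; "around 20" is an estimate.
TOTAL_IMAGES = 645
REPORTED_ACCURACY = 0.986
UNREADABLE_ESTIMATE = 20


def implied_errors(total: int, accuracy: float) -> int:
    """Number of misrecognized images implied by a word-accuracy figure."""
    return total - round(total * accuracy)


def accuracy_if_unreadable_missed(total: int, unreadable: int) -> float:
    """Accuracy if every unreadable image is missed and all others are correct."""
    return (total - unreadable) / total


errors = implied_errors(TOTAL_IMAGES, REPORTED_ACCURACY)
reference = accuracy_if_unreadable_missed(TOTAL_IMAGES, UNREADABLE_ESTIMATE)

print(f"errors implied by 98.6%: {errors}")        # 9
print(f"accuracy with 20 misses: {reference:.1%}")  # 96.9%
```

A reported 98.6% leaves room for only about 9 errors, fewer than the estimated 20 unreadable images. Treating every unreadable image as a guaranteed miss is an assumption: a model could still match the ground-truth label on some of them, so 96.9% is a rough reference point rather than a hard bound.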

fengxinjie commented 4 years ago

Thank you for the reminder. There may be something wrong with my code, and I'll check it carefully when I can. I am busy with work at the moment and short on time, so there may be a delay. Sorry.

delveintodetail commented 4 years ago

Several papers apply the Transformer to OCR:

  1. NRTR: A no-recurrence sequence-to-sequence model for scene text recognition
  2. MASTER: Multi-Aspect Non-local Network for Scene Text Recognition
  3. A Simple and Strong Convolutional-Attention Network for Scene Text Recognition

fengxinjie commented 4 years ago

Thanks very much. I will read them later.