[2021] Vision Transformer for Fast and Efficient Scene Text Recognition

osuossu8 / paper-reading


Open osuossu8 opened 1 year ago

osuossu8 commented 1 year ago

Training uses synthetic data:

MJSynth (MJ)
- 8.9M
- 1,400 different fonts
SynthText (ST)
- 5.5M
In the STR framework, each dataset contributes 50% of the total training set. Using 100% of both datasets combined led to worse performance.
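
A minimal sketch of the 50/50 mix, assuming the ratio is applied per batch (the notes above do not say whether it is per batch or over the whole training set). `balanced_batch` and its arguments are hypothetical names for illustration, not from the paper or its code:

```python
import random


def balanced_batch(mj_samples, st_samples, batch_size, rng=None):
    """Draw a batch with half of its items from MJ and half from ST.

    Hypothetical helper: sampling is with replacement, so the smaller
    dataset (ST) is simply revisited more often than the larger one (MJ).
    """
    rng = rng or random.Random()
    n_mj = batch_size // 2          # half of the batch from MJSynth
    n_st = batch_size - n_mj        # the remainder from SynthText
    batch = [("MJ", s) for s in rng.choices(mj_samples, k=n_mj)]
    batch += [("ST", s) for s in rng.choices(st_samples, k=n_st)]
    rng.shuffle(batch)              # avoid a fixed MJ-then-ST order
    return batch


# Toy stand-ins for the two datasets (real entries would be image/label pairs).
mj = [f"mj_{i}" for i in range(89)]
st = [f"st_{i}" for i in range(55)]
batch = balanced_batch(mj, st, batch_size=8, rng=random.Random(0))
```

With this scheme each source fills half of every batch regardless of how large MJ and ST are relative to each other.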
