Open osuossu8 opened 1 year ago
In the second stage, we build two relatively small datasets corresponding to printed and handwritten downstream tasks, containing millions of textline images each.
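As a rough illustration of the two-stage setup quoted above, here is a minimal sketch of the pre-training schedule as plain data. The class `PretrainStage`, the function `build_schedule`, the stage names and the data descriptions are all hypothetical. That each stage-2 run starts from the stage-1 result, and that stage 1 uses a larger dataset, are assumptions not stated in the quoted sentence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PretrainStage:
    """One pre-training run (hypothetical structure, for illustration only)."""
    name: str
    data: str
    init_from: Optional[str]  # name of the stage whose weights are reused, if any


def build_schedule() -> List[PretrainStage]:
    # Stage 1: assumed to be a single run on a larger, general textline dataset.
    stage1 = PretrainStage("stage1", "general textline images", None)
    # Stage 2: two relatively small datasets, one per downstream task.
    # Assumption: each stage-2 run is initialized from the stage-1 result.
    stage2 = [
        PretrainStage("stage2-printed", "printed textline images", stage1.name),
        PretrainStage("stage2-handwritten", "handwritten textline images", stage1.name),
    ]
    return [stage1, *stage2]


schedule = build_schedule()
for stage in schedule:
    print(f"{stage.name}: data={stage.data!r}, init_from={stage.init_from}")
```

The point of the sketch is only the shape of the schedule: one shared first stage, followed by two separate task-specific runs.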
Pre-training Dataset
stage 1
stage 2
Benchmark
The SROIE (Scanned Receipts OCR and Information Extraction) dataset (Task 2) focuses on text recognition in receipt images.
The IAM Handwriting Database
Recognizing scene text images
https://arxiv.org/pdf/2109.10282.pdf