jaberg / skdata

Data sets for machine learning in Python
http://jaberg.github.com/skdata/

Wanted: Robust OCR dataset (Photo OCR) ICDAR 2003 / 2011 #18

Open npinto opened 12 years ago

npinto commented 12 years ago

http://yaroslavvb.blogspot.com/2009/08/new-robust-ocr-dataset.html

http://yaroslavvb.com/bib_digits_dataset.tar.gz

npinto commented 12 years ago

Related? http://algoval.essex.ac.uk/icdar/Datasets.html

npinto commented 12 years ago

ICDAR 2003 is mentioned in:

Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning. Adam Coates, Blake Carpenter, Carl Case, Sanjeev Satheesh, Bipin Suresh, Tao Wang, and Andrew Y. Ng. In Proceedings of the 11th International Conference on Document Analysis and Recognition (ICDAR 2011), 2011. http://www.cs.stanford.edu/people/ang/papers/icdar01-TextRecognitionUnsupervisedFeatureLearning.pdf

npinto commented 12 years ago

ICDAR 2011: http://robustreading.opendfki.de

http://robustreading.opendfki.de/wiki/SceneText

http://www.cvc.uab.es/icdar2011competition/

npinto commented 12 years ago

Google Docs API for OCR (for comparison): http://code.google.com/apis/documents/docs/3.0/developers_guide_protocol.html#OCR

chiendo1010 commented 7 years ago

Thank you so much!