miller-center / cpc-issues

Connecting Presidential Collections

Explore the state of handwriting recognition and OCR #21

Open waldoj opened 10 years ago

waldoj commented 10 years ago

The 2012 International Conference on Frontiers in Handwriting Recognition (ICFHR) published a report on the state of the art that I think will be helpful. They have articles on word spotting, word segmentation, character classification, and more.

waldoj commented 10 years ago

Added challenge: nonstandard period spelling.

waldoj commented 10 years ago

I highly recommend "Text Recognition in Printed Historical Documents," by Twan van Laarhoven. In it he describes his theoretical OCR system, "The Emmius OCR System," which has components that we might do well to implement.

waldoj commented 10 years ago

Via the folks at NYPL comes "Tandem HMM with convolutional neural network for handwritten word recognition," published in May, which represents a real breakthrough.

sblackford commented 10 years ago

This is a document from an OCR Summit Meeting (http://idhmc.tamu.edu/ocr-summit-meeting/) that includes a list of participants (http://idhmc.tamu.edu/commentpress/participants/), which might come in handy.

sblackford commented 10 years ago

The British Library has an interesting project that we should follow. More in their blog entry: http://britishlibrary.typepad.co.uk/digital-scholarship/2013/12/a-million-first-steps.html