Status: Open. Opened by altomator 7 years ago.
Description
IIIF exposes OCR text as annotations on images. However, OCR systems generally produce text with a structure at the character, word, line, and paragraph levels.
--> I would like to be able to get the OCR text at a specific level.
Example: a marginal note (http://gallica.bnf.fr/ark:/12148/bpt6k96006893/f20) is recognized by the OCR as 2 paragraphs:
http://gallica.bnf.fr/iiif/ark:/12148/bpt6k96006893/f20/529,1076,287,203/full/0/native.jpg
http://gallica.bnf.fr/iiif/ark:/12148/bpt6k96006893/f20/526,1281,287,123/full/0/native.jpg
Proposed Solutions
Give the user the ability to choose the granularity of the OCR text.
Additional Background
ALTO and IIIF ongoing work.
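To make the request concrete, here is a minimal sketch of choosing a granularity, assuming the OCR is available as ALTO XML with coordinates expressed in pixels. The ALTO fragment, the `ocr_at_level` and `region_url` helpers, and the level names are hypothetical illustrations, not part of any IIIF or Gallica API. Only the ALTO element names (`TextBlock`, `TextLine`, `String`) and the region URL pattern from the example are taken as given.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written ALTO fragment (not real Gallica output).
ALTO_SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace>
    <TextBlock ID="b1" HPOS="529" VPOS="1076" WIDTH="287" HEIGHT="203">
      <TextLine ID="l1" HPOS="529" VPOS="1076" WIDTH="287" HEIGHT="40">
        <String CONTENT="Marginal" HPOS="529" VPOS="1076" WIDTH="150" HEIGHT="40"/>
        <SP/>
        <String CONTENT="note" HPOS="690" VPOS="1076" WIDTH="80" HEIGHT="40"/>
      </TextLine>
    </TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""

# Assumed mapping from a user-facing level name to an ALTO element name.
LEVELS = {"paragraph": "TextBlock", "line": "TextLine", "word": "String"}


def local_name(tag):
    """Strip the XML namespace so any ALTO schema version matches."""
    return tag.rsplit("}", 1)[-1]


def text_of(elem):
    """Join the CONTENT of every String at or below this element."""
    words = [e.get("CONTENT", "") for e in elem.iter() if local_name(e.tag) == "String"]
    return " ".join(words)


def ocr_at_level(alto_xml, level):
    """Return a list of ((x, y, w, h), text) for the requested level."""
    wanted = LEVELS[level]
    results = []
    for elem in ET.fromstring(alto_xml).iter():
        if local_name(elem.tag) == wanted:
            box = tuple(int(float(elem.get(k))) for k in ("HPOS", "VPOS", "WIDTH", "HEIGHT"))
            results.append((box, text_of(elem)))
    return results


def region_url(image_base, box):
    """Build an image region URL following the pattern used in the example."""
    return f"{image_base}/{','.join(str(v) for v in box)}/full/0/native.jpg"
```

Calling `ocr_at_level(ALTO_SAMPLE, "word")` instead of `"paragraph"` returns one entry per `String` element, which is the kind of choice the proposal asks for. If the ALTO file declares a measurement unit other than pixels, the coordinates would need converting before being used as an image region.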
If the solution is to let the user pick, then this seems solved? Make layers, each with lots of annotations, and put a description/label on each saying which level it is?
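A rough sketch of the layer idea, assuming IIIF Presentation API 2.x conventions (`sc:Layer`, `sc:AnnotationList`, `oa:Annotation`). The `example.org` URIs, the `build_layers` helper, and the placeholder texts are made up for illustration; the two boxes are the paragraph regions from the example in the description.

```python
def build_layers(base_uri, canvas_uri, regions_by_level):
    """Return (layers, annotation_lists): one layer and one list per OCR level."""
    layers, lists = [], []
    for level, regions in regions_by_level.items():
        layer_id = f"{base_uri}/layer/{level}"
        list_id = f"{base_uri}/list/{level}"
        # The label tells a client (and the user) which granularity this layer holds.
        layers.append({
            "@id": layer_id,
            "@type": "sc:Layer",
            "label": f"OCR text ({level} level)",
            "otherContent": [list_id],
        })
        lists.append({
            "@id": list_id,
            "@type": "sc:AnnotationList",
            "within": layer_id,
            "resources": [
                {
                    "@type": "oa:Annotation",
                    "motivation": "sc:painting",
                    "resource": {
                        "@type": "cnt:ContentAsText",
                        "format": "text/plain",
                        "chars": text,
                    },
                    # Target the region of the canvas covered by this text.
                    "on": f"{canvas_uri}#xywh={x},{y},{w},{h}",
                }
                for (x, y, w, h), text in regions
            ],
        })
    return layers, lists


# Hypothetical input: the two paragraph boxes from the example, with placeholder text.
regions_by_level = {
    "paragraph": [
        ((529, 1076, 287, 203), "(text of first paragraph)"),
        ((526, 1281, 287, 123), "(text of second paragraph)"),
    ],
    "line": [],  # would hold one entry per OCR line
}

layers, lists = build_layers(
    "https://example.org/iiif/book",
    "https://example.org/iiif/book/canvas/f20",
    regions_by_level,
)
```

A viewer could then list the layer labels and let the user switch between them, loading only the annotation list for the chosen level.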
related to https://github.com/IIIF/iiif.io/issues/764