IIIF / iiif-stories

Community repository for documenting stories and use cases related to uses of the International Image Interoperability Framework.

I would like to access the OCR text at a specific granularity #77

Open altomator opened 7 years ago

altomator commented 7 years ago

Description

IIIF exposes OCR text as annotations on images. However, OCR systems generally produce text with a structure at several levels: character, word, line, and paragraph.

--> I would like to get the OCR text at a specific level of that structure.
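To make the levels concrete, here is a minimal sketch of reading text at a chosen level from ALTO, the OCR format mentioned under Additional Background. It assumes the ALTO v3 namespace and uses a made-up sample document; `text_at_level` and `LEVEL_TAGS` are hypothetical names, and character-level output is left out.

```python
import xml.etree.ElementTree as ET

# Assumption: the file uses the ALTO v3 namespace; other versions differ.
NS = "http://www.loc.gov/standards/alto/ns-v3#"

# Invented sample: one paragraph (TextBlock) with two lines and three words.
SAMPLE = f"""<alto xmlns="{NS}">
  <Layout><Page><PrintSpace>
    <TextBlock ID="b1" HPOS="10" VPOS="20" WIDTH="200" HEIGHT="60">
      <TextLine ID="l1" HPOS="10" VPOS="20" WIDTH="200" HEIGHT="30">
        <String CONTENT="A" HPOS="10" VPOS="20" WIDTH="20" HEIGHT="30"/>
        <SP/>
        <String CONTENT="marginal" HPOS="40" VPOS="20" WIDTH="170" HEIGHT="30"/>
      </TextLine>
      <TextLine ID="l2" HPOS="10" VPOS="50" WIDTH="80" HEIGHT="30">
        <String CONTENT="note" HPOS="10" VPOS="50" WIDTH="80" HEIGHT="30"/>
      </TextLine>
    </TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""

# Map each requested granularity to the ALTO element that carries it.
LEVEL_TAGS = {"word": "String", "line": "TextLine", "paragraph": "TextBlock"}


def text_at_level(alto_xml, level):
    """Return (text, (x, y, w, h)) pairs for every element at the given level."""
    root = ET.fromstring(alto_xml)
    results = []
    for el in root.iter(f"{{{NS}}}{LEVEL_TAGS[level]}"):
        # iter() includes the element itself, so this also works for "word".
        words = [s.get("CONTENT", "") for s in el.iter(f"{{{NS}}}String")]
        box = tuple(int(float(el.get(k))) for k in ("HPOS", "VPOS", "WIDTH", "HEIGHT"))
        results.append((" ".join(words), box))
    return results


for level in ("word", "line", "paragraph"):
    print(level, [text for text, _ in text_at_level(SAMPLE, level)])
```

Each returned box could then become the target region of one annotation at that level.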

Example: a marginal note (http://gallica.bnf.fr/ark:/12148/bpt6k96006893/f20) is recognized by the OCR as two paragraphs:

http://gallica.bnf.fr/iiif/ark:/12148/bpt6k96006893/f20/529,1076,287,203/full/0/native.jpg

http://gallica.bnf.fr/iiif/ark:/12148/bpt6k96006893/f20/526,1281,287,123/full/0/native.jpg
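The two URLs in this example follow the IIIF Image API pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`, with the region given as `x,y,w,h` in pixels. As a rough sketch (the helper names `region_url` and `union_box` are made up for illustration), the snippet below rebuilds those URLs and computes the single box that would cover the whole marginal note if the two paragraphs were treated as one unit:

```python
BASE = "http://gallica.bnf.fr/iiif"
IDENTIFIER = "ark:/12148/bpt6k96006893/f20"

# The two paragraph regions from the example, as (x, y, w, h).
PARAGRAPHS = [(529, 1076, 287, 203), (526, 1281, 287, 123)]


def region_url(base, identifier, box, size="full", rotation=0,
               quality="native", fmt="jpg"):
    """Build an Image API URL for one rectangular region."""
    x, y, w, h = box
    return f"{base}/{identifier}/{x},{y},{w},{h}/{size}/{rotation}/{quality}.{fmt}"


def union_box(boxes):
    """Smallest (x, y, w, h) rectangle that contains every input box."""
    x0 = min(x for x, _, _, _ in boxes)
    y0 = min(y for _, y, _, _ in boxes)
    x1 = max(x + w for x, _, w, _ in boxes)
    y1 = max(y + h for _, y, _, h in boxes)
    return (x0, y0, x1 - x0, y1 - y0)


for box in PARAGRAPHS:
    print(region_url(BASE, IDENTIFIER, box))

# One region covering the whole note: (526, 1076, 290, 328)
print(region_url(BASE, IDENTIFIER, union_box(PARAGRAPHS)))
```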

Proposed Solutions

Give the user the ability to choose the granularity of the OCR text.

Additional Background

ALTO and IIIF on-going work

azaroth42 commented 7 years ago

If the solution is to let the user pick, then it seems solved? Make layers with lots of annotations, and put a description/label on each saying which level it is?
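A rough sketch of what that could look like on the client side, assuming Presentation API 2.x style `sc:Layer` objects. The `example.org` URIs and the "OCR text: <level>" label convention are invented for illustration; nothing in the specification mandates them, and `pick_layer` is a hypothetical helper.

```python
# One layer per OCR granularity, each pointing at its annotation lists.
# All URIs and labels here are placeholders, not real resources.
LAYERS = [
    {
        "@id": "https://example.org/iiif/book1/layer/ocr-word",
        "@type": "sc:Layer",
        "label": "OCR text: word",
        "otherContent": ["https://example.org/iiif/book1/list/f20-word"],
    },
    {
        "@id": "https://example.org/iiif/book1/layer/ocr-line",
        "@type": "sc:Layer",
        "label": "OCR text: line",
        "otherContent": ["https://example.org/iiif/book1/list/f20-line"],
    },
    {
        "@id": "https://example.org/iiif/book1/layer/ocr-paragraph",
        "@type": "sc:Layer",
        "label": "OCR text: paragraph",
        "otherContent": ["https://example.org/iiif/book1/list/f20-paragraph"],
    },
]


def pick_layer(layers, granularity):
    """Return the first layer whose label names the requested level, else None."""
    wanted = f"ocr text: {granularity}".lower()
    for layer in layers:
        if layer.get("label", "").lower() == wanted:
            return layer
    return None


layer = pick_layer(LAYERS, "line")
print(layer["@id"] if layer else "no layer at that granularity")
```

Matching on a free-text label is fragile, which is one reason a shared convention for naming the level would help clients.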

tomcrane commented 7 years ago

related to https://github.com/IIIF/iiif.io/issues/764