Open altomator opened 7 years ago
From the perspective of an OCR correction platform, I (the correction tool) would like to
So far, people have been using seeAlso to link from canvas to ALTO:
"seeAlso": {
"@id": "http://wellcomelibrary.org/service/alto/b22014068/0?image=11",
"format": "text/xml",
"profile": "http://www.loc.gov/standards/alto/v3/alto.xsd",
"label": "METS-ALTO XML"
}
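For what it's worth, a client could pick up that link along these lines. This is only a minimal sketch: the helper name `alto_links` is made up for illustration, and it assumes the ALTO entry can be recognised by the word "alto" in its `profile`, which is how the example above happens to look but is not something the spec guarantees.

```python
from typing import List


def alto_links(canvas: dict) -> List[str]:
    """Return the @id of every seeAlso entry whose profile mentions ALTO.

    Assumption: publishers flag ALTO with a profile URI containing "alto",
    as in the Wellcome example; other publishers may do it differently.
    """
    see_also = canvas.get("seeAlso", [])
    # In Presentation 2.x, seeAlso may be a single URI, a single object, or a list.
    if isinstance(see_also, (str, dict)):
        see_also = [see_also]

    urls = []
    for entry in see_also:
        if not isinstance(entry, dict):
            continue  # a bare URI carries no profile to inspect
        profile = str(entry.get("profile", ""))
        if "alto" in profile.lower() and "@id" in entry:
            urls.append(entry["@id"])
    return urls
```

This only gets you to the page-level ALTO file, not to a specific element inside it.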
The Newspaper working group have some guidelines around this - https://www.slideshare.net/kestlund/newspapers-iiif-and-alto
This could also be modelled as a service.
My concern is that accessing the right element in the OCR file from the text annotation is not a straightforward process (using the geometrical information?)
{
"@id":"http://dams.llgc.org.uk/iiif/3320863/annotation/5014243419640",
"@type":"oa:Annotation",
"motivation":"sc:painting",
"resource":
{
"@type":"cnt:ContentAsText",
"format":"text/plain",
"chars":"NEWS."
},
"on":"http://dams.llgc.org.uk/iiif/3320860/canvas/3320863#xywh=5014,2434,196,40"
},
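To illustrate why the geometrical route is awkward, here is a rough sketch of what a client would have to do with an annotation like the one above: parse the `xywh` fragment and scan the ALTO file for a `String` with matching coordinates. The function names and the `tolerance` parameter are hypothetical, and the sketch assumes the ALTO `HPOS`/`VPOS`/`WIDTH`/`HEIGHT` values are in the same pixel space as the canvas, which is not always the case.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag


def parse_xywh(on: str):
    """Extract (x, y, w, h) from a canvas URI ending in '#xywh=x,y,w,h'."""
    _, fragment = urldefrag(on)
    if not fragment.startswith("xywh="):
        raise ValueError("no xywh fragment in %r" % on)
    x, y, w, h = (int(v) for v in fragment[len("xywh="):].split(","))
    return x, y, w, h


def find_strings(alto_xml: str, box, tolerance: float = 2.0):
    """Return ALTO String elements whose geometry matches box within tolerance.

    Assumption: ALTO coordinates are pixels in the canvas coordinate space.
    """
    root = ET.fromstring(alto_xml)
    matches = []
    for el in root.iter():
        # Compare local names so any ALTO namespace version is accepted.
        if el.tag.rsplit("}", 1)[-1] != "String":
            continue
        try:
            geom = tuple(float(el.get(k)) for k in ("HPOS", "VPOS", "WIDTH", "HEIGHT"))
        except (TypeError, ValueError):
            continue  # element lacks usable geometry
        if all(abs(a - b) <= tolerance for a, b in zip(geom, box)):
            matches.append(el)
    return matches
```

Rounding, scaling and overlapping words can all make this return zero or several candidates, which is the fragility I am worried about.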
I suppose that for this specific use case (getting access to the XML content), we need another annotation list that references external XML segments (http://iiif.io/api/presentation/2.1/#segments):
{
"@context": "http://iiif.io/api/presentation/2/context.json",
"@id": "http://example.org/iiif/book1/annotation/anno1",
"@type": "oa:Annotation",
"motivation": "sc:painting",
"resource":{
"@id": "http://example.org/iiif/book1/res/alto.xml#xpointer(//String[@id='Str_001'])",
"@type": "dctypes:Text",
"format": "application/alto+xml"
},
"on": "http://example.org/iiif/book1/canvas/p1#xywh=100,100,500,300"
}
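With a resource `@id` like that, the correction tool could go straight to the element instead of guessing from coordinates. A minimal sketch, assuming the fragment always has the simple `xpointer(//Tag[@attr='value'])` shape used in my example (this is not a general XPointer implementation, and the function names are made up). Note that ALTO files typically spell the identifier attribute `ID`, so the pointer may need adjusting to match the actual file.

```python
import re
import xml.etree.ElementTree as ET

# Only the simple form from the example above is recognised.
_XPOINTER = re.compile(
    r"^xpointer\(//(?P<tag>\w+)\[@(?P<attr>\w+)='(?P<value>[^']*)'\]\)$"
)


def split_xpointer(uri: str):
    """Split 'doc.xml#xpointer(//Tag[@attr='v'])' into (doc, tag, attr, value)."""
    doc, _, fragment = uri.partition("#")
    m = _XPOINTER.match(fragment)
    if m is None:
        raise ValueError("unsupported fragment: %r" % fragment)
    return doc, m["tag"], m["attr"], m["value"]


def resolve(alto_xml: str, tag: str, attr: str, value: str):
    """Return the first element with the given local name and attribute value."""
    root = ET.fromstring(alto_xml)
    for el in root.iter():
        # Match on the local name so the ALTO namespace version does not matter.
        if el.tag.rsplit("}", 1)[-1] == tag and el.get(attr) == value:
            return el
    return None
```

The tool would fetch the document part of the URI, call `resolve` with the remaining parts, and then edit the `CONTENT` of the returned element.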
Description
Some use cases need to get access to information stored in the OCR format:
For these use cases, getting access to the raw OCR objects (or reference to the...) from the IIIF annotation layer would be useful.