Looking for IIIF 3.0 annotation example

joesong168 commented 3 years ago

I've tried the following AnnotationPage with no luck:

{
    "id": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8/ap/c",
    "type": "AnnotationPage",
    "items": [
        {
            "id": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8/ap/c/a/043a7bb7-d77b-44bd-9517-71bf6f551a1a",
            "type": "Annotation",
            "motivation": "supplementing",
            "body": {
                "type": "TextualBody",
                "value": "authorA",
                "format": "text/plain",
                "language": "zh-Hants"
            },
            "target": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8#xywh=958,5101,493,493"
        },
        {
            "id": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8/ap/c/a/24453f8f-9c8b-4c82-8ae9-ffa8a779f8a6",
            "type": "Annotation",
            "motivation": "supplementing",
            "body": {
                "type": "TextualBody",
                "value": "authorB",
                "format": "text/plain",
                "language": "zh-Hants"
            },
            "target": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8#xywh=968,4544,490,462"
        }
    ],
    "@context": "http://iiif.io/api/p/3/context.json"
}
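For reference, each target above is a canvas URI with an `xywh` media fragment. A minimal sketch of splitting such a string target into its canvas URI and region follows. The name `parseXywhTarget` is hypothetical and not taken from the plugin, and the sketch assumes the target is a plain string rather than a SpecificResource object.

```javascript
// Hypothetical helper, not part of mirador-textoverlay.
// Splits a string target like "<canvas-uri>#xywh=x,y,w,h" into its parts.
function parseXywhTarget(target) {
  const [source, fragment = ''] = target.split('#');
  // Accept the optional "pixel:" prefix allowed by the Media Fragments syntax.
  const match = /^xywh=(?:pixel:)?(\d+),(\d+),(\d+),(\d+)$/.exec(fragment);
  if (!match) {
    // No (or unsupported) fragment: the annotation targets the whole canvas.
    return { source, region: null };
  }
  const [x, y, w, h] = match.slice(1).map(Number);
  return { source, region: { x, y, w, h } };
}

const parsed = parseXywhTarget(
  'https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8#xywh=958,5101,493,493'
);
console.log(parsed.region);
```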

jbaiter commented 3 years ago

Thanks for reporting. I think I've only really tested it with v2 annotations so far, although the code already has paths for v3 support. It should probably only be a question of fixing some small bugs/inconsistencies. Can you also provide a Manifest URL for your fixture so I can test it end-to-end?

HenryH09 commented 3 years ago

Hello, have you had the time to take a look into this? I am also having some issues with the Presentation API v3 semantics for the OCR data (supplementing annotations): nothing is displayed, not even the text overlay toolbox. You can find a complete manifest example in the official IIIF Cookbook: https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_1-manifest.json (https://iiif.io/api/cookbook/recipe/0068-newspaper/)

jbaiter commented 3 years ago

Thanks for providing the full manifest, I'll try to find time to work on this, this week or next.

sauterl commented 3 years ago

Hi There

Hoping to help @jbaiter find and fix the culprits faster, here are some findings from debugging what goes wrong when using IIIF 3.0:

- The plugin assumes the existence of a resources array on annotations, which was renamed to items in IIIF 3.0. See the corresponding definition.
- In [saga.js:83] there is a v2-specific @id lookup, which should be replaced by the version-agnostic seeAlso.id ?? seeAlso['@id'].
- fetchExternalAnnotationResources is also very specific to IIIF v2, in particular the usage of resources and resource instead of items and body.
- Last but not least, as far as I understand the IIIF v3 documentation, Annotations with the supplementing motivation could also link directly to external OCR resources (e.g. an hOCR file); however, processTextsFromAnnotations assumes the OCR content is embedded directly (i.e. the condition anno.motivation === 'supplementing' is too weak for this particular use).
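The v2/v3 naming differences listed above could be bridged with a few small accessors. This is only a sketch of the idea, not the plugin's actual code; the helper names `getId`, `getPageAnnotations`, and `getAnnotationBody` are hypothetical.

```javascript
// Hypothetical helpers, not part of mirador-textoverlay.

// Identifier of a IIIF resource: `id` in v3, `@id` in v2.
function getId(resource) {
  return resource.id ?? resource['@id'];
}

// Annotations of a v3 AnnotationPage (`items`) or a v2 AnnotationList (`resources`).
function getPageAnnotations(page) {
  return page.items ?? page.resources ?? [];
}

// Content of a v3 annotation (`body`) or a v2 annotation (`resource`).
function getAnnotationBody(anno) {
  return anno.body ?? anno.resource;
}
```

With accessors like these, the fetching and processing code would not need separate branches for each API version.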

sauterl commented 3 years ago

I'm not entirely sure whether this is proper IIIF v3 usage, so please bear with me, but after adding a seeAlso on each canvas with an id pointing to an external OCR resource (i.e. hOCR or ALTO), the plugin works as intended, given some fixes as outlined in my previous comment.

In particular, what really did the trick for our use case and manifest was tweaking the IIIF v3 condition.

@stone12379 For your use case, I guess the findings from the previous comment should already help a lot. However, as far as I can tell, more robust IIIF v3 support requires a stricter condition in processTextsFromAnnotations (as described above) in order to filter out external OCR resources, which it currently does not do.
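One possible shape for such a stricter condition is sketched below. It is an assumption about what "stricter" could mean, not the fix that was applied to the plugin: an annotation only counts as carrying text if it is supplementing and has an embedded TextualBody with a string value. The function name `hasEmbeddedText` is hypothetical.

```javascript
// Hypothetical predicate, not part of mirador-textoverlay.
// True only for supplementing annotations whose text is embedded in the body,
// so annotations that merely link to an external OCR file are skipped.
function hasEmbeddedText(anno) {
  // `motivation` and `body` may each be a single value or an array in v3.
  const motivations = [].concat(anno.motivation ?? []);
  if (!motivations.includes('supplementing')) {
    return false;
  }
  const bodies = [].concat(anno.body ?? []);
  return bodies.some(
    (body) => body && body.type === 'TextualBody' && typeof body.value === 'string'
  );
}

const annotations = [
  { motivation: 'supplementing', body: { type: 'TextualBody', value: 'authorA' } },
  // Assumed example of an annotation pointing at an external hOCR file.
  { motivation: 'supplementing', body: { id: 'https://example.org/page1.hocr', type: 'Text' } },
];
console.log(annotations.filter(hasEmbeddedText).length);
```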

jbaiter commented 2 years ago

So, some long overdue updates on this front, sorry it took so long, thanks to everybody for the feedback!

The example from the IIIF Cookbook now renders the annotations, but:

- By default it will use the ALTO in seeAlso for rendering ('proper' OCR is always preferred to annotations in this plugin).
- The Annotations are not line-level and thus text rendering is pretty much broken by design, since we rely on the text being at least structured into lines for some rendering hints that make text selection in SVG work. Additionally, the segmentation in the annotations is not even at the word level; some annotations contain parts of multiple words.
- The annotations do not match the canvas, so the overlay does not match the underlying image. For example, the first word annotation is 84 at xywh=182,476,59,43, but the word is actually at approximately xywh=143,377,51,39. I assume this happens because the annotations were generated 1:1 from the ALTO, which targets a 4562x6282 image, while the IIIF Canvas is 3602x5000. The plugin scales down the coordinates when it renders text from the ALTO XML, so that renders just fine. As per the spec, annotations are always relative to the dimensions of the canvas they target, so this adjustment is not done for annotations.

tl;dr: The Annotations in the cookbook example are broken and should be fixed.
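To illustrate the scaling described above, here is a rough sketch of mapping a box from OCR pixel space into canvas space. It is not the plugin's implementation; `scaleBox` is a hypothetical name, and the canvas size of 3602x5000 is the one reported further down in this thread.

```javascript
// Hypothetical helper, not part of mirador-textoverlay.
// Scales a box given in the OCR source's pixel space into canvas coordinates.
function scaleBox(box, source, canvas) {
  const scaleX = canvas.width / source.width;
  const scaleY = canvas.height / source.height;
  return {
    x: box.x * scaleX,
    y: box.y * scaleY,
    w: box.w * scaleX,
    h: box.h * scaleY,
  };
}

// The first word annotation from the cookbook example, in ALTO coordinates.
const scaled = scaleBox(
  { x: 182, y: 476, w: 59, h: 43 },
  { width: 4562, height: 6282 },
  { width: 3602, height: 5000 }
);
console.log(scaled);
```

The scaled origin lands close to the approximate on-canvas position quoted above, which supports the assumption that the annotations were copied from the ALTO without rescaling.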

For comparison, here are two screenshots, one showing the text rendering from the ALTO and one with the annotations:

ALTO text

![image](https://user-images.githubusercontent.com/608610/174956683-0fad5dec-39e5-4573-aea6-5d3d166a6070.png)

Annotation text

![image](https://user-images.githubusercontent.com/608610/174956789-8e4cc128-0acc-4e1d-b694-0aac3ca8272e.png)

I have pushed my changes to the iiifv3 branch, could you please test this version with your manifests @sauterl @joesong168?

glenrobson commented 1 year ago

Hi @jbaiter, we've recently updated the Newspaper recipe with the following changes:

- (hopefully) Fixed the annotations
- Moved the ALTO to rendering rather than seeAlso
- Changed the target of the annotations to include a link to the Manifest

Let us know if you spot any further problems.

Also, would it be possible to add an iiif-content parameter to your demo so that we can pass in a manifest and include a link from the cookbook to your plugin? For info, Mirador uses the following:

https://projectmirador.org/embed/?iiif-content=https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_title-collection.json
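Reading such a parameter in a demo page could look roughly like the sketch below. The function name `getIiifContent` is hypothetical, and the sketch only covers the plain-URL form shown above, not encoded content states.

```javascript
// Hypothetical helper for a demo page, not taken from Mirador or the plugin.
// Returns the value of the iiif-content query parameter, or null if absent.
function getIiifContent(href) {
  const url = new URL(href);
  return url.searchParams.get('iiif-content');
}

const manifestUrl = getIiifContent(
  'https://projectmirador.org/embed/?iiif-content=https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_title-collection.json'
);
console.log(manifestUrl);
```

In a browser, the same function could be called with `window.location.href`.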

jbaiter commented 1 year ago

Thanks for letting me know! I've updated the code to also look at rendering to discover referenced OCR files and fixed some other IIIF3 stuff related to annotations along the way. The iiif-content parameter is now included in the demo as well.
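As a rough illustration of what looking at both seeAlso and rendering could involve (this is not the code on the branch), the sketch below collects candidate OCR links from a canvas. The name `findOcrReferences` and the matching heuristic (hOCR media type, or a profile mentioning ALTO) are assumptions.

```javascript
// Hypothetical helper, not part of mirador-textoverlay.
// Collects ids of linked resources on a canvas that look like OCR files.
function findOcrReferences(canvas) {
  // Both properties are arrays in v3; [].concat also tolerates a single v2 value.
  const linked = [].concat(canvas.seeAlso ?? [], canvas.rendering ?? []);
  return linked
    .filter((res) => {
      const profile = String(res.profile ?? '');
      // Assumed heuristic: the hOCR media type, or a profile that names ALTO.
      return res.format === 'text/vnd.hocr+html' || /alto/i.test(profile);
    })
    .map((res) => res.id ?? res['@id']);
}

// Example canvas with assumed URLs.
const exampleCanvas = {
  id: 'https://example.org/canvas/1',
  rendering: [
    {
      id: 'https://example.org/ocr/page1.xml',
      type: 'Text',
      format: 'application/xml',
      profile: 'http://www.loc.gov/standards/alto/',
    },
  ],
};
console.log(findOcrReferences(exampleCanvas));
```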

The ALTO from the Cookbook example, however, doesn't fully match up with the Canvas anymore, something's off:

[screenshot]

https://iiifv3--mirador-textoverlay.netlify.app/?iiif-content=https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_2-manifest.json

glenrobson commented 1 year ago

Thanks for including the iiif-content link and looking at the rendering! I'll see if I can figure out what is going on with the ALTO. It was generated using Tesseract, but maybe I used the wrong-sized image or something.

jbaiter commented 1 year ago

Found the problem: There's a mismatch between the Canvas size and the Image and OCR size:

Canvas: 3602x5000px
Image: 3517x5000px
OCR: 3517x5000px
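A mismatch like this can be caught with a quick check over a v3 canvas. The sketch below is a hypothetical helper (`findSizeMismatches`), not something the plugin provides. It flags any body that declares dimensions different from the canvas, which includes validly scaled images with the same aspect ratio, so a hit is a prompt to check the OCR coordinates rather than proof of an error.

```javascript
// Hypothetical helper, not part of mirador-textoverlay.
// Lists annotation bodies on a v3 canvas whose declared size differs from the canvas.
function findSizeMismatches(canvas) {
  const mismatches = [];
  for (const page of canvas.items ?? []) {
    for (const anno of page.items ?? []) {
      for (const body of [].concat(anno.body ?? [])) {
        // Skip bodies that do not declare dimensions.
        if (body.width == null || body.height == null) continue;
        if (body.width !== canvas.width || body.height !== canvas.height) {
          mismatches.push({ id: body.id, width: body.width, height: body.height });
        }
      }
    }
  }
  return mismatches;
}

// Canvas using the sizes reported above; the ids are assumed.
const newspaperCanvas = {
  id: 'https://example.org/canvas/p1',
  width: 3602,
  height: 5000,
  items: [
    {
      type: 'AnnotationPage',
      items: [
        {
          type: 'Annotation',
          motivation: 'painting',
          body: { id: 'https://example.org/image/p1.jpg', type: 'Image', width: 3517, height: 5000 },
        },
      ],
    },
  ],
};
console.log(findSizeMismatches(newspaperCanvas));
```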

glenrobson commented 1 year ago

That is weird! But thank you, I'll look at updating the ALTO (and annotations).

dbmdz / mirador-textoverlay

Looking for IIIF 3.0 annotation example #186