Open joesong168 opened 3 years ago
Thanks for reporting. I think I've only really tested it with v2 annotations so far, although the code already has paths for v3 support. It should probably only be a question of fixing some small bugs/inconsistencies. Can you also provide a Manifest URL for your fixture so I can test it end-to-end?
Hello, have you had time to look into this? I am having some issues as well with the Presentation API v3 semantics for the OCR data (supplementing annotation): nothing is displayed, not even the textoverlay toolbox. You can find a complete manifest example in the official IIIF Cookbook: https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_1-manifest.json (https://iiif.io/api/cookbook/recipe/0068-newspaper/)
Thanks for providing the full manifest, I'll try to find time to work on this, this week or next.
Hi there,
In the hope of helping @jbaiter find and fix the culprits faster, here are some findings from debugging what goes wrong when using IIIF 3.0:
- The `resources` array on annotations, which was renamed to `items` in IIIF 3.0. See the corresponding definition.
- `@id` (IIIF v2), which should be replaced by `seeAlso.id ?? seeAlso['@id']`.
- `fetchExternalAnnotationResources` is also very specific to IIIF v2, in particular the usage of `resources` and `resource` instead of `items` and `body`.
- `supplementing`-motivated Annotations could also link directly to external OCR resources (e.g. an hOCR file); however, `processTextsFromAnnotations` assumes that the OCR content is embedded directly (i.e. the condition `anno.motivation === 'supplementing'` is too weak for this particular use).

I'm not entirely sure whether this is proper IIIF v3 usage, please bear with me, but when adding a `seeAlso` on each canvas with an `id` pointing towards an external OCR resource (i.e. hOCR or ALTO), the plugin works as intended with some fixes as outlined in my previous comment.
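As an illustration of the v2/v3 differences listed above, here is a minimal sketch of version-tolerant accessors. The interface and function names (`LinkedResource`, `resourceId`, `annotationsOf`, `bodyOf`) are hypothetical and not taken from the plugin; only the property names (`id`/`@id`, `items`/`resources`, `body`/`resource`) come from the discussion.

```typescript
// Minimal shapes covering both IIIF Presentation 2 and 3.
// Only the fields discussed in this thread are modelled.
interface LinkedResource {
  id?: string; // IIIF v3
  "@id"?: string; // IIIF v2
}

interface AnnotationLike {
  body?: unknown; // IIIF v3
  resource?: unknown; // IIIF v2
}

interface AnnotationListLike {
  items?: AnnotationLike[]; // IIIF v3 AnnotationPage
  resources?: AnnotationLike[]; // IIIF v2 AnnotationList
}

/** Returns the identifier of a linked resource, preferring the v3 key. */
function resourceId(res: LinkedResource): string | undefined {
  return res.id ?? res["@id"];
}

/** Returns the annotations of a page or list, regardless of API version. */
function annotationsOf(list: AnnotationListLike): AnnotationLike[] {
  return list.items ?? list.resources ?? [];
}

/** Returns the annotation body (v3) or resource (v2). */
function bodyOf(anno: AnnotationLike): unknown {
  return anno.body ?? anno.resource;
}
```

Funnelling every lookup through helpers like these would keep the version check in one place instead of scattering `??` fallbacks across the code.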
In particular, what did the trick for our use case and manifest was tweaking the IIIF v3 condition.
@stone12379 For your use case, I guess the findings from the previous comment should already help a lot.
However, as far as I can tell, for more robust IIIF v3 support, `processTextsFromAnnotations` (as written above) needs a stricter condition in order to filter out external OCR resources (which it currently does not).
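One possible shape for such a stricter condition is sketched below. It is an assumption, not the plugin's actual code: it treats an annotation as carrying inline OCR text only if it is `supplementing` and has a body of type `TextualBody` with a string `value`, so bodies that merely reference an external file by `id` are skipped. The names `AnnoLike`, `BodyLike` and `hasInlineOcrText` are hypothetical.

```typescript
interface BodyLike {
  id?: string; // set when the body points at an external resource
  type?: string; // e.g. "TextualBody" for embedded text
  value?: string; // the embedded text itself
}

interface AnnoLike {
  motivation?: string | string[];
  body?: BodyLike | BodyLike[];
}

/** Normalizes the motivation to a list, since it may be a string or an array. */
function motivationsOf(anno: AnnoLike): string[] {
  const m = anno.motivation;
  if (m === undefined) return [];
  return typeof m === "string" ? [m] : m;
}

/** Normalizes the body to a list, since it may be a single object or an array. */
function bodiesOf(anno: AnnoLike): BodyLike[] {
  const b = anno.body;
  if (b === undefined) return [];
  return Array.isArray(b) ? b : [b];
}

/** True only for supplementing annotations that embed their text directly. */
function hasInlineOcrText(anno: AnnoLike): boolean {
  const supplementing = motivationsOf(anno).indexOf("supplementing") !== -1;
  const inlineText = bodiesOf(anno).some(
    (b) => b.type === "TextualBody" && typeof b.value === "string"
  );
  return supplementing && inlineText;
}
```

Annotations that fail this check but have a body with an `id` could then be routed to whatever code path fetches external OCR files.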
So, some long overdue updates on this front, sorry it took so long, thanks to everybody for the feedback!
The example from the IIIF Cookbook now renders the annotations, but:
- `seeAlso` for rendering ('proper' OCR is always preferred to annotations in this plugin)
- `84` at `xywh=182,476,59,43`, but it's actually at approximately `xywh=143,377,51,39`.
I assume this happens because the annotations were generated 1:1 from the ALTO, which targets a 4562x6282 image, while the IIIF Canvas is 3602x5000.
The plugin scales down the coordinates when it renders text from the ALTO XML, so it renders just fine. As per the spec, annotations are always relative to the dimensions of the canvas they target, so this adjustment is not done for annotations.
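The scaling described above can be sketched as follows. This is an illustration, not the plugin's implementation: `scaleBox` and `toXywh` are hypothetical helpers, and the canvas height of 5000 is taken from the Canvas size quoted later in this thread.

```typescript
interface Size {
  width: number;
  height: number;
}

interface Box {
  x: number;
  y: number;
  w: number;
  h: number;
}

/** Scales a box from one coordinate space (e.g. the OCR image) to another (e.g. the canvas). */
function scaleBox(box: Box, from: Size, to: Size): Box {
  const sx = to.width / from.width;
  const sy = to.height / from.height;
  return { x: box.x * sx, y: box.y * sy, w: box.w * sx, h: box.h * sy };
}

/** Formats a box as a media-fragment selector with integer coordinates. */
function toXywh(box: Box): string {
  return "xywh=" + [box.x, box.y, box.w, box.h].map((n) => Math.round(n)).join(",");
}

// Dimensions from the discussion: the ALTO targets a 4562x6282 image,
// the canvas is assumed to be 3602x5000.
const altoImageSize: Size = { width: 4562, height: 6282 };
const canvasSize: Size = { width: 3602, height: 5000 };

// The annotation target as published in the cookbook example.
const scaled = scaleBox({ x: 182, y: 476, w: 59, h: 43 }, altoImageSize, canvasSize);
```

With these inputs the scaled x and y land within a couple of pixels of the approximate position measured above, which is consistent with the assumption that the annotations were copied unscaled from the ALTO.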
tl;dr: The annotations in the cookbook example are broken and should be fixed.
For comparison, here are two screenshots, one showing the text rendering from the ALTO and one with the annotations:
I have pushed my changes to the `iiifv3` branch. Could you please test this version with your manifests, @sauterl @joesong168?
Hi @jbaiter, we've recently updated the Newspaper recipe with the following changes:
Let us know if you spot any further problems.
Also, would it be possible to add a `iiif-content` parameter to your demo so that we can pass in a manifest and include a link from the cookbook to your plugin? For info, Mirador uses the following:
Thanks for letting me know! I've updated the code to also look at `rendering` to discover referenced OCR files and fixed some other IIIF3 stuff related to annotations along the way.
The `iiif-content` parameter is now included in the demo as well.
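For readers wondering what discovering OCR through both `seeAlso` and `rendering` might look like, here is a rough sketch. It does not reproduce the plugin's code: `ocrSources` and `looksLikeOcr` are hypothetical, both properties are simplified to arrays, and the format/profile hints used to recognize hOCR and ALTO are assumptions that may need adjusting for real manifests.

```typescript
interface LinkLike {
  id?: string; // IIIF v3
  "@id"?: string; // IIIF v2
  format?: string;
  profile?: string;
}

interface CanvasLike {
  seeAlso?: LinkLike[];
  rendering?: LinkLike[];
}

/**
 * Assumption: a link is OCR if its format is the hOCR media type
 * or its profile mentions ALTO. Real data may use other hints.
 */
function looksLikeOcr(link: LinkLike): boolean {
  const format = link.format ?? "";
  const profile = (link.profile ?? "").toLowerCase();
  return format === "text/vnd.hocr+html" || profile.indexOf("alto") !== -1;
}

/** Collects the ids of OCR candidates from seeAlso first, then rendering. */
function ocrSources(canvas: CanvasLike): string[] {
  const links = (canvas.seeAlso ?? []).concat(canvas.rendering ?? []);
  const ids: string[] = [];
  for (const link of links) {
    const id = link.id ?? link["@id"];
    if (id !== undefined && looksLikeOcr(link)) ids.push(id);
  }
  return ids;
}
```

Listing `seeAlso` candidates before `rendering` ones is an arbitrary choice in this sketch; a caller could pick the first entry or try each in turn.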
The ALTO from the Cookbook example, however, doesn't fully match up with the Canvas anymore, something's off:
Thanks for including the iiif-content link and looking at the rendering! I'll see if I can figure out what is going on with the ALTO. It was generated using Tesseract, but maybe I used the wrong-sized image or something.
Found the problem: there's a mismatch between the Canvas size and the Image and OCR size:
- Canvas: 3602x5000px
- Image: 3517x5000px
- OCR: 3517x5000px
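A mismatch like this can be caught with a simple aspect-ratio comparison. The sketch below is only an illustration with a hypothetical helper name and an arbitrary tolerance; it assumes the three sizes above are, in order, those of the Canvas, the Image and the OCR.

```typescript
interface Dimensions {
  width: number;
  height: number;
}

/**
 * Returns true if two sizes have the same aspect ratio within a relative
 * tolerance. The default of 0.5% is an arbitrary choice for this sketch.
 */
function sameAspectRatio(a: Dimensions, b: Dimensions, tolerance = 0.005): boolean {
  const ra = a.width / a.height;
  const rb = b.width / b.height;
  return Math.abs(ra - rb) / rb <= tolerance;
}
```

When the ratios differ, uniformly scaling OCR coordinates to the canvas cannot line the text up with the image, which matches the offset seen with the updated ALTO.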
That is weird! But thank you, I'll look at updating the ALTO (and annotations).
I've tried following annotationPage with no luck