annotorious / annotorious-openseadragon

An OpenSeadragon plugin for annotating high-res zoomable images
https://annotorious.github.io
BSD 3-Clause "New" or "Revised" License
123 stars 42 forks

Handle multi image annotation #32

Open rsimon opened 3 years ago

rsimon commented 3 years ago

We need to figure out what's needed to support annotation of a multi-tilesource setup. The key question to answer is how to get proper image coordinates.

rsimon commented 3 years ago

A workspace/playground repo for this is at: https://github.com/IIIF-Commons/osd-canvasworld-annotorious/

Code Sandbox: https://codesandbox.io/s/github/IIIF-Commons/osd-canvasworld-annotorious

A development branch is currently set up under multi-image.

umesh-timalsina commented 3 years ago

@rsimon Any progress on this front? I am currently trying to bulk annotate multiple tilesources loaded in a single viewport and getting the following error:

Uncaught TypeError: Cannot read property 'naturalWidth' of undefined
    at oa (openseadragon-annotorious.min.js:2)
    at new e (openseadragon-annotorious.min.js:2)
    at i.startDrawing (openseadragon-annotorious.min.js:2)
    at i.start (openseadragon-annotorious.min.js:2)
    at $.MouseTracker.pressHandler (openseadragon-annotorious.min.js:2)
    at updatePointersDown (openseadragon.js:5930)
    at onPointerDown (openseadragon.js:5472)
    at SVGSVGElement.pointerdown (openseadragon.js:3340)

This is what my viewport looks like:

[screenshot of the viewport]

umesh-timalsina commented 3 years ago

I will note that the error above seems to occur when AnnotoriousOSD is initialized without any tile sources to begin with. In the example above, the images are added to the viewport only after the plugin is initialized.

rsimon commented 3 years ago

Unfortunately, there's no progress on this yet. I've been wanting to look into it, but frankly it kept dropping down the priority list, since none of my day-job projects currently needs it.

I think it might be a bigger task. Right now, Annotorious implicitly assumes that the OSD world coordinate space equals the image coordinate space; and the dimensions are initialized when the Annotorious instance is created. (Which is exactly in line with the error messages you are seeing.)

To address this, we'd first need to figure out the right way to translate between world coordinates and image coordinates (and vice versa).

Then, whenever an annotation is created, we'd need to find a way to figure out which tilesource it belongs to (there should be an OSD API method for that, I guess).

The rest is plumbing. It's probably not entirely trivial either. But once there's a first working hack, at least we'll be able to see what it takes to build this.

As I said: I'd be really interested to see (and help with!) this. But my gut feeling is that actually building it exceeds my weekend side-project time budget for the moment...

umesh-timalsina commented 3 years ago

@rsimon Thanks for the reply. I am willing to add this feature. A few suggestions/questions about the current API:

I think it might be a bigger task. Right now, Annotorious implicitly assumes that the OSD world coordinate space equals the image coordinate space; and the dimensions are initialized when the Annotorious instance is created. (Which is exactly in line with the error messages you are seeing.)

Can we leverage the add-item event of OpenSeadragon's viewer.world to get the dimensions? I tried it locally with my fork of the demo in this repository, and everything seems to work fine for a single image. Admittedly, I haven't looked into the DrawingTools code to know how it works. Is it the case that it requires a single image source URL per drawer instance?
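
For reference, a minimal sketch of that idea (assuming a standard OpenSeadragon viewer; the handler wiring is illustrative and hasn't been tested against Annotorious internals):

```javascript
// Pure helper: compute the bounding box of all images currently in the
// world, in viewport coordinates. Returns null while the world is empty,
// which is exactly the situation that triggers the 'naturalWidth' error.
function worldDimensions(world) {
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (let i = 0; i < world.getItemCount(); i++) {
    const b = world.getItemAt(i).getBounds(); // an OpenSeadragon.Rect
    minX = Math.min(minX, b.x);
    minY = Math.min(minY, b.y);
    maxX = Math.max(maxX, b.x + b.width);
    maxY = Math.max(maxY, b.y + b.height);
  }
  if (world.getItemCount() === 0) return null;
  return { x: minX, y: minY, width: maxX - minX, height: maxY - minY };
}

// Re-compute whenever a tile source is added, instead of reading the
// dimensions once at init time.
function wireWorldResize(viewer, onResize) {
  viewer.world.addHandler('add-item', () =>
    onResize(worldDimensions(viewer.world)));
}
```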

To address this, we'd first need to figure out the right way to translate between world coordinates and image coordinates (and vice versa).

I found a few API points in OpenSeadragon's TiledImage class: (a) viewportToImageCoordinates and (b) imageToViewportCoordinates.
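
For illustration, this is roughly the math those two methods perform for an unrotated image (a sketch only; in practice you'd call the TiledImage methods directly). OSD viewport coordinates are normalized to the image's placed width, so y is scaled by the same factor as x:

```javascript
// bounds: the TiledImage's getBounds() rect in viewport coordinates;
// imageWidth: the source image's width in pixels.
function viewportToImage(bounds, imageWidth, pt) {
  const scale = imageWidth / bounds.width; // pixels per viewport unit
  return { x: (pt.x - bounds.x) * scale, y: (pt.y - bounds.y) * scale };
}

function imageToViewport(bounds, imageWidth, pt) {
  const scale = bounds.width / imageWidth; // viewport units per pixel
  return { x: bounds.x + pt.x * scale, y: bounds.y + pt.y * scale };
}
```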

Then, whenever an annotation is created, we'd need to find a way to figure out which tilesource it belongs to (there should be an OSD API method for that, I guess).

I have experimented with a way to figure this out, though it might not be optimal: on any mouse event, you can detect which tiled image's bounding box the coordinates fall into. It might be tricky, though, to handle annotations drawn across image boundaries.
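
A sketch of that bounding-box test (assuming a point already converted to OSD viewport coordinates; the function name is made up):

```javascript
// Given a point in viewport coordinates, return the index of the tiled
// image whose bounding box contains it, or -1 if the point is on the
// empty canvas between images. Iterates back-to-front so that, where
// images overlap, the topmost one wins.
function imageIndexAt(world, pt) {
  for (let i = world.getItemCount() - 1; i >= 0; i--) {
    const b = world.getItemAt(i).getBounds();
    if (pt.x >= b.x && pt.x <= b.x + b.width &&
        pt.y >= b.y && pt.y <= b.y + b.height) {
      return i;
    }
  }
  return -1;
}
```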

The rest is plumbing. It's probably not entirely trivial either. But once there's a first working hack, at least we'll be able to see what it takes to build this.

Agreed, but I would like to know what details would be necessary. I am fairly constrained in terms of time, but we can cross that bridge when we get there :). Thanks.

rsimon commented 3 years ago

The main problem for the plumbing is this: AnnotoriousOSD re-uses the drawing tools from "standard" Annotorious. Therefore, the tools need some kind of common abstraction for the underlying image. That's simple as long as both are only dealing with a single image. (The tools really just expect an object that has naturalWidth and naturalHeight properties.) But things get much trickier with OSD multi-image mode.

Changes are necessary in (at least) two places:

  1. here, where width and height are stored, so that the tools can later access them; and
  2. here in the base class for tool implementations, where mouse pointer coordinates are translated to image coordinates

For the first part, I think it's sufficient to store the size of the OSD world (and resize it when a new tilesource is added). The second part is the tricky bit and needs experimentation. AnnotoriousOSD will sometimes need to convert between mouse and world coordinates (to draw the shapes), and sometimes between mouse (or world) and tilesource image coordinates, to store and render annotations. I wouldn't know off the top of my head what's needed where exactly in the code. Figuring that out might be a bit of a painful step-by-step process. But any help you can give would be massively appreciated!
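
One way to picture that second part: the chain from a pointer event down to per-image coordinates might look roughly like this (a sketch assuming an initialized OSD viewer; nothing here is Annotorious's actual code):

```javascript
// Browser pixel -> viewport point -> (image, image-pixel) pair.
// Returns null when the pointer is not over any image.
function pointerToImageCoordinates(viewer, pixelX, pixelY) {
  const vp = viewer.viewport.pointFromPixel(
    new OpenSeadragon.Point(pixelX, pixelY));
  for (let i = 0; i < viewer.world.getItemCount(); i++) {
    const item = viewer.world.getItemAt(i);
    if (item.getBounds().containsPoint(vp)) {
      return { item, point: item.viewportToImageCoordinates(vp) };
    }
  }
  return null;
}
```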

umesh-timalsina commented 3 years ago

Further update on this: there are two modes in OSD that are of concern for supporting this feature:

Sequence Mode

For sequence mode, there are multiple tile sources, but the OSD viewer opens only one image at a time. The calculations currently made for drawing annotations work fine, and the open event is fired for every page change in sequence mode. However, the annotation layer currently doesn't take into account which underlying source an annotation was loaded for. So if you annotate an image and then switch pages in the viewer, the annotations for the previous image are displayed as well (example below):

[screenshot: page 1]

[screenshot: page 2]

Since the base calculations for annotating the image work and the events are fired correctly, in my opinion we need the following additions to support it:

  1. On page change of the viewer, partition the list of annotations into displayed vs. hidden based on the annotation's target.source URL, i.e. if the URL of the current tile source in the viewer matches target.source, the annotation goes into the displayed pool; otherwise it goes into the hidden pool.
  2. Handle events like cancelSelection if there's an active selection in the viewport.

This approach works only for sequence mode. I have a reasonable solution for the partitioning approach described above based on query selectors, but as far as I understand Annotorious's public API, I didn't find any other way to hide an annotation.
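
The partitioning step itself is simple enough to sketch (assuming W3C-style annotations where the image URL sits in target.source):

```javascript
// Split annotations into the subset to display for the current page and
// the subset to hide, keyed on the annotation's target.source URL.
function partitionBySource(annotations, currentSourceUrl) {
  const displayed = [];
  const hidden = [];
  for (const a of annotations) {
    if (a.target && a.target.source === currentSourceUrl) displayed.push(a);
    else hidden.push(a);
  }
  return { displayed, hidden };
}
```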

Collection Mode

Still investigating collection mode, but I did fiddle around with the following pen, which does hit detection in collection mode. Based on the pen and the coordinate-translation functions mentioned above, if we can find a way to map mouse coordinates to image coordinates, I think we have a decent starting point.

Update 1:

https://umesh-timalsina.github.io/osd-coordinates/ has an implementation of OSD coordinates in collection mode. It's a simple demo and might not cover all the use cases.

rsimon commented 3 years ago

Thanks so much for digging into this! I can probably look into it more closely on the weekend. But meanwhile, I also wanted to point to issue #52, which is specifically about sequence mode.

My take on this is that Annotorious shouldn't do too much in this case, and should leave replacing the annotations up to the host application. (Annotorious doesn't check whether the URI of the image and the source attribute match up, and there has been discussion about this causing potential pitfalls.) I.e. the core should probably just make sure the image dimensions get updated accordingly and the annotation layer gets wiped.

As for the common functionality of paginating the annotations and applying the right subset: perhaps that's better handled in a (very) small "sequenceMode plugin". This way, users of sequence mode wouldn't need to rewrite the same pagination code themselves, while the average user (who isn't using sequence mode) wouldn't face nasty surprises if the image URL and their annotations' source attribute don't match up.
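
A rough shape such a plugin could take (the function and its parameters are hypothetical; setAnnotations and cancelSelected are part of the public Annotorious API, the rest is made up):

```javascript
// Hypothetical sequence-mode plugin: keep the full annotation list outside
// Annotorious and swap the visible subset in on every page change.
function sequenceModePlugin(anno, viewer, allAnnotations, sourceForPage) {
  viewer.addHandler('page', event => {
    if (anno.cancelSelected) anno.cancelSelected(); // drop active selection
    const url = sourceForPage(event.page);
    anno.setAnnotations(
      allAnnotations.filter(a => a.target && a.target.source === url));
  });
}
```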

umesh-timalsina commented 3 years ago

Further thoughts on this: it's not clear how multiple images should be handled by the underlying annotation layer. If there are multiple images in the viewport, one idea is to create multiple annotation layers, one for each image, and tie the boundary of each annotation layer to the image's bounding box in viewport coordinates. Since we can get image coordinates from viewport coordinates, I think this will be a good way to handle individual images in the viewport, though further thought might be necessary. This way, any annotations drawn near boundaries will be taken care of automatically.

rsimon commented 3 years ago

This sounds like a good approach to managing multiple images. I think this should also allow the re-use of the drawing tools without modification.

Extensions to the Annotator will probably have to be substantial. But it's definitely worth a try.

umesh-timalsina commented 3 years ago

For anybody willing to try this feature, a super-primitive version has been implemented in this fork. Upon further reflection, this feature could also be added as a plugin (like the sequence-mode plugin). If we are able to hook up a custom annotation layer to the app, that would work as well; however, keeping two annotation layers in sync would be a maintenance headache. I don't know what the plans for this project are, but supporting multi-image annotation has become a business requirement for one of our projects.

Regardless, the biggest roadblock for me is tying an individual image in the viewport to a particular annotation layer. This is absolutely necessary for the setAnnotations method to discover which image is loaded in which tile source. The branch above takes a URL-based approach.

rsimon commented 3 years ago

@umesh-timalsina sorry, I haven't looked into your fork more deeply yet. But this seems to work pretty well already!

My initial feeling would have been that separating this into two annotation layers (= two separate SVG overlays) could make things easier, or at least allow breaking the code up a bit (into the actual layer, plus some kind of managing class "above" it in the hierarchy). That could make it easier to restrict drawing to a specific image, since you could set the size of the SVG to match that of the image. (And then maybe even highlight the image region/border as the user moves the mouse over it, so people know which image they are on.)

But perhaps you are right that the need to keep things in sync raises lots of other issues. Do you mind sending a pull request for this? (We could, e.g. pull it into a dedicated branch while the architecture might still be changing.)

rsimon commented 3 years ago

On second thought, constraining already works pretty well at the level of the <g> elements, with the image size set through naturalWidth/Height. My feeling is that splitting the OSDAnnotationLayer into two parts might be useful and aid readability. (And maybe provide a clue as to whether this really could be solved in a plugin-based way, too.)

But IMO this works great, and it looks like it's really just a matter of a bit of refactoring more than anything else!

umesh-timalsina commented 3 years ago

@rsimon Could you elaborate more on splitting OSDAnnotationLayer into two parts? I have opened a PR with the changes I made (#67).

rsimon commented 3 years ago

Thanks for the PR!

I think what might work is to move all the code that relates to a single TiledImage into a separate class, maybe called "AnnotatableImage". That would hold the ref to the SVG <g>, the tool, the event handlers, and methods that encapsulate queries like "is this point on this image?", "is this annotation on this image?", etc.

And then the OSDAnnotationLayer would hold a list of AnnotatableImages, and handle the general management - deferred annotation loading, resizing the SVG layer etc.
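
In code, the proposed split might look something like this (a sketch only; the class and method names follow the comment above, and how to derive the source URL from the TileSource is an open assumption):

```javascript
// One AnnotatableImage per TiledImage: owns the SVG <g>, the drawing tool,
// and the "does this belong to me?" queries.
class AnnotatableImage {
  constructor(tiledImage, sourceUrl) {
    this.tiledImage = tiledImage;
    this.sourceUrl = sourceUrl; // deriving this from the TileSource is TBD
    // this.g = ...    ref to this image's SVG <g>
    // this.tool = ... drawing tool scoped to this image
  }

  // "Is this viewport point on this image?"
  containsPoint(pt) {
    const b = this.tiledImage.getBounds();
    return pt.x >= b.x && pt.x <= b.x + b.width &&
           pt.y >= b.y && pt.y <= b.y + b.height;
  }

  // "Is this annotation on this image?"
  containsAnnotation(annotation) {
    return !!annotation.target &&
           annotation.target.source === this.sourceUrl;
  }
}
```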

umesh-timalsina commented 3 years ago

@rsimon Thanks for keeping the feature branch here. I will try to incorporate what you suggested; it makes sense to have an AnnotatableImage class instead of iterating through a list every time.