ImageMonkey / imagemonkey-core

ImageMonkey is an attempt to create a free, public open source image dataset.
https://imagemonkey.io

discussion - multi-user-polygon editing #51

Open dobkeratops opened 6 years ago

dobkeratops commented 6 years ago

I was thinking about how to get my own editing tool working online (currently I can only 'save' by copy-pasting JSON out of an alert box, lol), and quickly ran into this (I've not really built any web-apps before.)

So, LabelMe solves this problem as follows: you can only edit/delete polygons that you added. All other users' polygons are visible to you, but in read-only mode. Once you've added anything, it appears to be permanent unless you delete it (I presume their admins could clean up?)

Does this lend itself to Git-like version control .. merges, versioning? (would it even be possible to host the data IN GitHub, and programmatically submit updates..?)

How about a 'lock' based approach: the server knows if 2 users are viewing the same image simultaneously; it would try to avoid that happening in the first place with a 'get next random image' that avoids clashes. If one user starts editing polys.. other users are precluded from touching that.

The problem with locks is someone might leave a load of browser pages open.. but you could get around that with a timeout?
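Something like this is roughly what I have in mind for the lock bookkeeping - purely a sketch, all the names, the TTL and the in-memory map are made up, not anything ImageMonkey actually has:

```js
// Hypothetical per-image lock with an expiry; purely illustrative.
const LOCK_TTL_MS = 10 * 60 * 1000;      // lock silently expires after 10 minutes
const locks = new Map();                  // imageId -> { userId, expiresAt }

function tryAcquireLock(imageId, userId, now = Date.now()) {
    const held = locks.get(imageId);
    if (held && held.expiresAt > now && held.userId !== userId) {
        return false;                     // someone else holds a still-valid lock
    }
    locks.set(imageId, { userId, expiresAt: now + LOCK_TTL_MS });
    return true;                          // acquired (or refreshed) the lock
}

function releaseLock(imageId, userId) {
    const held = locks.get(imageId);
    if (held && held.userId === userId) locks.delete(imageId);
}
```

A stale lock then just ages out on its own, so abandoned browser tabs stop mattering after the timeout.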

What I imagine with the 'pro-mode' is long detailed sessions (e.g. images open for 10's of minutes at a time).

Could you have every user's polygons stored as they want them (as if they're just separate annotation files), and the validation tool produce the filtered consensus view? Any user can ask for their own picture, or the current consensus.

I suppose it could even be down to alpha blending, if you imagine each user's labels blended together (ambiguity really does show up as fuzzy labels), although the 'LabelGraph' could do better in figuring out some cases where users are submitting synonyms/refinements. (i.e. if one person says "these pixels are a Car", and another says "these pixels are a Hatchback" .. those are entirely consistent.)
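For the synonym/refinement check, a tiny 'LabelGraph' sketch - the parent links here are invented, the real ones would come from the actual label graph:

```js
// Hypothetical label graph: child -> parent (invented example entries).
const parentOf = { hatchback: "car", car: "vehicle" };

// Walk up the graph collecting a label and all of its ancestors.
function ancestors(label) {
    const seen = new Set([label]);
    while (parentOf[label]) {
        label = parentOf[label];
        seen.add(label);
    }
    return seen;
}

// "these pixels are a Car" and "these pixels are a Hatchback" are consistent,
// because one label refines the other.
function labelsConsistent(a, b) {
    return ancestors(a).has(b) || ancestors(b).has(a);
}

console.log(labelsConsistent("hatchback", "car")); // true
console.log(labelsConsistent("hatchback", "dog")); // false
```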

I don't know much about what the database engines can do, what options you have. I wondered if I could get away with this just being a bunch of files holding each user's submissions (no DB engine), and those files are accumulated to produce the final dataset.

What do you currently do? ... Perhaps your separated workflow idea avoids this problem because data is being submitted in smaller chunks, more orchestrated by the application?

One thing to point out is that actual multi-user realtime feedback could be more engaging; it would give you a sense of collaboration.. but that could be achieved other ways, e.g. a stream of updates (you could just show an image in the corner of the latest uploaded labels)

bbernhard commented 6 years ago

I was thinking about how to get my own editing tool working online (currently I can only 'save' by copy-pasting JSON out of an alert box, lol), and quickly ran into this (I've not really built any web-apps before.)

Awesome - already looking forward to that ;-)

Does this lend itself to Git-like version control .. merges, versioning? (would it even be possible to host the data IN GitHub, and programmatically submit updates..?)

Interesting idea, never thought of it that way. But you are right, it has some aspects of a revision control system.

However, with Git as a storage system, I would see the following problems:

btw: If it's just about the versioning/history, we could probably do a lot with 'temporal tables'. ImageMonkey already has the concept of temporal tables built in (I recently wrote a little bit about it, in case you are interested [2]). With the temporal tables in place, we can already do stuff like that:

How about a 'lock' based approach: the server knows if 2 users are viewing the same image simultaneously; it would try to avoid that happening in the first place with a 'get next random image' that avoids clashes. If one user starts editing polys.. other users are precluded from touching that.

The problem with locks is someone might leave a load of browser pages open.. but you could get around that with a timeout?

The problems I see with locking are:

I think it's pretty hard to detect those cases, as we do not have any information that could identify a user uniquely. We probably can't even block the IP address, as it could be an IP which is shared among other users (carrier-grade NAT).

That's mainly the reason why I came up with the (lock-less) separated workflow. With that you circumvent those problems a little bit. As the annotation task is always pretty specific (e.g. "Annotate all bananas"), users probably won't need much time to accomplish the task before they move on to the next one. So instead of locking the picture, a user just gets a random task assigned (e.g. "Annotate all dogs"). If there are enough annotation tasks in the pool, the probability that two users are working on the same annotation task is pretty low and you don't need to worry about locks.

But in case there really are two users working on the same annotation task, then one user's annotation wins. The other one's annotation gets silently discarded. So for every picture and label only one annotation entry at a time exists in the database. That's why it is important that all occurrences are labeled. If an annotation entry is not correct (e.g. not all dogs in the picture are annotated) then users would probably spot that in the "Validate Annotations" tab.

In case a picture has a really low validation rate (e.g. "28 out of 30 people say that not all occurrences of a dog are annotated"), then the corresponding annotation(s) will be removed and users can again mark all occurrences of a dog in the validation tab. So it's meant a little bit like a self-regulating system with a feedback loop (at least in theory ;-))
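As a rough sketch of that feedback rule (the 90% threshold and the counters are made up, not what's actually stored in the database):

```js
// Illustrative only: the threshold and counter names are invented, not the real rules.
function shouldResetAnnotation(numSayIncomplete, numValidationsTotal, threshold = 0.9) {
    if (numValidationsTotal === 0) return false;
    return numSayIncomplete / numValidationsTotal >= threshold;
}

// "28 out of 30 people say that not all occurrences of a dog are annotated"
console.log(shouldResetAnnotation(28, 30)); // true -> remove the annotation, put it back in the pool
```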

But I totally understand that such a separated workflow doesn't really work for external annotation tools. So we have to come up with a solution that doesn't require users to sign up before they can contribute, but still supports the "traditional" annotation workflow.

Just some (random) thoughts: I think locking in general could work, if we keep the lock only for a specific time (e.g. 10 min). After that time the lock gets automatically removed (no matter whether the user has committed the changes or not). Let's assume for a minute we would do this.

Now, the only other problem I would see is that someone abuses the locking mechanism to lock all the images in the database at the same time. So, would it be an option to create some API endpoints that are "(third party) developer only" and protected with an API token? In order to create an API token, you would need to sign up as a developer. As the API token is then bound to a developer's account, it's easier to block the token in case we detect abuse.
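The token binding could then look roughly like this - just an illustration on top of a per-image lock function (the names, the limit and `tryAcquireLock` are assumptions along the lines of the lock sketch further up, not an existing API):

```js
// Hypothetical token-bound locking; all names invented. Assumes a helper like
// tryAcquireLock(imageId, holder) -> boolean from the lock sketch above.
const MAX_LOCKS_PER_TOKEN = 20;
const locksHeldByToken = new Map();       // apiToken -> number of currently held locks

function acquireLockForToken(imageId, apiToken, knownTokens) {
    if (!knownTokens.has(apiToken)) {
        return { ok: false, reason: "unknown token" };        // must be a registered developer
    }
    const held = locksHeldByToken.get(apiToken) || 0;
    if (held >= MAX_LOCKS_PER_TOKEN) {
        return { ok: false, reason: "lock limit reached" };   // stops one token locking everything
    }
    if (!tryAcquireLock(imageId, apiToken)) {
        return { ok: false, reason: "image already locked" };
    }
    locksHeldByToken.set(apiToken, held + 1);
    return { ok: true };
}
```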

I don't know much about what the database engines can do, what options you have. I wondered if I could get away with this just being a bunch of files holding each user's submissions (no DB engine), and those files are accumulated to produce the final dataset.

I think the file approach could work to a certain degree (probably as long as all the data fits into memory), but as soon as you reach that threshold it could get pretty slow. At least that's my (limited) experience with it.

I am working at a company where we are facing a similar issue at the moment. We have thousands of configuration files (each of them containing only 50-100 parameters) and our biggest problem is file I/O. As soon as the number of files reaches a certain threshold, reading, writing and aggregating data gets really slow and painful.

[1] https://github.com/Microsoft/GVFS [2] https://imagemonkey.io/blog/general/2017/10/28/ImageMonkey-v0.2.html

dobkeratops commented 6 years ago

I guess we can just hope clashes 'don't matter': with hopefully a huge array of images, the chance of two people editing the same polygon simultaneously is just low; or, validation combined with the LabelMe approach means mistakes are caught.

Haven't moved on this yet.

I've just refined the simple browser-based idea a little, discovering a few odds and ends ('save' now causes a download of a JSON text file, which itself can be drag-dropped to 'load'.. that's a bit better than copy-pasting in a dialogue box). Still not enough for casual users though.

Merging: I imagine that it could be kinda hard to automatically resolve merging conflicts. I mean we could probably prevent a lot of merging conflicts

I wonder if merges could be done as a validation task, beyond just validating either option ('which of these two polys best covers the object..')

I am working at a company, where we are facing a similar issue at the moment. We have thousands of configuration files (each of them containing only 50 - 100 parameters) and our biggest problem is file I/O.

I wonder if you could have a tree of aggregation, but maybe there's no way around scanning for the changes (bucketed dirty flags?). Maybe here aggregation needn't be fast; the full dataset could be 'compiled' periodically prior to handing it to lengthy training tasks. A change has to go through validation steps anyway, so making it immediately visible might be bad anyway.

print a trend chart, which shows the number of validations over time

That's definitely something I wish LabelMe had - an indication of activity, to reassure you it's not a dead project

bbernhard commented 6 years ago

I guess we can just hope clashes 'don't matter': with hopefully a huge array of images, the chance of two people editing the same polygon simultaneously is just low.

yeah, right. However, without any locking in place, the probability that we get clashes increases if people are annotating a lot of objects in a single picture. But as long as clashes happen rarely I don't see a problem with it....if they happen too often, we'll probably think about a locking mechanism.

I wonder if merges could be done as a validation task, beyond just validating either option ('which of these two polys best covers the object..')

That's a nice idea! If we get some clashes then that would be a really cool mechanism to resolve those ones :) Really like that!

I've just refined the simple browser-based idea a little, discovering a few odds and ends ('save' now causes a download of a JSON text file, which itself can be drag-dropped to 'load'.. that's a bit better than copy-pasting in a dialogue box). Still not enough for casual users though.

I just played a little bit with the drawing library we are using at the moment (fabric.js) to see how much work it would be to add different shapes (ellipse, polygon). Fortunately it isn't that much work. There is still a lot of room for improvement and it's far from being perfect, but I think it could already be useful at this stage.
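For reference, adding a polygon with fabric.js is roughly this (the coordinates, styling and canvas id are placeholders, not the actual code):

```js
// Minimal fabric.js polygon sketch; shape and colours are made up for illustration.
const canvas = new fabric.Canvas('annotation-canvas');

const polygon = new fabric.Polygon(
    [{ x: 120, y: 60 }, { x: 220, y: 90 }, { x: 200, y: 180 }, { x: 100, y: 150 }],
    { fill: 'rgba(255, 0, 0, 0.3)', stroke: 'red', strokeWidth: 2, selectable: true }
);

canvas.add(polygon);      // the polygon becomes a movable/selectable object on the canvas
canvas.renderAll();
```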

Nevertheless, I think another browser based tool (which covers a different workflow) would be totally awesome. I imagine that there are some people out there that prefer the "traditional" (LabelMe) way of annotating pictures. If there is a tool for those people, it would be really great!

(attached screenshot: polygon_annotation)

That's definitely something I wish LabelMe had - an indication of activity, to reassure you it's not a dead project

I think that should be easily doable :)

dobkeratops commented 6 years ago

Continued to experiment with my tool... it's feature-creeping into a 'mesh editor that happens to use an image for reference', with tools like subdivision/extrusion. Just seeing how much you can get away with in browser-based code.

Even adding a proper z coordinate might be useful, e.g. distinguishing background/foreground (LabelMe has an 'occlusion' flag, but what if several things overlap..), giving an indication of scale.. if you know several annotations are 'car', 'person', you have estimates of sizes ('roughly 2m high') from which a guess at 'z' could be made. With a proper order known, it could count on occlusion to figure out per-pixel masks, i.e. no need to actually click out the 'sky' boundary, it's just everything behind 'buildings', etc.
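A rough sketch of that 'z' guess, assuming a simple pinhole model - the focal length and object heights here are pure assumptions, nothing the tool stores:

```js
// Rough depth guess from an assumed real-world height (simple pinhole model).
// focalLengthPx and the height guesses are illustrative assumptions.
function estimateZ(focalLengthPx, assumedHeightMeters, annotationHeightPx) {
    return focalLengthPx * assumedHeightMeters / annotationHeightPx;
}

// A 'person' (~1.7 m tall) annotated 200 px high, with a guessed ~1000 px focal length:
console.log(estimateZ(1000, 1.7, 200)); // ≈ 8.5 (metres away)
```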

For annotating ground surfaces, I think a proper mesh is useful, e.g. road-curb-edge-pavement-buildings all have common edges. With an extrusion or subdivision approach you don't have to click those out again.. you work on the boundaries.

Perhaps I can separate that all out as a library and keep an annotation/rotoscoping/photogrammetry tool that uses it and keeps the UI focused (i.e. one view which is locked around an image).

bbernhard commented 6 years ago

Sounds really good! Please let me know, if there are updates in your annotation tool - looking forward to play around with it :)

Don't know if it's easily doable, but if you could separate the logic/algorithm part from the UI part, that would be really great. I could imagine that there will be some other tools in the future which maybe use a different UI stack, but still use the same logic/algorithm part. In case the logic/algorithm part is separated, one could clone the library separately and use it to create another tool :)

dobkeratops commented 6 years ago

Don't know if it's easily doable, but if you could separate the logic/algorithm part from the UI part, that would be really great.

Yeah, that's essential really as it grows. I need to look into how JS splits code up. I realise there are startup time issues as well (if you have a bloated page.. it'll take longer to load), and we already talked about the benefits of task-focussed custom UI. I imagine splitting it up into a few modules, and we just happen to bring in the parts needed for image tracing.

dobkeratops commented 6 years ago

Please let me know, if there are updates in your annotation tool

ok I updated it, https://dobkeratops.github.io/imageannotation.html

so now 'save' just triggers a download (gives you 'annotations.json'), and you can also drag that back in to load it.
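For anyone curious, the save/load mechanism boils down to something like this (a sketch; `loadAnnotations` stands in for whatever handler consumes the parsed JSON):

```js
// 'save': serialize the annotations and trigger a browser download.
function saveAnnotations(annotations, filename = 'annotations.json') {
    const blob = new Blob([JSON.stringify(annotations, null, 2)], { type: 'application/json' });
    const a = document.createElement('a');
    a.href = URL.createObjectURL(blob);
    a.download = filename;                 // makes the click a download instead of navigation
    a.click();
    URL.revokeObjectURL(a.href);
}

// 'load': drag-drop the JSON file back onto the page.
window.addEventListener('dragover', (e) => e.preventDefault());
window.addEventListener('drop', (e) => {
    e.preventDefault();
    const file = e.dataTransfer.files[0];
    if (!file) return;
    const reader = new FileReader();
    reader.onload = () => loadAnnotations(JSON.parse(reader.result)); // loadAnnotations: your own handler
    reader.readAsText(file);
});
```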

(the mesh-manipulation tools are currently only accessible from hotkeys, the exact arrangement will change i.e. what modifier keys do etc.)

I'm partway through adding hierarchy (parts) support, so it'll be able to represent the LabelMe format properly soon

bbernhard commented 6 years ago

Many thanks for updating! :)

I realise there are startup time issues as well (if you have a bloated page.. it'll take longer to load)

Is that maybe what I have run into? (see attached GIF) Immediately after loading the page, the image gets blurry and I can see some debug messages in the Chrome console. Looks like it is loading some annotations?

(attached GIF: imageannotation)

I'm partway through adding hierarchy (parts) support, so it'll be able to represent the LabelMe format properly soon

Awesome!

I am currently thinking about a proper API to support this new annotation workflow. I already have a vague idea how it could look. But before I start with that, I think we should have our new label hierarchy in place (#47), as the internal label structure probably also affects how fine-grained the API will be.

dobkeratops commented 6 years ago

Is that maybe what I have run into?

hmm, not sure.. I had something blurring it, but that's supposed to be deleted; it's only supposed to load anything if you drag-drop onto it.

I've got the hierarchy support working now; the UI is only rudimentary (click 'info' to see, and 'ctrl-p' to activate a 'parenting/parts' tool). (I've updated again.) The polygon elements are nestable. ('polygon' is really an 'object' in modeller terms; eventually I might give it a type, rename it, then support actual 'splines', 'meshes', etc.) I still like LabelMe's streamlining where there's an optional 'current parent' for newly drawn parts.. I need to think how to retrofit that over the fact I have a 'current poly' selection; you might not always want the 'current poly' to be the parent. (Maybe it can check if you're drawing further away.)
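The nesting is roughly along these lines (field names are illustrative, not the exact file format):

```js
// One possible nesting for 'parts'; labels and fields are made up for illustration.
const annotation = {
    label: "car",
    points: [ /* outline vertices */ ],
    children: [
        { label: "wheel", points: [ /* ... */ ], children: [] },
        { label: "wheel", points: [ /* ... */ ], children: [] }
    ]
};

// Walk the hierarchy, e.g. to flatten it back into LabelMe-style parent references.
function visit(node, parent = null, out = []) {
    out.push({ label: node.label, parent: parent ? parent.label : null });
    for (const child of node.children) visit(child, node, out);
    return out;
}
console.log(visit(annotation)); // [{car, parent:null}, {wheel, parent:car}, {wheel, parent:car}]
```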

I envisage some of the 'parts' stuff happening more naturally through subdivision (build a 'ground', split it into 'pavement', 'road'..).. but you're still going to want the simple ability to throw points or bounding boxes on the eyes etc.

I've only been running this on Safari (Mac); I'll have to take a look at how it runs on other browsers

dobkeratops commented 6 years ago

Just added zooming in my tool (cursor keys, +/- to navigate.. I can't figure out if the browser lets you get the mouse wheel; I'm a fan of mouse wheel zooming..); tangentially it struck me that might be another option to prevent clashes, i.e. if 2 people zoom on different parts of the same image, their work won't interfere. You could still show updates in an overview (which is nice to have when zoomed in)

bbernhard commented 6 years ago

hmm, not sure.. I had something blurring it, but that's supposed to be deleted; it's only supposed to load anything if you drag-drop onto it.

I tried it today again, and now it works. Awesome work! You are definitely on to something here. Most of the annotation tools I have tried so far only have some simple bounding box functionality. It's really great to see something more advanced with hotkey support.

'ctrl-p' to activate a 'parenting/parts' tool

I would consider using a different hotkey (for Windows). On Windows that key combination opens the printing dialog :D

Just added zooming in my tool (cursor keys, +/- to navigate..

Looks really good! :)

I can't figure out if the browser lets you get the mouse wheel; I'm a fan of mouse wheel zooming..

I think there should be a "wheel" event, which you could listen to. (it should get you the delta, if I remember correctly)

zoom on different parts of the same image, their work won't interfere. you could still show updates in an overview (which is nice to have when zoomed in)

You mean cutting the image into pieces (either client side or server side) and serving the image piece to the user to annotate? Interesting idea! Although there is probably the possibility that you cut out some pieces that aren't annotatable. But in that case you could always request another piece.

dobkeratops commented 6 years ago

I think there should be a "wheel" event, which you could listen to. (it should get you the delta, if I remember correctly)

ok, I got that working. This was a big thing I wanted in LabelMe.. 'zoom on point' eliminates the need for separate pan/zoom controls; it's a very convenient way to navigate.
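The 'zoom on point' maths is roughly this (a sketch; `canvasElement` and the view/scale state stand in for the tool's own):

```js
// Keep the pixel under the cursor fixed while scaling the view.
// Screen coords are assumed to be: screen = world * scale + view.
let scale = 1, viewX = 0, viewY = 0;

canvasElement.addEventListener('wheel', (e) => {
    e.preventDefault();
    const zoom = Math.exp(-e.deltaY * 0.001);           // wheel delta -> smooth zoom factor
    const rect = canvasElement.getBoundingClientRect();
    const mx = e.clientX - rect.left;                    // cursor position in canvas pixels
    const my = e.clientY - rect.top;
    // The world point under the cursor stays put: view' = mouse - (mouse - view) * zoom
    viewX = mx - (mx - viewX) * zoom;
    viewY = my - (my - viewY) * zoom;
    scale *= zoom;
    // redraw() would then use ctx.setTransform(scale, 0, 0, scale, viewX, viewY)
}, { passive: false });
```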

You mean cutting the image into pieces (either client side or server side) and serving the image piece to the user to annotate? Interesting idea!

That is also doable - in fact that would be easier (well, more work in the server, but less UI) but what I imagined was still giving the user a sense of collaboration, showing other people's work in near realtime.

You are definitely on to something here. Most of the annotation tools I have tried so far

I hope it can be of use; if not, I enjoy having a fresh sourcebase in my head to experiment with. We'll see how things go with complexity/discoverability.. as you can probably tell from my suggestions, I have a lot of potential tangents to explore.

I'm sure there are similar things more fleshed out already out there, i.e. web-based modelling tools (especially with WebGL)... but someone needs to actually adapt them to an image-based workflow (.. but I know this is something that happens in 'real' 3d packages.. artists building from reference)

'ctrl-p' to activate a 'parenting/parts' tool

Yeah, that's worth changing. In the end it should be possible to have something with both onscreen buttons and hotkeys, maybe like the Microsoft ribbon idea where 'main buttons' bring up secondaries (e.g. main: 'draw', 'adjust'.. -> if you pick 'draw', the second half of the toolbox shows 'draw line'/'draw poly' etc.; 'adjust' would show the move, split, etc. tools, and so on; but they could all show the hotkeys).

There are various different ways of doing point manipulation too.. in the past I've preferred the tools which use LMB/MMB/RMB in an integrated way, but it's probably better today to stick with something that revolves around one mouse button (more adaptable to a touchscreen) with the RMB being a context menu (which could itself just be a shortcut to functions accessible on the ribbon). I'd like to try a pie menu (like Maya)

dobkeratops commented 6 years ago

update - I've experimented a bit with iPad/Android multitouch support (pinch-zoom navigation), and a 'lasso' draw tool which might be more comfortable for touchscreens (not quite as good as LabelMe masks, but in a similar 'freehand' spirit). I was thinking it might be nice to store a 'fuzziness'/'precision' parameter (see also 'metaballs') with primitives, so things clicked out roughly can be flagged as such.

The implication would be to apply a Gaussian blur to the generated masks.
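i.e. something along these lines when rendering a mask (a sketch; note that `ctx.filter` isn't supported by every browser, older Safari in particular):

```js
// Render a polygon mask with a per-primitive 'fuzziness' used as a blur radius.
function drawFuzzyMask(ctx, points, fuzzinessPx) {
    ctx.save();
    ctx.filter = `blur(${fuzzinessPx}px)`;    // soft edge ~ how roughly it was clicked out
    ctx.fillStyle = 'rgba(255, 255, 255, 1)';
    ctx.beginPath();
    ctx.moveTo(points[0].x, points[0].y);
    for (const p of points.slice(1)) ctx.lineTo(p.x, p.y);
    ctx.closePath();
    ctx.fill();
    ctx.restore();
}
```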

(What if 'tap-the-squares' with quad-tree zoom refinement generated the same.. create a fuzzy vertex at each quad-tree cell centre.. I guess that would be even easier on tablets.)

bbernhard commented 6 years ago

ok, I got that working. This was a big thing I wanted in LabelMe.. 'zoom on point' eliminates the need for separate pan/zoom controls; it's a very convenient way to navigate.

That's great! However, if possible, I would still consider adding some (optional) controls for mobile users and trackpad users (e.g. I am most of the time working on my laptop, without a real mouse connected to it).

That is also doable - in fact that would be easier (well, more work in the server, but less UI) but what I imagined was still giving the user a sense of collaboration, showing other people's work in near realtime.

Aaah, now I get it. :) That's a really cool idea. Probably quite challenging to make that work in realtime, but when it's working it definitely could be motivating to see other people collaborate.

update - I've experimented a bit with iPad/Android multitouch support (pinch-zoom navigation), and a 'lasso' draw tool which might be more comfortable for touchscreens (not quite as good as LabelMe masks, but in a similar 'freehand' spirit).

awesome!

One thing I would also consider early in the process: a responsive design. I know that mobile and tablet devices are not your main focus, but for me personally that would be a killer feature. There are quite a few frameworks out there (Semantic UI, Skeleton, Less Framework..) which do the heavy lifting for you, so in the end it wouldn't be that much work to implement.

It would be really awesome if the same set of functionality were available on all devices (mobile, desktop, tablet..), with maybe some special modes for certain devices (e.g. "tap the squares" on touchscreen devices). It's most probably not the most productive way to create annotations on your mobile phone (as the screen is really tiny compared to a PC), but alone the fact that I could do it would blow me away (if I stumbled across your tool by chance). I think this could be a really powerful way to get people hooked on your annotation tool.

Yeah that's worth changing. in the end it should be possible to have something with both onscreen buttons and hotkeys , maybe like the microsoft ribbon idea where 'main buttons' bring up secondaries. (e.g. main:'draw', 'adjust' ..' -> if you pick 'draw' , the second half of the toolbox shows 'draw line'/'draw poly' etc.. 'adjust' would show the move, split, etc tools, and so on; but they could all show the hotkeys.

sounds good!

but it's probably better today to stick with something that revolves around 1 mouse button (more adaptable to a touchscreen) with the RMB being a context-menu (which could itself just be a shortcut to functions accessible on the ribbon). I'd like to try a pie-menu (like maya)

totally agreed.

dobkeratops commented 6 years ago

(e.g. I am most of the time working on my laptop, without a real mouse connected to it).

'Mouse wheel zoom' is still triggered on a laptop; 2-finger scrolling does it. I am also primarily using a laptop here.. making this better on a trackpad than LabelMe is a focus. I'm well aware of the 'relax on the sofa' workflow.

I know that mobile and tablet devices are not your main focus,

I'm going for a middle ground: "must be workable on the tablet, but accept that a dedicated tablet app would beat it, and don't hold the desktop back". (I don't expect a JS/canvas app to match the iOS SDK/Swift..) But just having seen the ease with which JS/canvas stuff does work on iOS, Android, PC & Mac does actually inspire me to pay attention to it. (Here I've got a Linux desktop PC, MacBook Pro, iPad, and 7" Android tablet to try it on.)

e.g.:

  • What it does do right now is change the 'snap radius' & visible point size to be much bigger as soon as it detects touch events;
  • I'm also happy to make the 'button layout' touch-first (e.g. adding a 'cancel' button which on desktop I just take for granted as an obvious 'escape' hotkey). (What if the placement of the panel made 2-handed use easier?)
  • Another idea is that the 'right-mouse-button' function can be replicated as an informative button, e.g. the draw-poly tool uses RMB to 'close current poly', etc. Having the button still helps the desktop user by showing him what RMB does, so it's win-win.
  • I made another tweak: in the 'adjust/edge-split' tool it was too easy to accidentally make new splits, so I made the snap radius for actual splits smaller (carrying across both).

... and so on

(e.q: "tap the squares" on touchscreen devices).

I see 'tap-the-squares' as a touch-first design (which could of course still work on desktop) and probably separate from the current tool I'm building, i.e. it doesn't need all the vertex/poly tools.. there is the quad-tree refinement idea to consider that could make it more interesting, but I'm not even sure we need that, as we could make the server look for areas of change and just present those as zoomed-in jobs.

I also wonder if something might be possible somewhere inbetween the lasso, 'touch-the-squares', and 'label-me mask mode'. If I change my 'lasso tool' to a 'scribble tool' (storing a real 'edges/spline object' instead of polys; at the moment I just store those as 2-point polys), and have it attempt to 'flood fill' between the scribbles (in the manner I described on another discussion), that might be 'as easy to use as touch-the-squares', whilst still being something that could be precise.
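A very rough sketch of the 'fill between the scribbles' part, using a plain colour-tolerance flood fill as a stand-in for whatever smarter criterion (edges, graph cuts..) it would really need:

```js
// Grow a mask outward from the scribbled pixels, stopping where the image colour
// differs too much from the seed colour. Purely illustrative thresholding.
function scribbleFill(imageData, seeds, tolerance = 30) {
    const { width, height, data } = imageData;
    const mask = new Uint8Array(width * height);
    const stack = [];

    const colorAt = (x, y) => {
        const i = (y * width + x) * 4;
        return [data[i], data[i + 1], data[i + 2]];
    };
    const close = (a, b) =>
        Math.abs(a[0] - b[0]) + Math.abs(a[1] - b[1]) + Math.abs(a[2] - b[2]) <= tolerance;

    for (const s of seeds) {                  // seeds: [{x, y}, ...] sampled along the scribble strokes
        stack.push([s.x, s.y, colorAt(s.x, s.y)]);
    }
    while (stack.length) {
        const [x, y, seedColor] = stack.pop();
        if (x < 0 || y < 0 || x >= width || y >= height) continue;
        const idx = y * width + x;
        if (mask[idx]) continue;              // already filled
        if (!close(colorAt(x, y), seedColor)) continue;
        mask[idx] = 1;
        stack.push([x + 1, y, seedColor], [x - 1, y, seedColor],
                   [x, y + 1, seedColor], [x, y - 1, seedColor]);
    }
    return mask;                              // 1 = inside the filled region
}
```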

If there's just a 'blur radius' stored with the polys, that could automatically be much bigger when using a tablet. I'm just looking at how to get images in and out of canvas etc.; I want to render the masks and save clips out or something as a test, which could then use blurred alpha channels*

There are quite a few frameworks out there (Semantic UI, Skeleton, Less Framework..), which do the heavy lifting for you, so in the end it wouldn't be that much work to implement.

At the minute I'm going off simplistic use of raw JavaScript and finding my way around the canvas API. I should take a look at some libraries (I see people talking about 'jQuery' etc.), but I worry a bit about having to figure out someone else's event wrappers, whereas here I build my concept of a 'tool' (with handlers) into the system, and I can work out how 'render-feedback' slots into it and so on. Having said that, when it comes to a proper 'object hierarchy view' I can see the UI libs would save a lot of time.

(* A nice gimmick, if there was an appropriately trained neural net somewhere, would be to use the annotations to change an image into layers: imagine cutting out the annotated objects, but filling in the holes with a NN trained on general images, so you could peel the layers back and see a reasonable guess at what's behind. I'm sure that's possible given what people are doing with GANs these days.

That's the sort of 'toy' that might make the whole thing more interesting. Similarly, imagine a 'reasonable guess at a 3d scene from a photograph', if you take the cutouts and place them like billboards with Z. I've got the thing set up as a 3d modeller with optional 3d views, even though the annotation task alone doesn't need it.)

bbernhard commented 6 years ago

'Mouse wheel zoom' is still triggered on a laptop; 2-finger scrolling does it

aaah...cool. wasn't aware of that - thanks for the info! ;-)

I'm going for a middle ground: "must be workable on the tablet, but accept that a dedicated tablet app would beat it, and don't hold the desktop back". (I don't expect a JS/canvas app to match the iOS SDK/Swift..) But just having seen the ease with which JS/canvas stuff does work on iOS, Android, PC & Mac does actually inspire me to pay attention to it. (Here I've got a Linux desktop PC, MacBook Pro, iPad, and 7" Android tablet to try it on.)

  • What it does do right now is change the 'snap radius' & visible point size to be much bigger as soon as it detects touch events;
  • I'm also happy to make the 'button layout' touch-first (e.g. adding a 'cancel' button which on desktop I just take for granted as an obvious 'escape' hotkey). (What if the placement of the panel made 2-handed use easier?)
  • Another idea is that the 'right-mouse-button' function can be replicated as an informative button, e.g. the draw-poly tool uses RMB to 'close current poly', etc. Having the button still helps the desktop user by showing him what RMB does, so it's win-win.
  • I made another tweak: in the 'adjust/edge-split' tool it was too easy to accidentally make new splits, so I made the snap radius for actual splits smaller (carrying across both).

Awesome! :)

I am almost daily playing a little bit with your annotation tool. It's great to see how it evolves over time (and bugs that were there yesterday are fixed now ;)) and it gets better and better. :)

I also wonder if something might be possible somewhere inbetween the lasso, 'touch-the-squares', and 'label-me mask mode'. If I change my 'lasso tool' to a 'scribble tool' (storing a real 'edges/spline object' instead of polys; at the moment I just store those as 2-point polys), and have it attempt to 'flood fill' between the scribbles (in the manner I described on another discussion), that might be 'as easy to use as touch-the-squares', whilst still being something that could be precise.

As you mentioned the lasso tool: Photoshop has a function called "quick select" which is similar to the lasso tool, but more intelligent when it comes to selecting the surrounding pixels. I think Photoshop is still the only software that uses this feature (at least I haven't found it in an Open Source product yet), but I wonder, if that could also be used for image annotation.

It's also pretty hard to find information about the algorithm on the internet. The only thing I found is this paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.212.9867&rep=rep1&type=pdf

At the minute I'm going off simplistic use of raw JavaScript and finding my way around the canvas API. I should take a look at some libraries (I see people talking about 'jQuery' etc.), but I worry a bit about having to figure out someone else's event wrappers, whereas here I build my concept of a 'tool' (with handlers) into the system, and I can work out how 'render-feedback' slots into it and so on. Having said that, when it comes to a proper 'object hierarchy view' I can see the UI libs would save a lot of time.

I am myself not a JavaScript master, but as you mentioned jQuery: the cool thing about it is that it hides some "dirty hacks" from you that you would otherwise often need to implement to make certain things cross-browser compatible (browsers are not that good when it comes to sticking to web standards).

I also wonder if something might be possible somewhere inbetween the lasso, 'touch-the-squares', and 'label-me mask mode'. If I change my 'lasso tool' to a 'scribble tool' (storing a real 'edges/spline object' instead of polys; at the moment I just store those as 2-point polys), and have it attempt to 'flood fill' between the scribbles (in the manner I described on another discussion), that might be 'as easy to use as touch-the-squares', whilst still being something that could be precise.

Cooool idea...I wonder if that would work :)

(* a nice gimmick, if there was an appropriately trained neural net somewhere, would be to use the annotations to change an image into layers: imagine cutting out the annotated objects, but fill in the holes with a NN trained on general images, so you could peel the layers back and see a reasonable guess at what's behind. I'm sure that's possible given what people are doing with GANs these days.

That's a really great idea. :D