dobkeratops opened 6 years ago
Here's another experiment in GIMP, a different workflow idea: (i) draw the major boundaries as black outlines in a layer, (ii) flood-fill between them with colour coding.
There's no overlap, and it's limited to however many colours you can easily distinguish, e.g. 7-ish (unless you split into more layers, in which case you might as well go back to channels).
It's easy to draw the boundaries, but you occasionally miss parts. You can set up the layer opacity to see what you're doing. It's easier to flick between a few categories in the colour palette.. there are fewer commands to use.
Plus points: easy use of paint tools/tablet, the work done for each edge is shared between adjacent regions, and the data is just a second image. Minus: you can only place a limited number of labels with colour coding, and you can't describe occluded parts.
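(A minimal sketch of what decoding such a colour-coded second image could look like - the palette and label names below are made up purely for illustration, nothing here is an existing format:)

```python
# Hypothetical sketch: turn a colour-coded annotation layer back into per-label masks.
# The palette is made up; in practice it would be whatever colours were flood-filled in GIMP.
import numpy as np
from PIL import Image

PALETTE = {
    "road":     (128, 128, 128),
    "pavement": (200, 200, 200),
    "grass":    (0, 200, 0),
    "building": (200, 0, 0),
}

def decode_masks(path):
    rgb = np.array(Image.open(path).convert("RGB"))
    masks = {}
    for label, colour in PALETTE.items():
        # exact colour match; the black outline pixels simply stay unlabelled
        masks[label] = np.all(rgb == np.array(colour, dtype=rgb.dtype), axis=-1)
    return masks

masks = decode_masks("annotation_layer.png")
print({label: int(m.sum()) for label, m in masks.items()})  # pixel count per label
```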
Wow, many thanks for the feedback - very much appreciated! I need to think a bit more about how we can add the concept of label blending to the current system, but I can definitely see its advantages.
What I am wondering is: do we need to define the blended-together labels already in the labels view, or can we leave the decision whether labels will be blended together up to the annotator?
Currently, we have a pretty straightforward concept: label first, then annotate. I think for "normal" labels that decision makes sense, but for blended-together labels I am not really sure.
I guess sometimes it's up to the annotator whether they want to combine labels or not (sometimes you have to combine, as there is no clear distinction between objects). And as the labeling person and the annotator aren't necessarily the same, I am wondering if it makes sense to define the label blending already at the label level. Or is it better to stick with base labels and let the annotator do the label blending?
I know that this doesn't fit 100% into the task-based approach (with the clear separation of concerns) that we have right now, but I think this could be a more natural approach.
edit: I only considered the "there-is-no-clear-separation-between-objects-and-that's-why-I-would-like-to-use-label-blending" use case - which is probably something that's up to the annotator and not the labeling person? But are there also cases where you could annotate objects separately, but you actually want to annotate them in conjunction? (I guess for those cases the usual workflow - label first, then annotate - could be better?)
> Do we need to define the blended-together labels already in the labels view or can we leave the decision whether labels will be blended together up to the annotator?
I've added quite a few foo/bar examples in 'add labels' where I could see it would make sense - but at the same time, I think sometimes it would be great when annotating to see related labels (and existing annotations). Like with 'person, man, woman'.. toggling between those to annotate all the people would probably be handy.
It might just be enough to let the user know the related labels exist (and ideally see their annotations in a fainter shade perhaps) for context ("ok, I'll annotate grass, but I know this mixed area can be done as a different grass/soil task later").
the optimum ordering would depend on context.. that's probably edging toward the combined task (imagine if you could draw the main one first then overlay the clear regions of another type)
I guess it also depends on how often one runs into the situation of needing a blended-together label. If it happens quite often, I could imagine that it's better to "plan ahead" and define the label already in the label view. (I guess it's also easier there, as you don't need to switch between keyboard and mouse.) But on the other hand, it probably also depends a lot on which device you are working on - if it's a laptop with a trackpad or even a smartphone you probably need label blending more often, whereas on a desktop computer you could annotate objects separately most of the time.
> it probably also depends a lot on which device you are working on
Definitely.. the difference between using a laptop and using a laptop + Wacom is massive.
Having said that, even with the Wacom many blended areas remain (e.g. where the separate parts are only a few pixels; this can be either in the distance or on intricate surfaces).
Basically annotation is a painting task.
The recent tweaks have made many cases much easier (hence the improved activity chart :) ). I can explain more clearly now what was wrong originally, and where to go to improve further:-
The original workflow is like being told to paint images one colour at a time: "here's a red brush.. here's a random image.. draw all the red. Now here's a green brush.. here's a different random image.. draw all the green.. etc."
Now with the browser mode, at least you pick one 'brush' and find the images where it's ok ... hence success with road, pavement (clear areas) - and of course the original single-subject stock photos were probably fine too.
The "add labels" mode works ok because it's more like asking "what palette does this image need?" .. that's often doable.
But the best way to handle an image is to see the context and gradually refine it: (i) pick the predominant areas and block them out ("start with the mid-tones and broad brush shapes"), then (ii) refine by clipping details out ("add the highlights and shadows.. refine the boundaries.."). The optimum order depends on the size and specific shapes (zoomed in, you clip out what isn't the subject.. but zoomed out, you start with the background then draw the subjects).
Saying "the unpainted area is a certain set of labels" might work ok, because this is like a background fill. (see also the idea of an 'unoccluded sky' label, and 'anti-label'- draw all the 'not-tree' in a forest). imagine being able to plan the task for an image, e.g. "we will start by filling with {this mix..}, then paint over {these details..}". (the idea of saying "un-annotated pixels are a mix of the unused labels" is kind of like that but more dynamic). I think that's also where I was going with the requests for "sparse/dense, few/many, foreground/background" label qualifiers: Those would hint whats the background fill, and what tasks are optimum to do first (related idea in rendering.. https://en.wikipedia.org/wiki/Painter%27s_algorithm, a layering order refines the boundaries). see #149 - the need for arbitrary expansion of the label list: drawing any object for occlusion is useful information.. waiting for labels is like saying "we can only draw certain pieces". it's often easier to draw the occluder rather than the precise boundary of a part you can see. e.g here's a peice of pavement/building, but to precisely identify the pixels today i'd have to trace around the litter bin, because there's no litter bin label yet. . It's fewer clicks to draw the broad area for the pavement & building, then the occluding object infront... (more occluding 'containers'. imagine being able to draw those in.. rather than tracing the outline around them
)
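(A minimal sketch of that painter's-algorithm idea, assuming hypothetical per-label layer-order hints - none of these names or priorities come from the existing system. The background fill goes down first, broad areas next, and occluding objects last, with each layer simply overwriting what's underneath:)

```python
import numpy as np

H, W = 480, 640

# Hypothetical layer-order hints: lower numbers are painted first, so later
# (higher) layers overwrite them - the painter's algorithm idea.
LAYER_ORDER = {
    "background_fill": 0,  # "un-annotated pixels are a mix of the unused labels"
    "road": 1,
    "pavement": 1,
    "building": 2,
    "litter_bin": 3,       # occluder, drawn over the broad pavement/building areas
    "person": 3,
}

def composite(masks):
    """masks: dict of label -> boolean (H, W) array of painted pixels."""
    out = np.full((H, W), "background_fill", dtype=object)
    for label in sorted(masks, key=LAYER_ORDER.get):
        out[masks[label]] = label  # later layers overwrite earlier ones
    return out

# E.g. draw the broad pavement area without tracing around the bin,
# then paint the occluding litter bin on top of it:
masks = {
    "pavement": np.zeros((H, W), bool),
    "litter_bin": np.zeros((H, W), bool),
}
masks["pavement"][300:, :] = True
masks["litter_bin"][320:400, 100:160] = True
label_map = composite(masks)
```

The point is just that the annotator never has to trace the bin's outline twice: the broad layer and the occluder layer share that edge implicitly.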
Going back to the label/colour analogy, the label list needs to be exhaustive - imagine trying to paint without all the primary colours.. (e.g. if there's no label "litter bin", a broader label "container" could still cover it).
(See also per-label occlusion hints, e.g. it's probably ok to assume certain labels (moveable object types) occlude others (ground types), but an override might be useful.)
Telling what colour something is can be difficult in the real world because of relative perception.. labels aren't far off, because many shapes in the distance aren't clear, and there are overlapping categories.. but sometimes once you figure out the surrounding context, those difficult parts get easier.
From this POV I'll reiterate the idea of general label blending #141: you do get areas where it's optimum to mix '2 colours' (2 labels).. this is especially true of any organic parts (e.g. a mixture of soil, bushes, grass, fallen leaves, bark, moss) but also anything moving. You sometimes get vegetation growing between paving stones (so a label "pavement/grass" would be optimal). See also #142 #84.
Then there are bits of road covered in gravel around a construction site ("gravel/road"), moving crowds, trash or litter (cleaning bot..).
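(A minimal sketch of how a blended label could be represented - hypothetical names, not the existing data model: a blend is just an unordered set of base labels, so it could equally be predefined in the label view or composed ad hoc by the annotator.)

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BlendLabel:
    # Hypothetical: a blended label is just an unordered set of base labels,
    # so "pavement/grass" and "grass/pavement" are the same blend.
    parts: frozenset

    @staticmethod
    def parse(text: str) -> "BlendLabel":
        return BlendLabel(frozenset(p.strip() for p in text.split("/")))

    def __str__(self) -> str:
        return "/".join(sorted(self.parts))

# defined up front in the label view...
predefined = BlendLabel.parse("gravel/road")
# ...or composed ad hoc by the annotator from base labels:
adhoc = BlendLabel(frozenset({"road", "gravel"}))
assert predefined == adhoc   # same blend either way
print(predefined)            # gravel/road
```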
Here's a nice example of complex ground:-
You might start with a blend 'concrete/soil/grass/gravel' to fill the floor ("fill with a grey/green/brown random texture"), then clip out the areas which are clearly one or two types ('the clear patches of grass, clear patches of concrete, etc'). As I'm looking at it now I'm still finding it hard to guess what the distant areas are (dry soil, or gravel, or dirty concrete?). It looks mostly concrete.. but the presence of grass implies other parts must be soil. Regarding distant vegetation, you might say 'grass/foliage' (for the bushier parts).
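(As a purely illustrative sketch of that fill-then-refine order, with made-up coordinates: every floor pixel starts as the broad blend, and the clearly identifiable patches are painted over it afterwards.)

```python
import numpy as np

H, W = 480, 640

# Start by filling the floor with the broad blend ("grey/green/brown random texture")...
label_map = np.full((H, W), "concrete/soil/grass/gravel", dtype=object)

# ...then clip out the areas which are clearly one or two types (rectangles are made up):
label_map[400:480, 0:200]   = "grass"          # a clear patch of grass
label_map[350:480, 300:640] = "concrete"       # a clear patch of concrete
label_map[100:200, 500:640] = "grass/foliage"  # bushier distant vegetation

# anything never overwritten stays labelled with the broad blend
print(np.unique(label_map.astype(str)))
```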