ImageMonkey / imagemonkey-core

ImageMonkey is an attempt to create a free, public open source image dataset.
https://imagemonkey.io

Few notes on label explosion and conventions #261

Open dobkeratops opened 4 years ago

dobkeratops commented 4 years ago

Some common “main labels” (man, woman, child, person) could be picked out with grep, i.e. any multi-word free label containing “man” should be safe to reduce to “man”. Counter-examples: “man-o-war” (could be hyphenated) and “manhole cover” (“manhole” is a single word).

You will find labels like “man sitting”, “man riding skateboard”, “Asian woman sitting”, etc.

“car” very nearly fits, but there are a few more counter-examples: “cable car” (doesn’t sit on wheels) and “train passenger car” (not a road car).

Simple examples: “parked car”, “open top car”, “vintage car”, “luxury car”, etc. ... all safe to reduce to “car”.

Perhaps we could focus on finding the counter examples to a rule like this, so the algorithm becomes “extract the biggest known labels first, then approximate to the simple labels found”
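As a rough illustration of that rule, here is a minimal Python sketch. The label lists and the `reduce_label` helper are hypothetical (nothing here is part of ImageMonkey): known counter-examples are checked first, then a whole-word match picks the main label.

```python
import re
from typing import Optional

# Illustrative lists only; the real ones would be curated from the dataset.
MAIN_LABELS = ["man", "woman", "child", "person", "car"]
COUNTER_EXAMPLES = ["man-o-war", "manhole cover", "cable car", "train passenger car"]

def reduce_label(free_label: str) -> Optional[str]:
    """Return the main label a free label reduces to, or None if no rule applies."""
    text = " ".join(free_label.lower().split())
    # Known counter-examples block the reduction entirely.
    for phrase in COUNTER_EXAMPLES:
        if phrase in text:
            return None
    for label in MAIN_LABELS:
        # Whole-word match, so "woman" is not reduced to "man".
        if re.search(rf"\b{re.escape(label)}\b", text):
            return label
    return None
```

For example, `reduce_label("man riding skateboard")` gives `"man"`, while `reduce_label("Cable car")` gives `None`.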

Some words might be extractable in isolation as properties, e.g. verbs for action or posture: “sitting”, “standing”, “singing”, “walking”, “running”, “leaning”, “lying down”.

“Lying down” is more complex because it’s not a single word, but I wanted to avoid the ambiguous word “lying” (Longer phrases reduce ambiguity. It’s not inconceivable that in future we want a database of facial expressions for a lie detector , heh)
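A small sketch of how such posture words could be pulled out, assuming a hand-maintained phrase list (the `POSTURES` list and `extract_postures` name are made up for illustration). Because only the full phrase “lying down” is listed, a bare “lying” is deliberately not matched.

```python
import re

# Posture/action phrases taken from the comment above; the list is illustrative.
POSTURES = ["sitting", "standing", "singing", "walking", "running", "leaning", "lying down"]

def extract_postures(free_label):
    """Return the set of known posture phrases that occur as whole words/phrases."""
    text = " ".join(free_label.lower().split())  # normalise case and whitespace
    return {p for p in POSTURES if re.search(rf"\b{re.escape(p)}\b", text)}
```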

Pairs of connected objects: mostly I used the verb “riding” on its own, without saying what is being ridden (“man riding”, assuming there’s some visual similarity, i.e. sitting in a saddle, hands connected to reins or handlebars).

But I’ve also experimented with compound labels like

“man riding skateboard”, “woman playing guitar” ...

The ensemble would be best activating a 50:50 blend of the two things, plus the verb? (I imagine a bunch of combinable outputs, e.g. the object name, the action or posture, then some common adjectives.) I have usually tried to annotate the person and the object being ridden separately.

Worst case - we could search for all the ensembles and cut them up

As these things move in the world as a connected object, it seems worth considering them as an ensemble (especially “person riding bicycle”). Of course the best thing to do is to have both components labelled as well, but sometimes it saves time.

Other uses of “of”: usually it is used for parts (“wheel of car” etc.), but also “pile of ...”, like an advanced plural: “pile of tires”, “pile of books” ...

“glass of water”, “bowl of fruit”, “box of tissues”, “plate of food” (again, this seems handy as a single object, but we might want to decompose that)
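One way to tell these uses of “of” apart, sketched in Python. The `GROUPING` and `CONTAINERS` word lists and the three category names are assumptions made for this example; anything not listed falls back to the "part" reading.

```python
# Hypothetical word lists; anything not listed is treated as "part of".
GROUPING = {"pile"}
CONTAINERS = {"glass", "bowl", "box", "plate"}

def classify_of(label):
    """Split an 'X of Y' label and guess which kind of 'of' it is.

    Returns (kind, head, tail), or None if the label has no ' of '.
    """
    head, sep, tail = " ".join(label.lower().split()).partition(" of ")
    if not sep:
        return None
    if head in GROUPING:
        kind = "collection"   # "pile of books": an advanced plural
    elif head in CONTAINERS:
        kind = "container"    # "glass of water": container plus contents
    else:
        kind = "part"         # "wheel of car": part of a larger object
    return kind, head, tail
```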

bbernhard commented 4 years ago

on a related note: I've noticed recently, that we have quite a few cases where one polygon fits multiple labels.

e.g: Imagine a picture of a woman that has the labels woman and person. If you want to annotate both labels, you basically have to do the same work twice - once for the label person and then again for the label woman. I am wondering whether that's really necessary and if we can improve that somehow? Maybe add the possibility to clone polygons?

dobkeratops commented 4 years ago

That’s where I thought label blending would really help. I’ve been using “or” and “and” for that lately, where before, in the “add labels” view, I would have used the slash. Cloning polygons as suggested would be another way.

“Or” seems like a very safe separator. I can’t think of any names yet actually containing “or”. So when we find “or” we can just split the label into two components and activate both.

There are also many cases where several labels were added speculatively and a newer label just obsoletes the previous one (cloning could be useful there as a way to achieve re-labelling?). In unified mode you can pick the best label for each part, but in task mode these linger as redundant tasks.

In groups of people you often need all 3, e.g. there are some where you can’t quite tell gender, or you want to annotate a group (“persons”, plural) rather than picking out each one.

Examples of blends I’ve been able to use:-

“Trees or foliage” - useful because some foliage is trees instead of bushes, and not all trees have foliage (winter). “Of” might also work here (“foliage of trees”, “foliage of bushes”)

“Trees or bushes” - subset of vegetation excluding the type not mentioned (grass)

“Bushes or grass” - same (excludes trees)

“Beach or sand” - sand exists elsewhere, and some beaches are pebbles or rock instead

“Grass or soil” - a way of describing patchy grass

“Path or road” - sometimes examples are unclear

“Sand or gravel”, “Gravel or soil” - mixed surfaces

“Truck or van” - cases where an object looks like a bit of both (intermediate types)

“Bus or coach” - same
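Splitting such blends on “or” could look roughly like this in Python. The `split_blend` helper is hypothetical; it only splits on “or” as a standalone word, so labels that merely contain the letters (e.g. “door”) are left alone.

```python
import re

def split_blend(label):
    """Split a blended label like 'trees or bushes' into its component labels."""
    # "or" must be surrounded by whitespace, so "door" or "corridor" stay whole.
    parts = re.split(r"\s+or\s+", label.strip(), flags=re.IGNORECASE)
    return [part.strip().lower() for part in parts if part.strip()]
```

With this, `split_blend("Trees or bushes")` yields `["trees", "bushes"]`, and both components could then be activated for the same region.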

bbernhard commented 4 years ago

“Or” seems like a very safe separator. I can’t think of any names yet actually containing or. So when finding or we can just split into two components and activate both.

That's a nice idea! I think that would work great in case one adds the label + the corresponding polygon in one go, because then we can split up the two labels in the background and add the same polygon to both of them.

But in case the label trees or bushes gets added first (e.g in the labels view) and the corresponding polygon will be added later (e.g via unified mode), we probably need a different mechanism to avoid duplicated work.

edit:

There’s also many cases where there’s several labels added speculatively and a newer label just obsoletes the previous one (cloning could be useful there as a way to achieve re-labelling?).

yeah, right. :) I think it probably makes sense to have both mechanisms. If the polygon is already available when we split up the label, we can use the polygon. If the polygon is added at a later point, the clone button can be used to avoid duplicated work.
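To make the two mechanisms concrete, here is a hedged Python sketch. `LabelTask`, `split_blend_task` and `clone_polygon` are invented names for illustration and do not reflect the actual imagemonkey-core data model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Polygon = List[Tuple[float, float]]

@dataclass
class LabelTask:
    label: str
    polygon: Optional[Polygon] = None  # None means "not annotated yet"

def split_blend_task(blend: str, polygon: Optional[Polygon] = None) -> List[LabelTask]:
    """Split 'a or b' into one task per label, copying the polygon if it exists."""
    parts = [p.strip() for p in blend.lower().split(" or ") if p.strip()]
    return [LabelTask(p, list(polygon) if polygon is not None else None) for p in parts]

def clone_polygon(source: LabelTask, target: LabelTask) -> None:
    """The 'clone button': reuse an existing polygon for another label."""
    if source.polygon is None:
        raise ValueError("source task has no polygon to clone")
    target.polygon = list(source.polygon)  # copy, so later edits stay independent
```

If the polygon is known at split time, both tasks receive it straight away; otherwise one task is annotated later and `clone_polygon` fills in the other.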

bbernhard commented 4 years ago

I've noticed that the labels x (https://github.com/bbernhard/imagemonkey-trending-labels/issues/858) and y (https://github.com/bbernhard/imagemonkey-trending-labels/issues/860) were added recently.

Were they added on purpose? (just to rule out that there is a bug somewhere hidden in the application that cuts off characters :))

dobkeratops commented 4 years ago

I've noticed that the label x Were they added on purpose? (just to rule out that there is a bug somewhere hidden in the application that cuts off characters :))

Yes it’s deliberate - it’s a temporary workaround (I hope these can be removed) , let me explain:

In unified mode you can’t yet browse “unlabelled”. So I used “add labels” to find interesting images from the unlabelled set and just tagged them with a single letter per batch; then I can go over them in unified mode. (If I actually labelled them in “add labels”, I wouldn’t be able to find them again, and I’d have to search for each label individually.) Eventually I hope you’ll be able to just browse unlabelled images in unified mode. The recent unlabelled uploads have a lot of variety, and they’re higher resolution than the images from the automatic scrape.

bbernhard commented 4 years ago

aaah, got it. :D

I can remove them later, when we have a proper solution in place. That's no problem :)

Would browsing for unlabeled images solve that issue or would you prefer something more advanced? (e.g "star images"/"bookmark images"/"assign existing images to image collections" etc).

We already have the concept of image collections, so I am wondering if it makes sense to use that concept to create "favorites" (e.g. you "star an image" and it ends up in your personal "favorites" collection).

bbernhard commented 4 years ago

The downside of the image collection approach is probably that the image collections get bigger and bigger over time. So over time it gets harder and harder to find the newly starred images in there, between all the old ones...

dobkeratops commented 4 years ago

Would browsing for unlabeled images solve that issue or would you prefer something more advanced? (e.g "star images"/"bookmark images"/"assign existing images to image collections" etc).

Browsing unlabelled in unified would solve this specific use case completely. Browse pure random would also be of assistance

We already have the concept of image collections, so I am wondering if it makes sense to use that concept to create "favorites" (e.g. you "star an image" and it ends up in your personal "favorites" collection).

Possibly .. perhaps this could just slot into the collections system - you could have automatically generated collections of “my uploads”, “my annotations”, “my favourites” - or these uses could be covered by additional search filters (including “recent history” , “recent uploads” etc ?)

dobkeratops commented 4 years ago

Just wondered: browsing unlabelled for unified would need unified mode to handle the state where the label list is empty. Ideas...

On the plus side.. would something like this alleviate your concern about novice users .. ie these enhancements could explicitly guide you through adding a label then annotating it? .. it could interactively teach the unified mode better?

bbernhard commented 4 years ago

I like those ideas - many thanks for the feedback!

I am currently working on finalizing the "default image collections" [1] and after that I'll look into the issue above :)

[1] I've picked up the idea of yours and added two default image collections ("my uploads" and "my labels" - if you have better names, please let me know. I am always struggling with naming ;)).

Every time you upload a picture, it will be added automatically to the "my donations" collection. If you add a label, it will be added to the "my labels" collection. The cool thing about the "my labels" collection is that items disappear once they have an annotation (so maybe "my labels" is not the right name for that).

To give you a concrete example:

I've uploaded two images and added a bunch of labels via the labels view. If I now search for image.collection='my labels', I get the following result:

Selection_036

As you can see, it basically returns all the open annotation tasks. If I now click on the image with the label dog.has="nose", it opens the unified mode as usual:

Selection_037

If I now annotate every label there and then press the "Done" button, all the dog related images are marked as done (greyed out):

Selection_038

If I again search for image.collection='my labels', only the apple is shown (as all the other tasks are already completed):

Selection_039

But I can use the same search query (image.collection='my labels') with the radio button "Rework" checked, in case I want to see all the work I've done:

Selection_040

So the 'my labels' collection is basically just a filter for open annotation tasks (maybe it's better to rename it to 'my annotation tasks' or something similar?).

I think with that concept it should be possible to collect some interesting labels in the labels view and then work on them in the unified mode view. What do you think? Would that be a useful addition for your workflow?

dobkeratops commented 4 years ago

The “my labels” is interesting, perhaps it could be called “my pending tasks” (it does correspond slightly to the use case I had with single letter tags).

It might indeed be the case that some people prefer to set up tasks for the rest of the community; they could do that on their smartphone whilst they don’t have decent input devices at hand for annotation.

I suppose you could extend this to a “my annotations” as well. Although that isn’t as useful, it could be nice just to see your accumulated contributions.

bbernhard commented 4 years ago

The “my labels” is interesting, perhaps it could be called “my pending tasks” (it does correspond slightly to the use case I had with single letter tags).

that's nice :+1:

It might indeed be the case that some people prefer to set up tasks for the rest of the community; they could do that on their smartphone whilst they don’t have decent input devices at hand for annotation.

something that I forgot: the annotation task is actually available to everybody, it will just end up in the "my pending tasks" collection of the user that added the label.

So e.g if you add the label apple, the apple task will be linked into your "my pending tasks" collection. But if I stumble across the annotation task and complete it before you've completed it, the task will also be removed from your 'my pending tasks' collection. So the 'my pending tasks' collection always contains tasks that nobody else has worked on yet.
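The behaviour described above can be modelled with a toy Python class. `TaskBoard` and its methods are purely illustrative and say nothing about how the server actually stores tasks: a task is linked to the user who added the label, but anybody may complete it, and completion removes it from the adder's pending list.

```python
class TaskBoard:
    """Toy model of the 'my pending tasks' collection (illustrative only)."""

    def __init__(self):
        self._open = {}  # task name -> user who added the label

    def add_label(self, task, added_by):
        self._open[task] = added_by

    def complete(self, task, completed_by):
        # Any user may complete a task; who did it does not matter here.
        self._open.pop(task, None)

    def pending_for(self, user):
        """Open tasks that were created by this user's labels."""
        return sorted(t for t, owner in self._open.items() if owner == user)
```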

bbernhard commented 4 years ago

The changes are now live. (i.e you can use image.collection='my open tasks' and image.collection='my donations' in the browse-based modes.)

dobkeratops commented 4 years ago

More notes on free label usage:

I’ve done a few with a secondary object, e.g. “man holding smartphone”, “man riding skateboard”, “man sitting playing guitar”, etc. I was sort of hoping a parsing algorithm could take the first matched main label as the main one, and trailing ones could be ignored or turned into a property.

Then I started using camelCase a bit. I don’t like switching styles but wanted to leave the experiment in there. I figure those will be easier to extract and turn into properties (we don’t have to rely on an algorithm to figure out the grouping). Firstly, some states are multi-word, and camelCase groups them as a single entity; secondly, it could combine the examples of a linked object.

Examples: “woman lyingOnFront”, “man walking wearingSuit”

Other multi-word states:

sittingCrossLegged, sittingOnFloor, leaningForward, leaningBack, lyingDown (“lying” is an ambiguous word; maybe in future we want to label micro-expressions, heh), carryingBag, holdingSmartphone

etc
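A possible parser for this style, as a Python sketch. The `MAIN_LABELS` tuple and both helper names are assumptions for the example: the first token that matches a main label becomes the main one, and every other token is turned into a property, with camelCase tokens expanded into words.

```python
import re

MAIN_LABELS = ("man", "woman", "child", "person")  # illustrative subset

def split_camel(token):
    """'lyingOnFront' -> 'lying on front'."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", token).lower()

def parse_label(label, main_labels=MAIN_LABELS):
    """Return (main_label, properties) for a free label."""
    main = None
    properties = []
    for token in label.split():
        if main is None and token.lower() in main_labels:
            main = token.lower()  # first matched main label wins
        else:
            properties.append(split_camel(token))
    return main, properties
```

For instance, `parse_label("man walking wearingSuit")` returns `("man", ["walking", "wearing suit"])`.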

dobkeratops commented 4 years ago

OK, I’ve been able to try this (browse collections). A few comments: it’s still “flooded”, i.e. when you create some new tasks, they’re lost in a long backlog. In my current use case I’d need to sort by recent-first (i.e. you find something interesting, then go to annotate it whilst it’s fresh in your mind). So currently it’s still more efficient to use the single-letter tag hack.

Besides that this is still just a workaround for the real solution: being able to browse unlabelled and click straight into unified mode, bypassing tasks completely - avoiding the need to find things twice

dobkeratops commented 4 years ago

Some more recent usage: again, I don’t like changing styles, but I went back to using a PC/keyboard for “add labels” (I started using the same login as on my iPad); it’s easier to enter punctuation like slashes on the keyboard. I’m trying to give some examples of the properties and states, but separated by slashes, e.g. “man/walking”, “derelict/building/interior”. (“Derelict” could be a property; I’ve seen derelict buildings, cars, aeroplanes etc. in the dataset. “Interior” can apply to car, building, train ... “Indoor” is specifically “building interior”.)

I’m a bit nervous about creating more work by switching conventions. I did prefer using the slash to separate (because we know it’s a definite boundary for parsing), but unified mode doesn’t support entering it, and it’s harder to enter on the iPad (hence using “of” and “or” instead). Perhaps they can be retroactively matched with explicit examples and converted over. Perhaps what we want is slashes where possible, and we use whatever rules we can to retroactively place them where we see other separators and compositions. Would it be possible to get a text dump of all the label suggestions, to manually create a translation table?
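Such a translation table could be applied along these lines. This is only a sketch: the two table entries are made-up examples based on the labels mentioned above, and the function names are hypothetical.

```python
# Hand-written translation table (hypothetical entries) from older
# free-text labels to the slash notation.
TRANSLATION_TABLE = {
    "man walking": "man/walking",
    "derelict building interior": "derelict/building/interior",
}

def normalise(label):
    """Lower-case and collapse whitespace so lookups are forgiving."""
    return " ".join(label.lower().split())

def to_slash_notation(label, table=TRANSLATION_TABLE):
    """Translate a label if the table knows it; otherwise return it normalised."""
    key = normalise(label)
    return table.get(key, key)

def components(slash_label):
    """Split a slash-notation label at its definite boundaries."""
    return [part.strip() for part in slash_label.split("/") if part.strip()]
```

Labels that are not in the table pass through unchanged, so the table can grow incrementally as more explicit examples are matched.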

bbernhard commented 4 years ago

Just had a look at the statistics...WOW! Awesome work!

I’m a bit nervous creating more work switching conventions. I did prefer using the slash to separate (because we know it’s a definite boundary for parsing) but unified mode doesn’t support entering it, and it’s harder to enter on the iPad (hence using “of” and “or” instead). Perhaps they can be retroactively matched with explicit examples and converted over. Perhaps what we want is slashes where possible, and we’re using whatever rules we can to retroactively place them where we see other separators and compositions.

I wouldn't worry too much about the style. I think both styles are fine and not too hard to parse. There might be some clashes later (in case an image has both the "slash notation" and the "of notation"), but I think we could automatically merge those labels (and the corresponding polylines) together.

This endpoint lists all label suggestions: https://api.imagemonkey.io/v1/label/suggestions

Currently we have about 13k label suggestions and of those ~13k label suggestions we have 657 label suggestions that are used at least 20 times (https://github.com/ImageMonkey/imagemonkey-trending-labels/issues)

I'll also have a look to see whether I can allow the / in the unified mode..I think there's no technical restriction that prevents that :)

dobkeratops commented 4 years ago

(Managing labels: looking for “the most commonly occurring N” (N = 1024, 2048, ...) would be better than having an incidence cutoff like 20.) There’s going to be plenty of garbage in the full list as well though: spelling mistakes, and also some experiments with prefixes like “1 cat” etc. (done several ways).
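The difference between the two selection strategies, sketched in Python with invented counts. The dictionary below is not real ImageMonkey data, and the shape of the API response is not assumed here; the functions only take a plain label-to-count mapping.

```python
from collections import Counter

def top_n_labels(counts, n):
    """The n most frequently used labels, most common first."""
    return [label for label, _ in Counter(counts).most_common(n)]

def labels_above_cutoff(counts, cutoff):
    """All labels used at least `cutoff` times (the current approach)."""
    return {label for label, count in counts.items() if count >= cutoff}

# Invented example counts, including some "garbage" entries.
example_counts = {"car": 500, "tree": 320, "bench": 40, "1 cat": 25, "dogg": 21}
```

Here a cutoff of 20 keeps all five entries, garbage included, whereas `top_n_labels(example_counts, 3)` keeps only the three most common ones regardless of how the counts are distributed.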

dobkeratops commented 4 years ago

Something else I remember now: there are a couple of ambiguous words I started disambiguating in slightly unnatural ways (due to the lack of brackets in unified mode), but I figured these could be aliased later. “tank”: “battle tank”, “tank military vehicle” etc. versus “air tank”, “fuel tank”, “water tank”, ... (“tank container”). And “drum”: “musical drum”, “drum musical instrument” versus “oil drum” etc.

bbernhard commented 4 years ago

Unfortunately there's no API endpoint yet to filter for the number of occurrences. It's a bit hacky, but you could sort the github issues by "most commented desc" to get that information: https://github.com/ImageMonkey/imagemonkey-trending-labels/issues?q=is%3Aissue+is%3Aopen+sort%3Acomments-desc

But we should definitely add query parameters for filtering to the above API call.

dobkeratops commented 4 years ago

Even having that raw dump is neat, thanks . Imagine if we could get some number of those into graph format and then explore the image database through the label graph.

dobkeratops commented 4 years ago

One unusual label example needing aliasing: car boot. We can't make 'boot' a component name (boot/car) because it's ambiguous between vehicle storage space and footwear. Most of the time the 'boot' label appears, it's going to be footwear, but some explicit aliases could help ("hiking boot", "football boot", "wellington boot", "boot (footwear)").

Americans call it a trunk, which is similarly ambiguous with elephant trunk and tree trunk, hah. Across the pond, the USA and UK just agree that "it has to be a highly ambiguous name". https://en.wikipedia.org/wiki/Trunk_(car)
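An explicit alias table for these ambiguous words might look like the following Python sketch. The canonical names on the right-hand side are invented for illustration (only "boot (footwear)" appears in this thread), and `resolve` is a hypothetical helper.

```python
# Explicit aliases: free label -> canonical, unambiguous name (names assumed).
ALIASES = {
    "battle tank": "tank (military vehicle)",
    "tank military vehicle": "tank (military vehicle)",
    "air tank": "tank (container)",
    "fuel tank": "tank (container)",
    "water tank": "tank (container)",
    "musical drum": "drum (musical instrument)",
    "drum musical instrument": "drum (musical instrument)",
    "oil drum": "drum (container)",
    "hiking boot": "boot (footwear)",
    "football boot": "boot (footwear)",
    "wellington boot": "boot (footwear)",
    "car boot": "trunk (car)",
}

# Bare words that must not be used on their own without an alias.
AMBIGUOUS = {"tank", "drum", "boot", "trunk"}

def resolve(label):
    """Return the canonical name, or None if the label is a bare ambiguous word."""
    key = " ".join(label.lower().split())
    if key in ALIASES:
        return ALIASES[key]
    if key in AMBIGUOUS:
        return None  # needs a human decision or more context
    return key
```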