ImageMonkey / imagemonkey-core

ImageMonkey is an attempt to create a free, public open source image dataset.
https://imagemonkey.io

brainstorming: pure component labels - what to do? #271

Open bbernhard opened 4 years ago

bbernhard commented 4 years ago

At the moment we have quite a few component-only labels in the database (e.g. head, leg, foot, wheel). The question now is: what should we do with those labels? Currently, they are all "non-productive", i.e. you can only use them when you're authenticated; non-authenticated users can't use them. For me, "making a label productive" means that the label is a good one (spelled correctly, no ambiguities, etc.) and that we want the broader audience to use it.

With the component-only labels it's a bit difficult, as there is ambiguity: a head can be a head/person, head/tiger, head/frog, etc. So just by looking at the definition of productive labels, I would argue that component-only labels don't qualify for promotion to productive labels (as they are potentially ambiguous). But as we have quite a lot of those component-only labels, I think it makes sense to discuss this more broadly.

How do we deal with those component-only labels?

should we

(my main concern here is that I am not sure how useful that is in general. I guess most of the time one wants to train a neural net that can detect a specific set of components, e.g. head/person instead of all sorts of heads).

declare them as "temporary labels" and add the possibility to migrate them to fully qualified labels?

I imagine a UI where all the "temporary labels" (with their polygons) are shown and the user can then select the appropriate polygons and move them to a different label, e.g. the user selects all the head (person) polygons and moves them to the head/person label (so it's basically a way to "re-parent" the component labels).
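
The "re-parenting" idea can be sketched in a few lines. This is only an illustration under assumptions: the `Annotation` record and the `reparent` helper are hypothetical names, not part of the imagemonkey-core schema or API.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    # Hypothetical record: one polygon attached to one label.
    id: int
    label: str
    polygon: list  # list of (x, y) points


def reparent(annotations, selected_ids, component, owner):
    """Move the selected component-only annotations to '<component>/<owner>'.

    Only annotations that currently carry the bare component label are moved;
    the polygons themselves are re-used unchanged.
    """
    target = f"{component}/{owner}"
    moved = []
    for ann in annotations:
        if ann.id in selected_ids and ann.label == component:
            ann.label = target
            moved.append(ann.id)
    return moved


annotations = [
    Annotation(1, "head", [(0, 0), (4, 0), (4, 4)]),
    Annotation(2, "head", [(10, 10), (14, 10), (14, 14)]),
    Annotation(3, "wheel", [(20, 20), (24, 20), (24, 24)]),
]
# The user selected annotations 1 and 3, but only 1 is a bare "head".
moved = reparent(annotations, {1, 3}, "head", "person")
```

Annotations that were not selected, or that carry a different label, are left alone, so the operation can be applied incrementally.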

@dobkeratops what do you think?

dobkeratops commented 4 years ago

Overall I’m in favour of enabling the part labels for all, but given that logged-in users can use them already, it’s not an urgent need. If in doubt, you could delay unlocking until solutions are proven?

I’ve been hoping that:

(1) the component-only labels are still useful even in isolation from their owner, e.g. there are certain visual things in common between all heads or wings (even between aircraft wings and bird wings). All heads have eyes and mouths, and are usually connected to a body by a neck, or at least sit at one extreme of a body.

It’s a different kind of ambiguity compared to words like “tank” (which could be an air tank worn on a firefighter’s back, or an armoured military vehicle: two completely different types of object).

(2)

> (my main concern here is that I am not sure how useful that is in general. I guess most of the time one wants to train a neural net that can detect a specific set of components, e.g. head/person instead of all sorts of heads).

Disambiguation could come from overlap, although that does add complexity for the code preparing images for training. Someone who specifically wants to recognise head/person would have to filter for overlapping pixels (head AND person|man|woman, from two different polygons). Doing that at the pixel level is probably easy enough, but it’s still extra code that needs to be written.
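
A minimal sketch of such an overlap filter, assuming the polygons have already been rasterised into sets of pixel coordinates. The helper names and the `min_fraction` threshold of 0.5 are assumptions made for illustration; rectangles stand in for real polygons.

```python
def rect_mask(x0, y0, x1, y1):
    """Pixels of an axis-aligned rectangle, standing in for a rasterised polygon."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def overlap_fraction(part, owner):
    """Fraction of the part's pixels that also lie inside the owner's mask."""
    return len(part & owner) / len(part) if part else 0.0


def parts_of(parts, owners, owner_labels, min_fraction=0.5):
    """Return indices of part masks that overlap an owner with an accepted label."""
    hits = []
    for i, part in enumerate(parts):
        for label, mask in owners:
            if label in owner_labels and overlap_fraction(part, mask) >= min_fraction:
                hits.append(i)
                break
    return hits


heads = [rect_mask(2, 0, 4, 2), rect_mask(20, 0, 22, 2)]
owners = [
    ("person", rect_mask(0, 0, 6, 10)),
    ("tiger", rect_mask(18, 0, 30, 8)),
]
# head AND person|man|woman
person_heads = parts_of(heads, owners, {"person", "man", "woman"})
```

Here only the first head lies inside a `person` mask, so `person_heads` contains just its index.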

> declare them as "temporary labels" and add the possibility to migrate them to fully qualified labels?

Right. We could even imagine making the need for a disambiguating overlap (or adding a species property) another task request, or a search (“find orphan parts”).

> (so it's basically a way to "re-parent" the component labels).

When you word it like that, it reminds me of the LabelMe system, which is a more versatile database, at the cost of more UI to write and teach.

I suspect it’s easier to write a filter for overlapping part+species annotations than it is to refine and teach all that UI (a tree view of labels ..)

There is of course the potential hazard of incorrect overlap (e.g. from a person hugging a cat). When writing an overlap filter you might at least be able to flag those problems and either ask for a priority order (a foreground/background or occlusion property?), warn about it, or ask for explicit head/person etc. in the images where such overlap occurs.
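
As a rough sketch of that flagging step: each part is assigned to an owner only when exactly one owner mask covers it, and otherwise reported as ambiguous or orphaned. The function name, the status strings, and the 0.5 threshold are assumptions for illustration; masks are again sets of pixel coordinates.

```python
def rect_mask(x0, y0, x1, y1):
    """Pixels of an axis-aligned rectangle, standing in for a rasterised polygon."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def classify_part(part, owners, min_fraction=0.5):
    """Return ('ok', label), ('ambiguous', labels) or ('orphan', None)."""
    candidates = sorted({
        label
        for label, mask in owners
        if part and len(part & mask) / len(part) >= min_fraction
    })
    if not candidates:
        return ("orphan", None)
    if len(candidates) == 1:
        return ("ok", candidates[0])
    # More than one plausible owner: needs a priority order or an explicit label.
    return ("ambiguous", candidates)


# A person hugging a cat: the cat mask lies entirely inside the person mask.
owners = [("person", rect_mask(0, 0, 10, 10)), ("cat", rect_mask(3, 3, 7, 7))]

person_head = classify_part(rect_mask(4, 0, 6, 2), owners)
cat_head = classify_part(rect_mask(4, 3, 6, 5), owners)
stray_head = classify_part(rect_mask(50, 50, 52, 52), owners)
```

The ambiguous cases are the ones that would need a foreground/background hint or an explicit head/person style label; the orphan cases match the “find orphan parts” search idea.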

(3) I’ve even wondered if the slash could function as a general label blend (I’m trying to use this more to reduce label explosion and express properties), with head/person slotting into the same system that handles blending like grass/soil (“paint a 1 into the channels for both a/b”). If we want to normalise, we can blend parts differently. There are possible uses for part blending as well, e.g. fin/wing. Maybe normalise all the part and non-part channels independently.
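
One way to read this blending idea, sketched under assumptions: the channel lists below are hypothetical, and normalising each group to sum to 1 is just one possible choice.

```python
# Hypothetical channel layout: parts and non-parts live in separate groups.
PART_CHANNELS = ["head", "wing", "fin"]
OBJECT_CHANNELS = ["person", "grass", "soil"]


def target_vector(label):
    """Turn a slash-separated label into (part_targets, object_targets).

    A 1 is painted into the channel of every name in the label, then each
    group is normalised independently so that its non-zero entries sum to 1.
    """
    names = label.split("/")

    def group(channels):
        raw = [1.0 if c in names else 0.0 for c in channels]
        total = sum(raw)
        return [v / total for v in raw] if total else raw

    return group(PART_CHANNELS), group(OBJECT_CHANNELS)


grass_soil = target_vector("grass/soil")
head_person = target_vector("head/person")
fin_wing = target_vector("fin/wing")
```

With this layout, grass/soil splits the non-part group evenly between its two channels, head/person sets one channel in each group, and fin/wing blends within the part group only.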

bbernhard commented 4 years ago

> Overall I’m in favour of enabling the part labels for all, but given that logged-in users can use them already, it’s not an urgent need. If in doubt, you could delay unlocking until solutions are proven?

Yeah, that's pretty much what I am doing at the moment ^^. I am still hoping that at some point we come up with a clever idea that solves all the label structuring challenges and problems. :)

> (1) the component-only labels are still useful even in isolation from their owner, e.g. there are certain visual things in common between all heads or wings (even between aircraft wings and bird wings). All heads have eyes and mouths, and are usually connected to a body by a neck, or at least sit at one extreme of a body.

Totally agreed. I also think that component-only labels are quite useful and, if used correctly, pretty powerful.

However, what worries me a bit is that inexperienced users might use the generic component label solely, without knowing that more specific versions of that label also exist (e.g. head/person). My hope is that we can somehow encourage users to use the more specific labels (if possible), or at least give them the possibility to "transfer" component labels to more specific labels.

I think the more fine-grained the data is structured in the dataset (label + properties), the more of the data querying complexity we can offload to the database. Of course, we could also choose a completely different approach and say that data aggregation is not our main focus and leave that to the user, so basically something similar to what you've proposed with the pixel scanning. I think that could also work. I guess the potential drawbacks of that solution are that the user then probably needs to download a lot of data (images + polygons) to do the offline processing, and then needs to manually verify the filtered data again to check that it is sane before they can start the training process.

Personally, I would really prefer if we could somehow store that information in the database. There are probably cases where users want to train a neural net on some information that's so specific that we do not have it nicely aggregated in the database. In that case they have to do some manual offline pre-processing, before they can use the data. But my hope is that we can store as much information as possible in the database, giving the users the possibility to write complex search queries (so that manual pre-processing will become obsolete in most cases)

I totally understand that this is a pretty ambitious goal, and to be honest this could also easily be one of those "bite off more than you can chew" moments. ^^ I don't think that component-only labels are necessarily bad (as you already mentioned, they can be great for teaching a neural net new features, they are faster/easier to type when labeling, etc.). It's just the feeling that it won't take much effort to make them even better. If we find a way to "transition" component-only labels to "fully qualified labels", we could re-use all the existing polygons, which could really save us some time and effort :)

dobkeratops commented 4 years ago

> or at least give them the possibility to "transfer" component labels to more specific labels.

Right - you’ve already got a quiz mode, I think (although I don’t use it). Maybe this could be generalised to specialise in different ways (I had always imagined the graph allowing the label list to get gradually deeper as the database grows, and each label that gets multiple offshoots in the graph would be a candidate for a specialisation quiz).

It’s all a question of priorities: balancing the risk of leading users astray, versus the extra annotations they could make for otherwise unknown animals/vehicles, versus the UI programming burden of getting a “best of all worlds” solution.

I’m just one user, so you might not get a representative idea just from how I’ve done things. There are a few times when I just add head (unknown animal types), and those might be findable through a search (“head&~(person|animal)” if there was an animal graph node to group all those; otherwise a longer expression listing all the existing animal names could do it).
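
To make the search idea concrete, here is a small sketch of how an expression like head&~(person|animal) could be evaluated against the set of labels on an image. It is not ImageMonkey's actual query language: the grammar, the `matches` function, and the `GROUPS` table (standing in for an "animal" graph node) are assumptions for illustration.

```python
import re

# Hypothetical grouping node: "animal" expands to concrete labels.
GROUPS = {"animal": {"cat", "dog", "tiger", "frog"}}

TOKEN = re.compile(r"\s*([&|~()]|[A-Za-z][\w/]*)")


def tokenize(query):
    tokens, pos = [], 0
    query = query.strip()
    while pos < len(query):
        m = TOKEN.match(query, pos)
        if not m:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def matches(query, labels):
    """Evaluate a query with & (and), | (or), ~ (not) against a set of labels."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = take()
        if tok == "~":
            return not factor()
        if tok == "(":
            value = expr()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        if tok in {"&", "|", ")"}:
            raise ValueError(f"unexpected token {tok!r}")
        # A group name matches if any of its member labels is present.
        return bool(GROUPS.get(tok, {tok}) & labels)

    def term():
        value = factor()
        while peek() == "&":
            take()
            rhs = factor()
            value = value and rhs
        return value

    def expr():
        value = term()
        while peek() == "|":
            take()
            rhs = term()
            value = value or rhs
        return value

    result = expr()
    if pos != len(tokens):
        raise ValueError("trailing tokens in query")
    return result


images = {
    "img1": {"head"},
    "img2": {"head", "person"},
    "img3": {"head", "tiger"},
}
orphans = sorted(
    name for name, labels in images.items()
    if matches("head&~(person|animal)", labels)
)
```

Only the image whose head has no person or animal label alongside it ends up in `orphans`. Without a grouping node, the same query would have to spell out every animal name in the parenthesised part.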

Personally I think most people would gravitate toward labelling the owner first (more likely to think “I see a person” rather than “I see a head”), but I have no data to confirm that.