ImageMonkey / imagemonkey-core

ImageMonkey is an attempt to create a free, public open source image dataset.
https://imagemonkey.io

add former trending labels to label.graph #122

Open bbernhard opened 6 years ago

bbernhard commented 6 years ago

The following labels were made productive and need to be added to the graph.dot file:

This ticket will be used to track all the labels that were once trending and are now productive. As soon as they are productive, we should add them to the graph.dot file. Using a ticket for tracking those labels is just a temporary solution, until there are some better mechanisms in place.

As soon as there are some dedicated label graph maintainers, we should think about how we can notify the label maintainers about newly added labels.

Some possibilities:

Label graph maintainers can then add the labels to their dot file and create a pull request afterwards.

dobkeratops commented 6 years ago

back on the subject of labels, I was thinking about the generalisation of foo/bar again - 'head' might be a high value label because it's applicable to all animals (and a net could learn what's common between all heads - even without labelling rest of the animals..).

Would the overlap be enough? -if you've got a region with a big bounding box for 'dog' and a smaller overlapping bounding box for 'head' - those pixels will still have to give both classifications.
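To make the overlap question concrete, here is a minimal sketch in plain Python of how two overlapping boxes turn into per-pixel label sets. The function name `labels_per_pixel`, the box layout and the sample coordinates are made up for illustration and are not taken from ImageMonkey.

```python
from typing import List, Set, Tuple

# Assumed box layout: (left, top, right, bottom), right/bottom exclusive.
Box = Tuple[int, int, int, int]


def labels_per_pixel(
    width: int, height: int, boxes: List[Tuple[str, Box]]
) -> List[List[Set[str]]]:
    """Return a height x width grid where each cell holds every label covering it."""
    grid: List[List[Set[str]]] = [[set() for _ in range(width)] for _ in range(height)]
    for label, (left, top, right, bottom) in boxes:
        # Clip the box to the image so out-of-range boxes do not raise.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                grid[y][x].add(label)
    return grid


# A big 'dog' box with a smaller 'head' box in its top-right corner.
annotations = [("dog", (0, 0, 6, 4)), ("head", (4, 0, 6, 2))]
grid = labels_per_pixel(6, 4, annotations)

print(sorted(grid[1][5]))  # pixel covered by both boxes
print(sorted(grid[3][0]))  # pixel covered by the dog box only
```

Pixels inside the small box end up with both labels, which is the multi-label target a net would have to predict there.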

I do think "label-me's" hierarchy feature was great (drawing boxes around the extremities is easier than tracing the whole outline .. and gives some pose information)

With a label graph of course, there might be the ability to later refine ("what type of head - head/dog, head/person ..etc")- but you still get useful training information before refinement is done; I like the idea of being able to gradually improve the data incrementally
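A rough sketch of that incremental refinement, assuming refined labels are written as a slash path such as head/dog (the `Annotation` class and the helpers `refine` and `is_a` are hypothetical names, not part of any existing tool):

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Annotation:
    label: str                      # coarse ("head") or refined ("head/dog")
    box: Tuple[int, int, int, int]  # (left, top, right, bottom)


def refine(annotation: Annotation, kind: str) -> Annotation:
    """Return a copy whose coarse label is narrowed, e.g. head -> head/dog."""
    if "/" in annotation.label:
        raise ValueError(f"{annotation.label!r} is already refined")
    return Annotation(f"{annotation.label}/{kind}", annotation.box)


def is_a(label: str, query: str) -> bool:
    """True if label equals query or is a refinement of it."""
    return label == query or label.startswith(query + "/")


annotations = [
    Annotation("head", (10, 10, 40, 40)),
    Annotation("head", (60, 15, 90, 50)),
]
# Later on, somebody answers "what type of head?" for the second box only.
annotations[1] = refine(annotations[1], "dog")

heads = [a for a in annotations if is_a(a.label, "head")]
dog_heads = [a for a in annotations if is_a(a.label, "head/dog")]
print(len(heads), len(dog_heads))
```

Both boxes stay usable as training data for plain "head" whether or not the refinement has happened yet.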

bbernhard commented 6 years ago

back on the subject of labels, I was thinking about the generalisation of foo/bar again - 'head' might be a high value label because it's applicable to all animals (and a net could learn what's common between all heads - even without labelling rest of the animals..).

Good idea! I guess it makes sense to make that one productive next. It has btw. already 192 occurrences (https://github.com/bbernhard/imagemonkey-trending-labels/issues/50), so it's definitely a label that adds great value to the dataset :)

Initially I wanted to avoid broad labels like head, because I thought that more specific labels like dog/head would add more value (if someone labels an image with dog/mouth, we know that it's a mouth and that it belongs to a dog). I also thought that it might be easier for people to do the right thing in the annotation phase.

I could imagine that some people will start drawing a big bounding box over a person's head and a dog's head when there is a photo with a dog and a person sitting next to each other and the annotation task is: "annotate all: head". In that case it's probably also hard to refine that later. ("what type of head?" - it's a dog's head and a person's head). But I guess that's a general problem when one draws a big bounding box around multiple objects...:)

dobkeratops commented 6 years ago

a big bounding box over a person's head and a dog's head when there is a photo with a dog ... But I guess that's a general problem when one draws a big bounding box around multiple objects...:)

sure.. I can see this being a problem, the cases where a person is holding a dog etc. You have to encourage the user to indeed split them. If it was possible to do the hierarchical grouping, that might make it clearer, e.g. boxes around parts of the dog, then group those together.. I'm not sure what the easiest way to present all that is.

It might become clearer when there's an explicit option for head/heads

I was just trying to use my tool again after so long; even having written it the learning curve was steep :) in retrospect I would find it easier to draw the lower level bounds first, then select them and create a group holding them - but I coded it the other way round (draw the parent, draw the parts, pick the parent, then pick the parts)
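The "draw the parts first, then group them" order could look roughly like this. It is only a sketch: `enclosing_box`, `group` and the coordinates are invented for illustration and are not how the tool above actually works.

```python
from typing import Dict, List, Tuple

# Assumed box layout: (left, top, right, bottom).
Box = Tuple[int, int, int, int]


def enclosing_box(parts: List[Box]) -> Box:
    """Smallest box that contains every part box."""
    if not parts:
        raise ValueError("need at least one part to build a group")
    lefts, tops, rights, bottoms = zip(*parts)
    return (min(lefts), min(tops), max(rights), max(bottoms))


def group(parent_label: str, parts: Dict[str, Box]) -> dict:
    """Create the parent after the fact from already drawn parts."""
    return {
        "label": parent_label,
        "box": enclosing_box(list(parts.values())),
        "parts": dict(parts),
    }


# The user drew three small boxes, selected them and picked "dog" as the parent.
dog = group(
    "dog",
    {
        "head": (40, 10, 60, 30),
        "tail": (5, 20, 15, 28),
        "leg": (20, 35, 28, 60),
    },
)
print(dog["box"])  # (5, 10, 60, 60)
```

The parent box comes for free from the parts, so nobody has to draw the big outline by hand.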

Another idea would be to present the crops as separate images (so at the top level you just annotate person, dog - and the system knows that they both have parts, so it presents them as new sub-images for part annotation..).. but that wouldn't have the same ability to annotate unlabelled animals, hmmm.

something even fiddlier (but more powerful) would be to show an approximation of the skeleton ...again in my tool I tried to make it show links, but I haven't tried it out with a full tree of components (i'll probably want a shortcut for instantiating the whole thing) No idea what the best way to present that would be..

dobkeratops commented 6 years ago

It has btw. already 192 occurrences

so I'd been bouncing between conventions.. "head of person", "head of man", and just plain "head". initially I'd thought X of Y was a clearer way of saying 'X' is a part of 'Y' (whereas X/Y can mean more general overlapping meaning) but then there's plenty of cases of names with "of", e.g. glass of water (that's not a glass part belonging to water!)

I wonder if you can retroactively filter those out to whichever you prefer
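A retroactive filter along those lines might be sketched as below. The exception set `NOT_PART_OF` and the function `split_part_of` are hypothetical; a real list of "of" names that are not part/whole pairs would have to be curated by hand.

```python
from typing import Optional, Tuple

# Hypothetical, hand-maintained list of names where "of" does not mean "part of".
NOT_PART_OF = {"glass of water"}


def split_part_of(label: str) -> Optional[Tuple[str, str]]:
    """Return (part, whole) for labels like 'head of person', else None."""
    normalized = " ".join(label.lower().split())
    if normalized in NOT_PART_OF:
        return None
    part, separator, whole = normalized.partition(" of ")
    if not separator or not part or not whole:
        return None
    return part, whole


for raw in ["head of person", "Head of  man", "head", "glass of water"]:
    print(repr(raw), "->", split_part_of(raw))
```

Once a label is split into (part, whole) it can be rewritten into whichever convention is preferred, while plain "head" and the exceptions are left alone.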

bbernhard commented 6 years ago

sure.. I can see this being a problem, the cases where a person is holding a dog etc. You have to encourage the user to indeed split them. If it was possible to do the hierarchical grouping, that might make it clearer, e.g. boxes around parts of the dog, then group those together..

something even fiddlier (but more powerful) would be to show an approximation of the skeleton ...again in my tool I tried to make it show links, but I haven't tried it out with a full tree of components (i'll probably want a shortcut for instantiating the whole thing)

like that one!

I was just trying to use my tool again after so long; even having written it the learning curve was steep :) in retrospect I would find it easier to draw the lower level bounds first, then select them and create a group holding them - but I coded it the other way round (draw the parent, draw the parts, pick the parent, then pick the parts)

awesome! I still think that a powerful (offline) annotation tool would be totally awesome. Every once in a while I stumble across blog posts where people write about how they trained a neural net with their own images. Nearly every time they talk about the tools they used to annotate the images, they mention the same dated tools that aren't maintained anymore. So I guess there is definitely a need for a tool that makes the whole annotation process easier.

If I were to start the ImageMonkey annotation tool from scratch, I would probably strive for a looser coupling between the annotation tool and the backend. Currently, the annotation tool and the backend are tightly coupled...which makes it impossible to use the annotation tool for other applications. A cleaner separation with a nice API would definitely have been better, as other projects could have used it too.

bbernhard commented 6 years ago

so I'd been bouncing between conventions.. "head of person", "head of man", and just plain "head". initially I'd thought X of Y was a clearer way of saying 'X' is a part of 'Y' (whereas X/Y can mean more general overlapping meaning) but then there's plenty of cases of names with "of", e.g. glass of water (that's not a glass part belonging to water!)

That's indeed a tricky one.

I guess for the label graph, it won't matter much, as we can change the name later anyway. But you are right, it could indeed be important for the labeling step. Maybe it makes sense to add a short description to the label that makes the meaning clearer? The description could be shown if you hover over a label?

dobkeratops commented 6 years ago

a description might be good, another alternative is synonyms - along with the 'is a' / 'has' graph information that might be enough (also imagine finding the best visual examples for thumbnails)

"automobile .. isa vehicle has wheels door windscreen.. engine.."
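As a rough idea of how such 'is a' / 'has' information could end up in a dot file, here is a small Python sketch that prints Graphviz DOT text. The layout of the real graph.dot is not shown in this thread, so the edge labels "isa" and "has", the graph name and the helper `to_dot` are all assumptions.

```python
from typing import Dict, List


def to_dot(name: str, isa: Dict[str, List[str]], has: Dict[str, List[str]]) -> str:
    """Render 'is a' and 'has' relations as labelled edges of a DOT digraph."""
    lines = [f"digraph {name} {{"]
    for child, parents in isa.items():
        for parent in parents:
            lines.append(f'    "{child}" -> "{parent}" [label="isa"];')
    for whole, parts in has.items():
        for part in parts:
            lines.append(f'    "{whole}" -> "{part}" [label="has"];')
    lines.append("}")
    return "\n".join(lines)


# The automobile example from the comment above.
isa = {"automobile": ["vehicle"]}
has = {"automobile": ["wheels", "door", "windscreen", "engine"]}

dot = to_dot("labels", isa, has)
print(dot)
```

The printed text can be fed to Graphviz for a quick visual check before anything goes into a pull request.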