interesting feature in LabelMe- object component hierarchy

dobkeratops commented 7 years ago

Playing around with LabelMe a bit more, I found it does have a useful feature: the ability to assemble the labels into a hierarchical tree. Originally I didn't bother using it.

It has two ways to do it: in the list of labels (shown on the right side), you can drag-and-drop to make one label a part of another. Alternatively (and more conveniently), once you've created a label, you can click 'add parts' instead of 'done', and all subsequent labels are automatically added to that one (the 'current parent label' is displayed in bold in the list).

One interesting thing about this, of course, is that it makes the component names more accurate: car and bicycle both have 'wheels' ... there's no longer a need to separately name 'car wheel', 'bicycle wheel'.
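A minimal sketch of how such a parent/part tree could be represented. This is not LabelMe's actual data model; the `Label` class and its `add_part` and `path` methods are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Label:
    """One annotation; `parts` holds its child labels (hypothetical structure)."""
    name: str
    parts: List["Label"] = field(default_factory=list)
    parent: Optional["Label"] = field(default=None, repr=False, compare=False)

    def add_part(self, name: str) -> "Label":
        """Attach a new child label, like labelling in 'add parts' mode."""
        child = Label(name, parent=self)
        self.parts.append(child)
        return child

    def path(self) -> str:
        """Qualified name derived from the tree, e.g. 'car/wheel'."""
        if self.parent is None:
            return self.name
        return f"{self.parent.path()}/{self.name}"


car = Label("car")
bicycle = Label("bicycle")
car_wheel = car.add_part("wheel")
bike_wheel = bicycle.add_part("wheel")

print(car_wheel.path())   # car/wheel
print(bike_wheel.path())  # bicycle/wheel
```

Both parts are simply named 'wheel'; the qualified name comes from the position in the tree rather than from the label text.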

One thing I wonder is whether this could be used to refine the outline without poly editing, using 'more accurate component specifications' instead. E.g. it's very fiddly to draw a polygon outline (especially on a laptop); conversely, you can go in and label car components ('headlights, bumper, roof, bonnet', etc.) and group them together; imagine if it dropped from a bounding box to a bounding convex poly, and you could just add blobs (including unlabelled ones) to keep the overall outline.

Another idea: maybe the concept of 'components' in the label database could be used to hint this more. If your JSON file tells it which objects have potential components, then when you create a label whose name is a component, it could prompt you: 'which object does this belong to?'
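A rough sketch of that lookup, assuming a JSON layout where each object lists its possible components. The schema, `LABELS_JSON`, and both function names are made up for illustration and are not the project's actual label format.

```python
import json
from collections import defaultdict

# Hypothetical label database; the schema is an assumption for illustration.
LABELS_JSON = """
{
  "car":     {"components": ["wheel", "headlight", "bumper", "roof", "bonnet"]},
  "bicycle": {"components": ["wheel", "saddle", "handlebar"]}
}
"""


def build_component_index(labels):
    """Map each component name to the sorted list of objects that can own it."""
    index = defaultdict(list)
    for obj, spec in labels.items():
        for comp in spec.get("components", []):
            index[comp].append(obj)
    return {comp: sorted(objs) for comp, objs in index.items()}


def parent_candidates(label_name, index):
    """Objects to offer in a 'which object does this belong to?' prompt."""
    return index.get(label_name, [])


index = build_component_index(json.loads(LABELS_JSON))
print(parent_candidates("wheel", index))   # ['bicycle', 'car']
print(parent_candidates("bumper", index))  # ['car']
print(parent_candidates("tree", index))    # []
```

An empty result means the name is not a known component, so no prompt would be needed.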

bbernhard commented 7 years ago

Really cool ideas, thanks a lot!

I'll refer to the other points later, but this idea of yours got me thinking:

> imagine if it dropped from a bounding box to a bounding convex poly

I am a total beginner when it comes to image processing, but I vaguely remembered that OpenCV contains some really cool things when it comes to object detection. So I sat down for a few hours today to implement a "quick and dirty" proof of concept with Python and OpenCV.

What it basically does:

user selects an area of interest with the mouse
OpenCV's GrabCut algorithm is used to segment the image
find all contours in the image and grab the largest one (that's most probably the contour we are interested in)
get the convex hull of that contour
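The last two steps can be illustrated without OpenCV. The following is a pure-Python stand-in, not the proof of concept itself: it assumes the segmentation step has already produced a binary foreground mask, uses a flood fill in place of contour finding, and Andrew's monotone chain for the hull. In OpenCV the corresponding calls would be `cv2.grabCut`, `cv2.findContours` and `cv2.convexHull`.

```python
from collections import deque


def largest_region(mask):
    """Return the pixels (x, y) of the largest 4-connected foreground region."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            region, queue = [], deque([(x0, y0)])
            seen[y0][x0] = True
            while queue:  # breadth-first flood fill from (x0, y0)
                x, y = queue.popleft()
                region.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if len(region) > len(best):
                best = region
    return best


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in order around the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


# Toy mask: a ring-shaped object plus one stray foreground pixel at (5, 1).
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
hull = convex_hull(largest_region(mask))
print(hull)  # [(1, 1), (3, 1), (3, 3), (1, 3)]
```

The stray pixel is dropped because only the largest region is kept, which is the same heuristic as 'grab the largest contour'.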

The following two GIFs demonstrate that:

[GIF: outline1]

[GIF: outline2]

The main disadvantage I see is that we have to offload the generation of the polyline to the server (i.e. we don't get any snappy real-time processing). Another disadvantage is, of course, that it probably won't work equally well on every image (which can already be seen in the football GIF). But maybe we can tweak that further?

But I'm not sure whether it makes sense to invest time in implementing such a thing, or whether it would be a waste of time.

Another cool thing that I really love is Photoshop's "quick selection tool". If you are not familiar with it, here is a short demo video: https://www.youtube.com/watch?v=Nyr9dI8Zac8

Unfortunately I couldn't find much information on the algorithm online. The only thing that I found is this paper: https://www.microsoft.com/en-us/research/wp-content/uploads/2009/08/PaintSelection_SIGGRAPH09.pdf

dobkeratops commented 7 years ago

It does look pretty good, although the lower example illustrates how it can become ambiguous. I've not looked into OpenCV. There are examples like picking out a car from traffic. Nonetheless, what LabelMe does have is another mode where you scribble inside the object and it tries to fill out the region (again, it can still make errors, but it's interactive, and you can also use an erase tool to refine). I think that kind of tool works best with a stylus, really.

Re the server: I'm sure there'd be a viable workflow, e.g. you submit the bounding box, and you could have a whole extra mode to validate the result ('is this the object?').

I also wonder if other methods could be used to enhance object boundary detection: e.g. imagine using video (with extra edge detection in the velocity field) to train a neural net that's better at discerning object boundaries, without necessarily knowing what the objects are. With ideas like that (and OpenCV) in the background, I think it's OK to just accumulate rough bounding boxes at first.

I suppose one little tweak might be to draw a bounding box with a crosshair: if you know the crosshair is in the object centre, there's a higher chance of guessing the correct boundary later. (In cases of extreme occlusion the crosshair might be on another object, so you know you really need another approach.)
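One way the crosshair hint could be used later, sketched under some assumptions: the crosshair sits at the centre of the bounding box, candidate outlines arrive as polygons (lists of `(x, y)` vertices), and the function names are hypothetical. Instead of taking the largest outline blindly, it keeps only outlines that contain the crosshair, using a standard ray-casting point-in-polygon test.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test; points exactly on an edge may go either way."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_area(polygon):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def pick_outline(bbox, candidates):
    """Largest candidate containing the box centre, or None if there is none."""
    x, y, w, h = bbox
    crosshair = (x + w / 2, y + h / 2)
    hits = [c for c in candidates if point_in_polygon(crosshair, c)]
    return max(hits, key=polygon_area) if hits else None


bbox = (0, 0, 10, 10)                          # crosshair at (5, 5)
strip = [(0, 0), (10, 0), (10, 3), (0, 3)]     # area 30, misses the crosshair
centre = [(3, 3), (8, 3), (8, 8), (3, 8)]      # area 25, contains the crosshair

print(pick_outline(bbox, [strip, centre]) == centre)  # True
print(pick_outline(bbox, [strip]))                    # None
```

A `None` result corresponds to the case where no candidate outline matches the crosshair, so another approach is needed.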

bbernhard commented 7 years ago

> I also wonder if other methods could be used to enhance object boundary detection: e.g. imagine using video (with extra edge detection in the velocity field) to train a neural net that's better at discerning object boundaries, without necessarily knowing what the objects are. With ideas like that (and OpenCV) in the background, I think it's OK to just accumulate rough bounding boxes at first.

That might indeed be a promising approach :)

Regarding videos: if we consider using videos as a data source, we could also have a look at background subtraction. I haven't tried it yet, but this [1] looks like a really well-designed C++ library that implements several different background subtraction algorithms. I think there are many public traffic cams out there that could serve as a data source. It might be worth a try to run some test videos through it to see if we can use the output to create bounding rectangles automatically.

[1] https://github.com/andrewssobral/bgslibrary
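To make the idea concrete, here is a toy sketch of background subtraction followed by a bounding rectangle. It does not use bgslibrary or its API; it assumes grayscale frames given as nested lists, a per-pixel median as the background model, and an arbitrary threshold of 30. All names are illustrative.

```python
from statistics import median


def median_background(frames):
    """Per-pixel median over all frames (a very simple background model)."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(w)] for y in range(h)]


def foreground_mask(frame, background, threshold=30):
    """1 where the frame differs from the background by more than `threshold`."""
    return [[1 if abs(p - b) > threshold else 0 for p, b in zip(row, bg_row)]
            for row, bg_row in zip(frame, background)]


def bounding_rect(mask):
    """Smallest (x, y, width, height) covering all foreground pixels, or None."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        return None
    xs, ys = zip(*pts)
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


def make_frame(obj_x):
    """Synthetic 8x4 frame: dark road (10) with a bright 2x2 'car' (200)."""
    frame = [[10] * 8 for _ in range(4)]
    for y in (1, 2):
        for x in (obj_x, obj_x + 1):
            frame[y][x] = 200
    return frame


frames = [make_frame(x) for x in (0, 2, 4, 6)]  # the object moves left to right
background = median_background(frames)
rect = bounding_rect(foreground_mask(frames[1], background))
print(rect)  # (2, 1, 2, 2)
```

The median works here because each pixel shows the moving object in at most one of the four frames; real traffic footage would need the more robust models such a library provides.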

dobkeratops commented 7 years ago

That would do it: video -> background subtraction -> edge detector on the 'foreground mask' -> train a net to figure out the 'foreground-mask edges' from the corresponding static images -> then use that as a means of extracting objects in any other images. Another angle would be to train something using stereo images.
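The 'edge detector on the foreground mask' step could be as simple as marking foreground pixels that touch background. The sketch below assumes a binary mask as nested lists and 4-connectivity; `mask_edges` is a hypothetical helper, and the resulting edge map would be the training target paired with the matching still frame.

```python
def mask_edges(mask):
    """Mark foreground pixels that touch background or the image border."""
    h, w = len(mask), len(mask[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                outside = not (0 <= nx < w and 0 <= ny < h)
                if outside or not mask[ny][nx]:
                    edges[y][x] = 1
                    break
    return edges


# 3x3 foreground block inside a 5x5 image.
block = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
edge_map = mask_edges(block)
for row in edge_map:
    print(row)
```

Only the interior pixel at (2, 2) is left unmarked; the eight pixels around it form the boundary.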

bbernhard commented 7 years ago

I like the idea. I think I'll create a separate ticket for that... could be worth a try :)

btw: just stumbled across the LabelMe Annotation Tool source code. Wasn't aware that it is also open source. https://github.com/CSAILVision/LabelMeAnnotationTool

ImageMonkey / imagemonkey-core

interesting feature in LabelMe- object component hierarchy #32