Cartucho / OpenLabeling

Label images and video for Computer Vision applications
Apache License 2.0
918 stars, 265 forks

New Features Discussion #3

Open Cartucho opened 6 years ago

Cartucho commented 6 years ago

The purpose of this tool is to make labeling as easy and fast as possible.

Initial ideas:

Share your opinions here.

MattKleinsmith commented 6 years ago

Video object tracking and feature matching / homography would greatly increase labeling efficiency for those labeling frames from a video. You could label one frame and then get 30 frames labeled for free, for example.
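As a rough illustration of the homography idea, here is a minimal sketch that carries a box from one frame to the next. It assumes a 3x3 homography between the two frames is already known; in practice it would have to be estimated from feature matches (for example with OpenCV's cv2.findHomography). The function names and the example matrix are hypothetical, not code from this repo.

```python
def apply_homography(H, x, y):
    """Map point (x, y) through the 3x3 homography H (nested lists, row-major)."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v


def propagate_bbox(H, bbox):
    """Warp the four corners of bbox = (xmin, ymin, xmax, ymax) and return
    the axis-aligned box that encloses them in the next frame."""
    xmin, ymin, xmax, ymax = bbox
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    warped = [apply_homography(H, x, y) for x, y in corners]
    xs = [p[0] for p in warped]
    ys = [p[1] for p in warped]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical example: the scene content shifts 5 px right and 2 px down.
H_shift = [[1.0, 0.0, 5.0],
           [0.0, 1.0, 2.0],
           [0.0, 0.0, 1.0]]
next_box = propagate_bbox(H_shift, (10, 20, 50, 80))
```

A single homography only describes planar or distant scenes well, so a label propagated this way would probably still need a manual check every few frames.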

I don't know what "Use pre-labeled images from Yolo v2" means. Does this mean use Yolo v2 or another pretrained network to pre-label the images, and then allow the user to resize or correct those labels, maybe increasing labeling efficiency? I'm unsure about this one because I think the screen might be cluttered, the modifications might take just as much work as pure labeling, and one is limited to the classes of the pre-trained network for pre-labels. I'm attracted to this theme though: using pre-trained models to increase labeling efficiency. I think it can be done somehow. And now I remember this video: https://youtu.be/t4kyRyKyOpo?t=13m3s (the labeling idea is shown from 13m3s to 14m3s). The idea in this video is classification-based, not bounding-box-based, but many bounding-box networks treat the problem like a classification problem by considering many regions.

Superpixel Segmentation: this one seems to be good for many project types. I think with this, one could convert a single click into a rough segmentation, and then automatically turn that into a bounding box, for bounding box projects. And for segmentation projects, one could click a few more times to get a precise segmentation.
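To make the "single click to rough segmentation to bounding box" flow concrete, here is a small sketch. Note that it swaps in a plain flood fill (region growing on gray values) as a stand-in for real superpixel segmentation, and the toy image, tolerance, and function names are all assumptions made for illustration.

```python
from collections import deque


def region_from_click(image, seed, tol=10):
    """Grow a 4-connected region from seed=(row, col), accepting pixels whose
    gray value is within tol of the seed pixel. Returns a set of (row, col)."""
    rows, cols = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_val) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


def bbox_of_region(region):
    """Tight box (xmin, ymin, xmax, ymax) around a set of (row, col) pixels."""
    rs = [r for r, _ in region]
    cs = [c for _, c in region]
    return (min(cs), min(rs), max(cs), max(rs))


# Toy 5x6 grayscale image: a bright object (about 200) on a dark background (0).
image = [
    [0, 0,   0,   0,   0, 0],
    [0, 0, 200, 200,   0, 0],
    [0, 0, 200, 205, 198, 0],
    [0, 0,   0, 200,   0, 0],
    [0, 0,   0,   0,   0, 0],
]
click = (2, 3)  # (row, col) of the user's single click
region = region_from_click(image, click)
bbox = bbox_of_region(region)
```

For a segmentation project the region itself would be the label, and extra clicks could add or remove pixels; for a bounding box project only the enclosing box is kept.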

Grabcut segmentation: this one seems focused on segmentation projects at the expense of bounding box projects. I value superpixel segmentation more, since it's more flexible.

I'm working on a bounding box project that involves video, so I'm biased towards video object tracking and feature matching / homography.

Cartucho commented 6 years ago

@MattKleinsmith What do you think of changing the name of the repo to: a) SmartLabeling b) OpenLabeling

The goal of this tool is independent of Yolo, so I think the name should be changed. And it's better to change it at the beginning, before the repo starts getting referenced.

MattKleinsmith commented 6 years ago

Between a and b I prefer b. It seems more welcoming.

gmanolak commented 6 years ago

Is there a way to erase a box surrounding the object?

Cartucho commented 6 years ago

@gmanolak double click to select the bbox, and then click the x.

rcabg commented 5 years ago

Hey @Cartucho and @MattKleinsmith,

I'm working on the video tracking feature. I have something but I need some feedback before continuing. Should I create a pull request and discuss there?

Cheers!

Cartucho commented 5 years ago

Hello @rcabg that would be great! Please make it a separate PR.

I have also made a draft version so we could merge them together and hopefully find a way for adding that feature.

vuthede commented 5 years ago

I have implemented "click-drag resize" for the bbox, instead of the current "quick delete" -> "draw new one" workflow. I found it useful in one case: when the tracker creates inexact bboxes for some frames, we can resize them without deleting them, because deleting currently causes the boxes in later frames to be deleted as well. If you find it useful, I will clean up the code a bit and make a pull request.
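For readers following along, a click-drag resize can be sketched roughly as below. This is not the code from the pull request; it is a minimal illustration that assumes boxes are stored as (xmin, ymin, xmax, ymax) and that a corner can be grabbed within a hypothetical 8-pixel radius.

```python
def pick_corner(bbox, x, y, radius=8):
    """Return which corner of bbox=(xmin, ymin, xmax, ymax) lies within
    `radius` pixels of (x, y), as a pair of indices into bbox, or None."""
    xmin, ymin, xmax, ymax = bbox
    corners = {(0, 1): (xmin, ymin), (2, 1): (xmax, ymin),
               (0, 3): (xmin, ymax), (2, 3): (xmax, ymax)}
    for idx, (cx, cy) in corners.items():
        if abs(cx - x) <= radius and abs(cy - y) <= radius:
            return idx
    return None


def drag_corner(bbox, corner, x, y):
    """Move the picked corner to (x, y) and return a normalized box, so that
    xmin <= xmax and ymin <= ymax even if the corner crosses the opposite side."""
    box = list(bbox)
    box[corner[0]] = x
    box[corner[1]] = y
    return (min(box[0], box[2]), min(box[1], box[3]),
            max(box[0], box[2]), max(box[1], box[3]))


box = (100, 100, 200, 180)
corner = pick_corner(box, 203, 177)            # press near the bottom-right corner
resized = drag_corner(box, corner, 230, 210)   # drag it outwards
```

Because the box is edited in place rather than deleted and redrawn, the labels on later frames do not have to be touched.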

Cartucho commented 5 years ago

Yes! That was on my TODO list haha! great! Make a PR and I will test and help you out.

vuthede commented 5 years ago

Yeah, I will clean up the code a bit and make a pull request! Have you planned to integrate the tracker with a state-of-the-art deep learning object detection model, to reduce manual labeling as much as possible? For example, we may have a very long video (a whole day, for instance) and want the program to label it automatically for us. Then we can re-label afterwards. I think this could save a lot of labeling time.

Cartucho commented 5 years ago

Yeah, I agree with you, although we would need to test its usability. As was argued earlier in this thread, fixing automatically generated labels may turn out to be as hard as pure labeling.

There are great deep-learning trackers, better than the ones currently implemented in OpenCV. A state-of-the-art tracker would greatly improve the predictions of each object's bounding box in a video.

vuthede commented 5 years ago

I have made a pull request for click-drag resizing. Would you mind taking a look?

Cartucho commented 5 years ago

@vuthede Now with the resizing we can improve the video tracker a lot (so that when one resizes a box, it re-adjusts the other associated labels).

Also, another great feature would be if we allowed the users to label with a single click and drag.

vuthede commented 5 years ago

Yeah, I thought about the first one too. Could you clarify "label with a single click and drag"?

Cartucho commented 5 years ago

Instead of having to click twice per bounding box, e.g.:

  1. click: left-top
  2. click: right-bottom

the user could click only once and drag the mouse, e.g.:

  1. click: left-top
  2. move mouse with click 1 still pressed
  3. release click 1.
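
The three steps above can be sketched as a small state machine. This is only an illustration: the class and method names are hypothetical, and in an OpenCV window the three handlers would presumably be called from a cv2.setMouseCallback handler on cv2.EVENT_LBUTTONDOWN, cv2.EVENT_MOUSEMOVE and cv2.EVENT_LBUTTONUP.

```python
class DragBoxTool:
    """Tracks one press-drag-release gesture and produces a bounding box."""

    def __init__(self):
        self.start = None      # where the button went down, or None when idle
        self.current = None    # latest mouse position during the drag

    def on_press(self, x, y):
        self.start = (x, y)
        self.current = (x, y)

    def on_move(self, x, y):
        if self.start is not None:   # ignore movement when no button is held
            self.current = (x, y)

    def preview(self):
        """Box to draw while dragging, or None when idle."""
        if self.start is None:
            return None
        (x0, y0), (x1, y1) = self.start, self.current
        return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))

    def on_release(self, x, y):
        """Finish the gesture; returns the box, or None for a zero-area drag."""
        if self.start is None:
            return None
        self.current = (x, y)
        box = self.preview()
        self.start = self.current = None
        if box[0] == box[2] or box[1] == box[3]:
            return None
        return box


tool = DragBoxTool()
tool.on_press(40, 30)             # 1. click: left-top
tool.on_move(90, 70)              # 2. move mouse with the button still pressed
live = tool.preview()             # box shown to the user during the drag
box = tool.on_release(120, 95)    # 3. release
```

Normalizing with min/max means the drag can start from any corner, not only the left-top one.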

vuthede commented 5 years ago

Oh yeah, thanks, I understand. If you plan to work on other features, I think I can help you finish some of the ones you just mentioned.

Cartucho commented 5 years ago

Also, another cool thing to add would be the option between 1. rectangle (object detection) versus 2. pixel labeling (image segmentation).

Zuzuske commented 4 years ago

It would be cool to have an option to choose which file type the draw_bboxes_from_file method draws the boxes from: .xml or .txt.
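A rough sketch of what such an option could look like is below. It assumes the .txt files use the YOLO darknet layout (class id plus normalized center, width and height) and the .xml files use the PASCAL VOC layout; the helper names are hypothetical and this is not the actual draw_bboxes_from_file implementation.

```python
import os
import xml.etree.ElementTree as ET


def parse_yolo_txt(text, img_w, img_h):
    """Parse lines of 'class x_center y_center width height' (normalized 0..1)
    into (class_id, xmin, ymin, xmax, ymax) in pixels."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        cls = int(parts[0])
        xc, yc, w, h = (float(p) for p in parts[1:])
        boxes.append((cls,
                      round((xc - w / 2) * img_w), round((yc - h / 2) * img_h),
                      round((xc + w / 2) * img_w), round((yc + h / 2) * img_h)))
    return boxes


def parse_voc_xml(text):
    """Parse PASCAL VOC <object> entries into (name, xmin, ymin, xmax, ymax)."""
    boxes = []
    for obj in ET.fromstring(text).iter("object"):
        bb = obj.find("bndbox")
        coords = [int(float(bb.findtext(k)))
                  for k in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return boxes


def load_boxes(path, text, img_w, img_h):
    """Pick the parser from the file extension the user chose."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".txt":
        return parse_yolo_txt(text, img_w, img_h)
    if ext == ".xml":
        return parse_voc_xml(text)
    raise ValueError("unsupported annotation format: " + ext)


yolo_text = "0 0.5 0.5 0.25 0.5\n"
voc_text = ("<annotation><object><name>dog</name><bndbox>"
            "<xmin>240</xmin><ymin>120</ymin><xmax>400</xmax><ymax>360</ymax>"
            "</bndbox></object></annotation>")
from_txt = load_boxes("frame_0.txt", yolo_text, 640, 480)
from_xml = load_boxes("frame_0.xml", voc_text, 640, 480)
```

One difference to keep in mind: the .txt path yields a numeric class id while the .xml path yields a class name, so the drawing code would need a class list to show both the same way.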

Cartucho commented 4 years ago

@VytautasDV one of the users said he would submit a PR for this #59

zhangqinghao0811 commented 4 years ago

Both moving a box boundary and dragging the image use the right mouse button, so they clash. I think using the middle mouse button to drag the image would be more convenient.