introlab / find-object

Find-Object project
http://introlab.github.io/find-object/
BSD 3-Clause "New" or "Revised" License

Color independent object detection #21

Open manuschiller opened 8 years ago

manuschiller commented 8 years ago

Hi! Your repo is awesome, I learned a lot about object detection by reading your code.

I am wondering if all descriptors are color dependent. While the logo with a brown background is detected, the one with a black background is not. [image: color dependent detection]

I would rather detect the shape of my logo than the color contrast, so that I can detect it against any background.

Is that possible with the detection algorithms used by find-object or will I have to use haar cascades for that?

Thanks, manu

matlabbe commented 8 years ago

Hi Manu,

The features are contrast dependent. They are computed on a grayscale version of the image, so inverting black and white in an image results in different features. See the papers of the selected features to know how robust they are to different contrast/illumination conditions.
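To illustrate why inverted contrast gives different features, here is a toy sketch (not find-object code). It assumes a descriptor built from local gradient orientations, as SIFT is, and uses a hypothetical one-row "image". Inverting the contrast negates every gradient, so the orientations rotate by 180 degrees and the descriptor no longer matches.

```python
# Toy illustration only: a single grayscale row instead of a real image.
# The helper names are hypothetical and not part of find-object or OpenCV.

def horizontal_gradient(row):
    """Central-difference gradient along one row of grayscale pixels."""
    return [row[i + 1] - row[i - 1] for i in range(1, len(row) - 1)]

def invert(row, max_value=255):
    """Invert contrast: dark-on-light becomes light-on-dark."""
    return [max_value - p for p in row]

# A dark-to-bright edge in an 8-bit grayscale row.
row = [10, 10, 10, 200, 200, 200]

grad = horizontal_gradient(row)
grad_inverted = horizontal_gradient(invert(row))

print(grad)           # [0, 190, 190, 0]
print(grad_inverted)  # [0, -190, -190, 0]
```

The edge is at the same place in both rows, but the gradient sign is flipped, which is why the same logo on a much darker or lighter background can produce different descriptors.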

You could have multiple instances of the same object with different contrast/illumination. If one of the versions in the set is detected, then the object is detected. In your example above, if one or the other is detected, your logo is detected. You may also add intermediate contrast variations of the object if you want to be more robust to environmental illumination variation.

In a way, the Haar cascade detector takes a similar approach: it is trained on a lot of positive images of the same object (with different contrast, scale, illumination, ...) and on negative images.
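A minimal sketch of that idea, assuming the application receives the IDs of the detected object images and keeps its own table mapping each variant to a logical object. The function name and the IDs below are hypothetical, not part of find-object.

```python
def detected_logical_objects(detected_variant_ids, variant_to_object):
    """Return the logical objects for which at least one variant was detected."""
    return {
        variant_to_object[v]
        for v in detected_variant_ids
        if v in variant_to_object
    }

# Hypothetical table: three contrast variants of the same logo, one other object.
variant_to_object = {
    1: "logo",   # logo on brown background
    2: "logo",   # logo on black background
    3: "logo",   # intermediate contrast
    4: "other",
}

# Only the black-background variant matched, the logo still counts as detected.
print(detected_logical_objects([2], variant_to_object))  # {'logo'}
```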

cheers

manuschiller commented 8 years ago

@matlabbe thanks for your response! Would you rather recommend using HAAR cascades or Feature Detection via KAZE / AKAZE for my use case?

Should the images be preprocessed for better results (e.g. remove background, change contrast, etc.)? I am aiming for a result similar to this video: https://www.youtube.com/watch?v=nzrqHqB-dLM

Currently I am reading through the papers to understand what the different parameters of the KAZE algorithm actually do, but it is quite hard for me to get proper results.

I also noticed that the image quality of the scene in which the logo should be detected plays an important role. If I have an image at, let's say, 2000x2000 px, the logos get detected easily. If I resize it to 500x500 px, the algorithms mostly fail.

Do you have any recommendations on how to approach feature detection for best results?

Thanks a lot for your help!

Best, Manu

matlabbe commented 8 years ago

Hi Manu,

In the video they are using SIFT + color descriptors. You may get similar results with SIFT alone. I don't know how KAZE performs compared to SIFT. SIFT features are scale invariant, but there is a limit. The best may be to also have the same object at different resolutions. You can also modify the parameters of the detector to find features in lower resolution images.

As you can see in the video, when the object becomes blurry, detection is lost. Motion blur can be another problem.

Maybe the best would be to try different approaches and use the one that fits your project best. Depending on what you are doing, maybe a TLD tracker could also be used.
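One way to prepare the same object at several resolutions is to plan a small pyramid of sizes and resize the object image to each of them. This sketch only computes the sizes. The halving factor and the 64 px lower bound are assumptions chosen for illustration, and the actual resizing would be done with an image library (for example `cv2.resize` in OpenCV).

```python
def pyramid_sizes(width, height, scale=0.5, min_side=64):
    """List (width, height) pairs from full size down to min_side.

    scale and min_side are illustrative defaults, not find-object settings.
    """
    if not 0 < scale < 1:
        raise ValueError("scale must be between 0 and 1")
    sizes = []
    w, h = width, height
    while min(w, h) >= min_side:
        sizes.append((w, h))
        w, h = int(round(w * scale)), int(round(h * scale))
    return sizes

# Starting from a 2000x2000 object image.
for size in pyramid_sizes(2000, 2000):
    print(size)
```

Each extra resolution adds more descriptors to match against, so detection time is expected to grow with the number of levels.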

cheers

manuschiller commented 8 years ago

Thanks! Providing the objects with multiple backgrounds and in different resolutions did the trick for me. The only problem is that it slows down the detection tremendously.

If I save the vocabulary of the objects, the file is pretty big (14MB). Is that normal?

Thanks a lot for your help, you helped a lot so far!

matlabbe commented 8 years ago

Hi,

If you use "File->Save vocabulary...", it saves the descriptors in plain text format (YAML). Depending on the vocabulary size and the descriptor size, it can indeed take a lot of space. For example, 550 visual words of size 128 (floats) would output a ~1.3 MB file.

You may want to save the session in binary format instead: "File->Save session...". The example above would require 660 KB instead (including the objects' images).
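The numbers above can be checked with some rough arithmetic. This sketch assumes 32-bit floats and 1 MB = 10^6 bytes. The extrapolation to a 14 MB file further assumes 128-float descriptors and the same text overhead per value, which may not hold for other descriptor types.

```python
WORDS = 550            # visual words in the example
DIM = 128              # floats per descriptor in the example
BYTES_PER_FLOAT32 = 4  # assumption: descriptors stored as 32-bit floats

values = WORDS * DIM
raw_binary_bytes = values * BYTES_PER_FLOAT32

yaml_bytes = 1.3e6                    # "~1.3 MB", taking 1 MB = 10**6 bytes
chars_per_value = yaml_bytes / values  # text cost of one float in the YAML file

def estimated_words(file_bytes, dim=DIM, per_value=chars_per_value):
    """Rough vocabulary size for a YAML file, under the assumptions above."""
    return round(file_bytes / (dim * per_value))

print(values)                     # 70400
print(raw_binary_bytes)           # 281600
print(round(chars_per_value, 1))  # 18.5
print(estimated_words(14e6))      # rough word count for a 14 MB file
```

So the raw descriptors are about 282 KB in binary, while plain text costs roughly 18 characters per float. Under these assumptions a 14 MB vocabulary file would correspond to a few thousand visual words.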

cheers