insikk / Guide-CBIR

Guide to the Content Based Image Retrieval

Hello Insik #1

Open willard-yuan opened 6 years ago

willard-yuan commented 6 years ago

Hi Insik,

I'm Yong Yuan, from China. I see that you also focus on image retrieval and have done a lot of work on CBIR. I'd like to discuss CBIR techniques with you. Which IM software do you use? Telegram?

insikk commented 6 years ago

Hi @willard-yuan. I don't use IM nowadays. If your topic is relevant to CBIR, we can discuss it in this repo. Feel free to communicate by opening issues here. I am glad to see more discussion about CBIR happening in this repository.

willard-yuan commented 6 years ago

That's a good habit. IM wastes time. I see that you focus on local features such as SIFT (COLMAP/geometric_burstiness), DELF, correspondence matching, and encoding local features into global features. These are very important for CBIR, but CBIR is not yet mature in industry: current methods still don't work well on large-scale datasets, where retrieval performance such as mAP is low, especially for instance retrieval, object retrieval, and near-duplicate search. I tried to build something for CBIR, such as cnn-cbir-benchmark, but I find it means little for CBIR. We need to explore methods that bring CBIR to a bigger stage 😆😆

keloli commented 6 years ago

Hi @willard-yuan, I did some research work on CBIR from February to August 2017. My image retrieval system is still based on the BOW model. I used 10 thousand images to build my dictionary (SIFT) and built a kd-tree for faster retrieval. I also talked with a professor at CAS-ICT who got impressive results on ImageNet. He told me that his team's best model is also based on BOW instead of CNN. However, I still think the potential of convolutional features or other deep features will bring the CBIR task to a new stage. Hoping for more communication~
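For readers new to the BOW approach mentioned above, here is a minimal sketch of the quantization step, not keloli's actual system. It assumes the dictionary (visual words) has already been built, for example by clustering SIFT descriptors, and it uses toy 2-D vectors in place of real 128-D SIFT descriptors. The function names are hypothetical, and the brute-force nearest-word search stands in for the kd-tree lookup.

```python
import math
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the closest visual word (brute force; a kd-tree would speed this up)."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(descriptor, vocabulary[i]))


def bow_histogram(descriptors, vocabulary):
    """Count how many descriptors fall on each visual word, then L2-normalize."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    hist = [float(counts.get(i, 0)) for i in range(len(vocabulary))]
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def cosine_similarity(h1, h2):
    """Dot product of two histograms; equals cosine similarity when both are L2-normalized."""
    return sum(a * b for a, b in zip(h1, h2))


# Toy data (assumption): two visual words and two small images.
vocabulary = [[0.0, 0.0], [10.0, 10.0]]
image_a = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
image_b = [[0.5, 0.5], [10.0, 9.0]]

hist_a = bow_histogram(image_a, vocabulary)
hist_b = bow_histogram(image_b, vocabulary)
print(cosine_similarity(hist_a, hist_b))
```

Ranking database images by this similarity against the query histogram gives a basic BOW retrieval baseline; real systems usually add TF-IDF weighting and an inverted index.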

willard-yuan commented 6 years ago

@keloli BOW is a good choice for middle-scale datasets, but it is hard to adapt to very large datasets, such as billion-scale ones. I have done a lot of experiments with BOW/VLAD/FV on small datasets such as the Oxford Buildings dataset, and the results seem OK, but performance drops heavily on middle-scale datasets, such as a dataset of one million images.

keloli commented 6 years ago

@willard-yuan Thanks for your advice. I have no experience with million-image datasets. But indeed, it is important to experiment together on the same dataset. Also, I don't know whether the precision of search results decreases when we use a deep model (some search engines just recommend similar pictures instead of pictures with the same content). One of our lab's recent works is about training a network to judge whether two pictures are similar. Maybe this work will be a step toward solving the CBIR task with a deep model. Finally, could you point me to a website for downloading such a middle-scale dataset (around a million images)? :smile:

insikk commented 6 years ago

@willard-yuan @keloli Nice to meet you guys. I am also interested in solving the content-based image retrieval problem. The reason I am studying BoW and local-feature-based work is that I want to know what great ideas have been proposed to solve the CBIR problem. I have also observed that state-of-the-art methods and top models use CNN-based features. I made a leaderboard page to track not only CNN-based models but also local-feature-based models.

I am not sure about large-scale image datasets. The Oxford benchmark is almost 10 years old and is almost solved, considering that SOTA models achieve >90% mAP. I wish some group would publish a new dataset that is both larger and of higher quality.
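Since mAP comes up several times in this thread, here is a minimal sketch of how it can be computed from ranked results. This is the simple non-interpolated form. The official Oxford evaluation has extra details that this sketch omits, such as ignoring "junk" images. The function names and toy image ids are assumptions for illustration.

```python
def average_precision(ranked_ids, relevant_ids):
    """Non-interpolated AP: mean of precision@k taken at each rank k where a relevant image appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Dividing by the number of relevant images penalizes relevant images that were never retrieved.
    return precision_sum / len(relevant)


def mean_average_precision(results):
    """results: list of (ranked_ids, relevant_ids) pairs, one pair per query."""
    if not results:
        return 0.0
    return sum(average_precision(r, g) for r, g in results) / len(results)


# Toy queries (assumption): ranked result lists and ground-truth relevant sets.
queries = [
    (["a", "x", "b", "y"], {"a", "b"}),  # AP = (1/1 + 2/3) / 2, about 0.83
    (["x", "c"], {"c"}),                 # AP = 1/2
]
print(mean_average_precision(queries))
```

A benchmark is "almost solved" in this sense when nearly every relevant image is ranked ahead of the irrelevant ones for nearly every query.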

Good luck on your research!

keloli commented 6 years ago

Hi, guys! @insikk @willard-yuan I found an interesting competition: the Google Landmark Retrieval Challenge on Kaggle. This competition is part of CVPR 2018; maybe you will be interested in it.

abhigoku10 commented 6 years ago

@willard-yuan @insikk @keloli Hi guys, I am new to this topic and have a few questions. I would appreciate it a lot if you could answer them:

  1. During my survey I found topics like image tagging, image retrieval, image captioning, and CBIR. What is the difference between them?
  2. For my application, I need to analyze an image and generate words (tags) for it.
  3. Does this involve training an image model with a CNN and a word model with an RNN, or how should it be done?

Thanks in advance.

insikk commented 6 years ago

@abhigoku10 Thank you for your interest in this topic.

  1. I am not sure about image tagging. For the others:
    • image retrieval: from a large image collection, find images related to the given query image.
    • CBIR: image retrieval where a related image is defined as one containing the same object (instance) as the query image.
    • image captioning: describe the image in natural language.

Both CBIR and image captioning may need semantic understanding of the image.

2, 3. You may want to read this: Show and Tell: A Neural Image Caption Generator, by Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. https://arxiv.org/abs/1411.4555 (PDF: https://arxiv.org/pdf/1411.4555.pdf)

Start from here, and try to find recent papers citing this one.

Good luck on your journey~ :smile:

abhigoku10 commented 6 years ago

@insikk Thanks for the response. One more question: can I use Faster R-CNN, YOLO, and other networks for the detection? Can you share any link you have gone through that gives an overview of this?

insikk commented 6 years ago

@abhigoku10

Here is a good place to start with various modern detectors, including Faster R-CNN and the YOLO-style SSD. YOLO and SSD are not the same, but they are quite similar.

TensorFlow code with paper

paper: Speed/accuracy trade-offs for modern convolutional object detectors, by Jonathan Huang, Vivek Rathod, Chen Sun, Menglong Zhu, Anoop Korattikara, Alireza Fathi, Ian Fischer, Zbigniew Wojna, Yang Song, Sergio Guadarrama, and Kevin Murphy. https://arxiv.org/abs/1611.10012

code: https://github.com/tensorflow/models/tree/master/research/object_detection

willard-yuan commented 6 years ago

Good work. 🙃