Text & Content Based Image Retrieval

prachii-15 commented 2 months ago

Title

Team Name

DPRS

Email

202318008@daiict.ac.in

Team Member 1 Name

Prachi Mehta

Team Member 1 Id

202318008

Team Member 2 Name

Dhruvi Mehta

Team Member 2 Id

202318003

Team Member 3 Name

Riya Dave

Team Member 3 Id

202318011

Team Member 4 Name

Satyam Maravaniya

Team Member 4 Id

202318026

Problem Statement

The goal of this project is to create an image retrieval system using deep learning, with CNNs for visual feature extraction and LSTMs for caption generation. The system will allow users to search for images either with a natural language description or by uploading an image to find similar results. Using the Flickr8K dataset, the model aims to improve retrieval accuracy by matching the semantic content of text and image inputs against the dataset. The system will have a Graphical User Interface (GUI) as well as a command-line interface, making it accessible to a wide range of users. By merging classical image retrieval methods with deep learning models, this project offers a user-friendly and efficient way to search for images using text or visual content.
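The matching step described above can be sketched as a nearest-neighbour search over feature vectors. This is only an illustration, assuming that text queries and images have already been mapped to vectors in a shared space; the proposal does not say how that mapping is done. The names `cosine_similarity`, `retrieve`, `toy_index`, and `toy_query`, as well as the vector values and file names, are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, treat it as dissimilar
    return dot / (norm_a * norm_b)


def retrieve(query_vec, index, k=3):
    """Return the k image ids whose vectors are most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), image_id)
        for image_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:k]]


# Hypothetical 3-dimensional features; real CNN features are much larger.
toy_index = {
    "dog_on_beach.jpg": [0.9, 0.1, 0.0],
    "child_on_swing.jpg": [0.1, 0.8, 0.3],
    "dog_in_snow.jpg": [0.7, 0.0, 0.6],
}
toy_query = [1.0, 0.0, 0.1]  # stands in for an encoded text or image query

top_images = retrieve(toy_query, toy_index, k=2)
print(top_images)
```

The same `retrieve` call would serve both input modes, since only the way the query vector is produced differs between a typed description and an uploaded image.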

Evaluation Strategy

BLEU (Bilingual Evaluation Understudy) scores, a common metric for evaluating the quality of text generation tasks such as machine translation or image captioning.

Dataset

https://www.kaggle.com/datasets/adityajn105/flickr8k

Resources

Paper Title: Text-based, Content-based, and Semantic-based Image Retrievals: A Survey, Paper Link: https://www.researchgate.net/publication/273258916_Text-based_Content-based_and_Semantic-based_Image_Retrievals_A_Survey

Paper Title: Content-Based Image Retrieval Research, Paper Link: https://www.researchgate.net/publication/257706512_Content-Based_Image_Retrieval_Research

parth126 commented 2 months ago

10% Penalty for late submission.

Proposed project: UI -> Image upload -> OCR -> Search on noisy text. UI + OCR is not a contribution to an IR project. Retrieval based on noisy OCR text is a valid problem, assuming there is a dataset available to experiment with.

Riya-Dave1 commented 2 months ago

The goal of our project is to create an image retrieval system using deep learning (CNN & LSTM), focusing on two primary input methods: natural language descriptions and image queries. The system allows users to search for relevant images either by typing a description or by uploading an image.

For this we use the Flickr8K dataset, which contains images and associated captions (https://www.kaggle.com/datasets/adityajn105/flickr8k).

parth126 commented 2 months ago

BLEU for evaluating retrieved images makes no sense. Seems like a blindly copy-pasted proposal.

Satyammaravaniya commented 2 months ago

Evaluation

Mean Reciprocal Rank (MRR): The average, over all queries, of the reciprocal of the rank at which the first relevant image appears in the retrieved results.

Mean Average Precision (MAP): A comprehensive metric that takes into account the ranking of all relevant images across the entire retrieved set, averaged over queries.

Normalized Discounted Cumulative Gain (NDCG): A metric for the ranking quality of the retrieved images. It gives higher scores when relevant images appear earlier in the ranked list and discounts relevant images that appear later.
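A minimal sketch of how these three metrics could be computed is shown below. It assumes that each query comes with a ranked list of relevance labels (1 for relevant, 0 for not relevant), which the thread does not confirm exists for this dataset. The function names and the `toy_runs` labels are hypothetical. Two simplifications are made: average precision is normalized by the number of relevant images found in the list, and the ideal ordering for NDCG is taken from the same list, so both assume that every relevant image was retrieved.

```python
import math


def reciprocal_rank(relevance):
    """1 / rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, rel in enumerate(relevance, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def average_precision(relevance):
    """Mean of precision@rank taken at each relevant position (binary labels)."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    # Assumption: all relevant items appear somewhere in this list.
    return total / hits if hits else 0.0


def ndcg(relevance, k=None):
    """DCG of the ranking divided by the DCG of the ideal ordering.

    Uses linear gains, so graded labels (0, 1, 2, ...) also work.
    """
    ranked = relevance[:k] if k else relevance
    ideal = sorted(relevance, reverse=True)[:len(ranked)]

    def dcg(gains):
        return sum(g / math.log2(i + 1) for i, g in enumerate(gains, start=1))

    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical relevance labels for two queries, in ranked order.
toy_runs = [[0, 1, 0, 1], [1, 0, 0, 0]]

mrr = mean(reciprocal_rank(r) for r in toy_runs)
map_score = mean(average_precision(r) for r in toy_runs)
mean_ndcg = mean(ndcg(r) for r in toy_runs)
print(mrr, map_score, mean_ndcg)
```

With binary labels, MRR and MAP need only a yes/no judgement per retrieved image, while NDCG becomes more informative when graded judgements are available.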

parth126 commented 2 months ago

@Satyammaravaniya Can you explain further how MAP and NDCG will be calculated? Does the original dataset have relevance judgements? Are these relevance judgements binary or ranked? Please provide a link to the dataset having such relevance judgements.

dhruvi-m commented 2 months ago

Sir, we have changed our project, so we will upload the new project on GitHub.

parth126 commented 1 month ago

Closing this as the proposal is not relevant anymore

parth126 / IT550