Closed prachii-15 closed 1 month ago
10% Penalty for late submission.
Proposed project: UI -> Image upload -> OCR -> Search on noisy text. UI + OCR is not a contribution to an IR project. Retrieval based on noisy OCR text is a valid problem, assuming there is a dataset available to experiment with this.
The goal of our project is to create an image retrieval system using deep learning (CNN & LSTM), focusing on two primary input methods: natural language descriptions and image queries. The system allows users to search for relevant images either by typing a description or by uploading an image.
For this we use the Flickr8K dataset, which contains images and associated captions (https://www.kaggle.com/datasets/adityajn105/flickr8k).
BLEU for evaluating retrieved images makes no sense. Seems like a blindly copy-pasted proposal.
Evaluation
Mean Reciprocal Rank (MRR): Averages, over all queries, the reciprocal of the rank at which the first relevant image appears in the retrieved results.
Mean Average Precision (MAP): A comprehensive metric that takes into account the ranking of all relevant images across the entire retrieved set.
Normalized Discounted Cumulative Gain (NDCG): Evaluates the ranking quality of the retrieved images. It gives higher scores to relevant images that appear earlier in the ranked list and discounts relevant images that appear later.
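As a rough illustration of how these three metrics could be computed, here is a minimal Python sketch. It assumes each query comes with a ranked list of relevance labels (binary 0/1 for MRR and MAP, graded gains for NDCG); how such labels would be derived from the dataset is an open assumption and is not something this proposal specifies. The function names and the toy runs are hypothetical, and this NDCG variant normalizes against the ideal reordering of the retrieved list only.

```python
import math

def reciprocal_rank(relevances):
    """Reciprocal of the rank of the first relevant item (0.0 if none)."""
    for rank, rel in enumerate(relevances, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def average_precision(relevances):
    """Mean of precision@k taken at each rank k that holds a relevant item.

    Assumption: every relevant item appears somewhere in the ranked list,
    so the number of hits is used as the denominator.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0

# Hypothetical relevance labels for two queries, in ranked order.
toy_runs = [[0, 1, 0, 1], [1, 0, 0, 0]]
mrr = mean(reciprocal_rank(r) for r in toy_runs)
map_score = mean(average_precision(r) for r in toy_runs)
mean_ndcg = mean(ndcg(r) for r in toy_runs)
```

With binary labels NDCG reduces to a position-discounted hit measure; graded labels would be needed for it to add information beyond MAP.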
@Satyammaravaniya Can you explain further how MAP and NDCG will be calculated? Does the original dataset have relevance judgements? Are these relevance judgements binary or ranked? Please provide a link to the dataset having such relevance judgements.
Sir, we have changed our project, so we will upload the new project on GitHub.
Closing this as the proposal is not relevant anymore
Title
Text & Content Based Image Retrieval
Team Name
DPRS
Email
202318008@daiict.ac.in
Team Member 1 Name
Prachi Mehta
Team Member 1 Id
202318008
Team Member 2 Name
Dhruvi Mehta
Team Member 2 Id
202318003
Team Member 3 Name
Riya Dave
Team Member 3 Id
202318011
Team Member 4 Name
Satyam Maravaniya
Team Member 4 Id
202318026
Category
New Research Problem
Problem Statement
The goal of this project is to create an image retrieval system using deep learning, with CNNs for visual feature extraction and LSTMs for caption generation. The system will allow users to search for images either by entering a natural language description or by uploading an image to find similar results. Using the Flickr8K dataset, the model aims to improve retrieval accuracy by matching the semantic content of text and image queries against the images in the dataset. The system will have a Graphical User Interface (GUI) as well as a command-line interface, making it accessible to a wide variety of users. By combining classical image retrieval methods with deep learning models, this project offers a user-friendly and efficient solution for searching images using text or visual content.
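The matching step could be sketched as follows, under the assumption that every image and every query (text or image) has already been mapped to a fixed-length feature vector in a shared space. The proposal does not say how those vectors are produced or compared, so the cosine similarity used here, the function names, and the toy vectors are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_gallery(query_vec, gallery, top_k=5):
    """Return the top_k (image_id, score) pairs, most similar first.

    gallery: dict mapping an image id to its feature vector.
    """
    scored = [(image_id, cosine(query_vec, vec)) for image_id, vec in gallery.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy two-dimensional features; real features would come from the trained model.
gallery = {"img_a": [1.0, 0.0], "img_b": [0.0, 1.0], "img_c": [1.0, 1.0]}
results = rank_gallery([1.0, 0.1], gallery, top_k=2)
```

The same ranking function would serve both input modes, since only the way the query vector is obtained differs.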
Evaluation Strategy
We will use BLEU (Bilingual Evaluation Understudy) scores, a common metric for evaluating the quality of text generation tasks such as machine translation or image captioning.
Dataset
https://www.kaggle.com/datasets/adityajn105/flickr8k
Resources
Paper Title: Text-based, Content-based, and Semantic-based Image Retrievals: A Survey, Paper Link: https://www.researchgate.net/publication/273258916_Text-based_Content-based_and_Semantic-based_Image_Retrievals_A_Survey
Paper Title: Content-Based Image Retrieval Research, Paper Link: https://www.researchgate.net/publication/257706512_Content-Based_Image_Retrieval_Research