parth126 / IT550

Project Proposals for the IT-550 Course (Autumn 2024)
0 stars 0 forks source link

Faster Maximum Inner Product Search in High Dimensions [Reproducibility] #12

Open ExhoParth opened 2 months ago

ExhoParth commented 2 months ago

Title

Faster Maximum Inner Product Search in High Dimensions [Reproducibility]

Team Name

VectorForge

Email

202311033@daiict.ac.in

Team Member 1 Name

Parth Borad

Team Member 1 Id

202311033

Team Member 2 Name

Sarthak Vadher

Team Member 2 Id

202311040

Team Member 3 Name

Dhairya Dave

Team Member 3 Id

202311011

Team Member 4 Name

Viraj Prajapati

Team Member 4 Id

202311069

Category

Reproducibility

Problem Statement

In the era of large-scale recommendation systems, document retrieval, and question-answering, efficient and accurate search techniques are essential. The Maximum Inner Product Search (MIPS) problem, a key operation in vector database systems, is used to find the item that maximizes the inner product with a given query. This operation is computationally expensive in high-dimensional vector spaces, which necessitates more efficient methods.

In 2024, a group of researchers presented a novel approach for 'Faster Maximum Inner Product Search in High Dimensions' at the International Conference on Machine Learning (ICML). Their work introduces a new framework to reduce computational cost and improve search speed. This project aims to reproduce the methodology outlined in the paper and implement our own version to gain deeper insights, which will ultimately help in designing efficient document retrieval systems from vector database.

Evaluation Strategy

The evaluation focuses on the trade-off between accuracy and speed, comparing the proposed algorithm against eight baseline MIPS algorithms: LSH-MIPS, H2-ALSH-MIPS, NEQ-MIPS, PCA-MIPS, BoundedME, Greedy-MIPS, HNSW-MIPS, and NAPG-MIPS. Additionally, the sample complexity of the algorithms is analyzed.

Dataset

Resources

Paper: Faster Maximum Inner Product Search in High Dimensions ICML 2024 MIPS Wikipedia MIPS Introduction Blog Demonstrating the Application of MIPS Y Combinator Hacker News Forum Understanding the approximate nearest neighbor (ANN) algorithm Faiss: A library for efficient similarity search

parth126 commented 2 months ago
parth126 commented 2 months ago

The proposal is good enough for acceptance.