PhotoRAG is a full-stack Next.js image search application that leverages Azure AI services and infrastructure to implement a Retrieval-Augmented Generation (RAG) system for photographs. This project showcases the power of combining vector embeddings, similarity search, and large language models to create an intelligent and efficient image retrieval system.
Data Sources
We seeded the database with a collection of our own photographs. You can view every image in the database by visiting https://rag.photomuse.ai/gallery and clicking the Load Gallery button.
Data Ingestion
When we uploaded the images, we used the Azure AI Computer Vision API.
computervision/imageanalysis:analyze generated a caption and tags for each supplied image.
We then provided the tags and the Computer Vision caption to GPT-4o and prompted it to write a more thorough image description to improve search result accuracy.
The tags were stored as an array in our Azure Database for PostgreSQL instance.
The image description was stored as a string; we then called
computervision/retrieval:vectorizeImage for the image and
computervision/retrieval:vectorizeText for the description, storing both vectors in the same images table in Postgres.
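For reference, here is a minimal sketch of what these Vision calls can look like from a Next.js server route using plain fetch. The environment variable names, helper names, and api-version values are illustrative assumptions, not taken from the PhotoRAG repository; use whatever versions your Azure resource supports.

```ts
// Sketch only: VISION_ENDPOINT / VISION_KEY and the api-version values are assumptions.
const endpoint = process.env.VISION_ENDPOINT!; // e.g. https://<resource>.cognitiveservices.azure.com
const key = process.env.VISION_KEY!;

async function visionPost<T>(path: string, body: unknown): Promise<T> {
  const res = await fetch(`${endpoint}${path}`, {
    method: 'POST',
    headers: { 'Ocp-Apim-Subscription-Key': key, 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Vision API ${path} failed: ${res.status}`);
  return (await res.json()) as T;
}

// Caption + tags from the uploaded image (Image Analysis 4.0).
export async function analyzeImage(imageUrl: string) {
  const result = await visionPost<{
    captionResult: { text: string };
    tagsResult: { values: { name: string }[] };
  }>('/computervision/imageanalysis:analyze?api-version=2023-10-01&features=caption,tags', {
    url: imageUrl, // public blob URL of the uploaded image
  });
  return {
    caption: result.captionResult.text,
    tags: result.tagsResult.values.map((t) => t.name),
  };
}

// Multimodal embedding for the image itself.
export async function vectorizeImage(imageUrl: string) {
  const { vector } = await visionPost<{ vector: number[] }>(
    '/computervision/retrieval:vectorizeImage?api-version=2023-02-01-preview',
    { url: imageUrl },
  );
  return vector;
}

// Multimodal embedding for the GPT-generated description.
export async function vectorizeText(text: string) {
  const { vector } = await visionPost<{ vector: number[] }>(
    '/computervision/retrieval:vectorizeText?api-version=2023-02-01-preview',
    { text },
  );
  return vector;
}
```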
How It Works
Image Upload: When an image is uploaded, it's stored in Azure Blob Storage.
Image Analysis: The image is analyzed using Azure Computer Vision to generate tags and captions.
Description Generation: GPT-4o is used to generate a detailed description based on the tags and caption.
Vector Embedding: The description is converted into a vector embedding using the Azure Computer Vision multimodal embeddings (retrieval) API.
Search: When a user performs a search:
The query is refined using GPT-4 to extract relevant tags and improve the search terms.
The refined query is converted to a vector embedding.
A similarity search is performed using cosine similarity between the query embedding and the stored image embeddings.
Results are ranked based on similarity and tag matches.
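The similarity-search step above can be expressed with Drizzle and pgvector roughly as follows. This is a sketch under assumptions: the db client import path, the table and column names (they match the hypothetical schema sketched under Technology Stack below), and the threshold and limit values are not the project's actual code.

```ts
import { cosineDistance, desc, gt, sql } from 'drizzle-orm';
import { db } from '@/db';            // assumed location of the Drizzle client
import { images } from '@/db/schema'; // assumed schema module

// Rank images by cosine similarity between the refined query's embedding and
// each stored description embedding; the 0.3 threshold and limit are illustrative.
export async function searchImages(queryEmbedding: number[]) {
  const similarity = sql<number>`1 - (${cosineDistance(images.descriptionEmbedding, queryEmbedding)})`;

  return db
    .select({
      id: images.id,
      blobUrl: images.blobUrl,
      description: images.description,
      tags: images.tags,
      similarity,
    })
    .from(images)
    .where(gt(similarity, 0.3))
    .orderBy((t) => desc(t.similarity))
    .limit(20);
}
```

Tag matches from the refined query can then be used to boost or re-rank these results before they are returned to the client.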
Features
Image upload and automatic description generation using Azure Computer Vision and GPT-4
Automatic tagging of images
Vector embedding of image descriptions for efficient similarity search
Natural language querying of the image database
Refined search queries using AI
Confidence scoring and explanations for search results
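One way the "Refined search queries using AI" feature above might look, sketched with the AzureOpenAI client from the openai npm package. The deployment name, api-version, environment variables, and prompt are illustrative assumptions, not the project's actual implementation.

```ts
import { AzureOpenAI } from 'openai';

// Hypothetical query-refinement step: rewrite the user's search text and pull
// out candidate tags before embedding the refined query.
const client = new AzureOpenAI({
  endpoint: process.env.AZURE_OPENAI_ENDPOINT!,
  apiKey: process.env.AZURE_OPENAI_API_KEY!,
  apiVersion: '2024-06-01', // assumed; use the version your deployment supports
  deployment: 'gpt-4o',     // assumed deployment name
});

export async function refineQuery(userQuery: string) {
  const completion = await client.chat.completions.create({
    model: 'gpt-4o',
    response_format: { type: 'json_object' },
    messages: [
      {
        role: 'system',
        content:
          'Rewrite the image search query into a detailed description and extract likely tags. ' +
          'Respond as JSON: {"refinedQuery": string, "tags": string[]}.',
      },
      { role: 'user', content: userQuery },
    ],
  });

  return JSON.parse(completion.choices[0].message.content ?? '{}') as {
    refinedQuery: string;
    tags: string[];
  };
}
```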
Technology Stack
Next.js (App Router)
TypeScript
PostgreSQL with pgvector extension
Drizzle ORM
Azure OpenAI API (for GPT-4)
Azure Computer Vision API (for analysis and embeddings)
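To make the PostgreSQL + pgvector + Drizzle pieces concrete, here is a hypothetical shape for the images table described under Data Ingestion. The column names and the 1024-dimension vector size are assumptions for illustration rather than the repository's actual schema.

```ts
import { index, pgTable, serial, text, timestamp, vector } from 'drizzle-orm/pg-core';

// Hypothetical images table: tags as a text[] column, the GPT-generated
// description as text, and two pgvector columns for the image and
// description embeddings produced during ingestion.
export const images = pgTable(
  'images',
  {
    id: serial('id').primaryKey(),
    blobUrl: text('blob_url').notNull(),        // Azure Blob Storage URL
    caption: text('caption'),                   // Computer Vision caption
    description: text('description').notNull(), // GPT-generated description
    tags: text('tags').array().notNull(),       // Computer Vision tags
    imageEmbedding: vector('image_embedding', { dimensions: 1024 }),
    descriptionEmbedding: vector('description_embedding', { dimensions: 1024 }),
    createdAt: timestamp('created_at').defaultNow().notNull(),
  },
  (table) => ({
    descriptionEmbeddingIdx: index('description_embedding_idx').using(
      'hnsw',
      table.descriptionEmbedding.op('vector_cosine_ops'),
    ),
  }),
);
```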
Project Name
PhotoRAG
Project Repository URL
https://github.com/dubscode/photorag
Deployed Endpoint URL
https://rag.photomuse.ai/
Project Video
https://youtu.be/JvHKW363nwo?si=oLjJcrkQbMuQOVXe
Team Members
dubscode