microsoft / RAG_Hack

Hack Together: RAG Hack | Register, Learn, Hack
MIT License

Project: PhotoRAG - Image search application #107

Open dubscode opened 2 months ago

dubscode commented 2 months ago

Project Name

PhotoRAG

Description

PhotoRAG is a full-stack Next.js image search application that uses Azure AI and Azure infrastructure to implement a Retrieval-Augmented Generation (RAG) system for photographs. It combines vector embeddings, similarity search, and large language models so that images can be retrieved by the meaning of a query rather than by exact keyword matches.

Data Sources

We seeded the database with a collection of our own photographs. To view all of the images in the database, go to https://rag.photomuse.ai/gallery and click the Load Gallery button.

Data Ingestion

When we uploaded the images, we used the Azure AI Computer Vision API.

computervision/imageanalysis:analyze: Used to generate a caption and tags from the supplied image.
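As a rough illustration of this step, the sketch below builds a request for the analyze endpoint and pulls the caption and tags out of a response. It is not taken from the PhotoRAG repository: the function names, the `api-version` value, and the `minConfidence` filter are assumptions made for the example, and the response shape covers only the fields the sketch reads.

```typescript
// Fields this sketch reads from an Image Analysis response.
interface AnalyzeResponse {
  captionResult?: { text: string; confidence: number };
  tagsResult?: { values: { name: string; confidence: number }[] };
}

// Build the REST request for the analyze endpoint. Nothing is sent here;
// the result can be passed to fetch(url, init).
function buildAnalyzeRequest(endpoint: string, key: string, imageUrl: string) {
  const base = endpoint.replace(/\/+$/, ""); // drop trailing slashes
  // Assumption: the post does not say which api-version was used.
  const query = "api-version=2023-10-01&features=caption,tags";
  return {
    url: `${base}/computervision/imageanalysis:analyze?${query}`,
    init: {
      method: "POST",
      headers: {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ url: imageUrl }),
    },
  };
}

// Extract the caption text and the tag names from a response.
// minConfidence is a hypothetical filter, not something the post describes.
function extractCaptionAndTags(res: AnalyzeResponse, minConfidence = 0.5) {
  const caption = res.captionResult ? res.captionResult.text : "";
  const values = res.tagsResult ? res.tagsResult.values : [];
  const tags = values
    .filter((t) => t.confidence >= minConfidence)
    .map((t) => t.name);
  return { caption, tags };
}
```

Keeping request construction separate from response parsing makes both halves easy to exercise without calling the service.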

We then passed the tags and the Computer Vision caption to GPT-4o and prompted it to write a more thorough image description, with the goal of improving search result accuracy.
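The prompt itself is not shown in this post. As a purely hypothetical sketch, the chat messages for such a request might be assembled like this; the wording of both messages and the `buildDescriptionMessages` name are invented for illustration.

```typescript
// Message shape used by chat-completion style APIs.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Combine the Computer Vision caption and tags into a prompt asking for a
// richer description. The prompt text is an assumption, not PhotoRAG's own.
function buildDescriptionMessages(caption: string, tags: string[]): ChatMessage[] {
  return [
    {
      role: "system",
      content:
        "You write detailed, factual descriptions of photographs for a search index. " +
        "Use only the details provided.",
    },
    {
      role: "user",
      content:
        `Caption: ${caption}\n` +
        `Tags: ${tags.join(", ")}\n` +
        "Write a thorough description of this photograph.",
    },
  ];
}
```

The returned array would be sent as the `messages` field of a chat completion request to the deployed model.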

The tags were stored as an array in our Azure PostgreSQL database.

The image description was stored as a string, and we then used

How It Works

  1. Image Upload: When an image is uploaded, it's stored in Azure Blob Storage.
  2. Image Analysis: The image is analyzed using Azure Computer Vision to generate tags and captions.
  3. Description Generation: GPT-4 is used to generate a detailed description based on the tags and captions.
  4. Vector Embedding: The description is converted into a vector embedding using Azure OpenAI.
  5. Search: When a user performs a search:
    • The query is refined using GPT-4 to extract relevant tags and improve the search terms.
    • The refined query is converted to a vector embedding.
    • A similarity search is performed using cosine similarity between the query embedding and the stored image embeddings.
    • Results are ranked based on similarity and tag matches.
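The ranking in step 5 can be sketched as follows. This is a minimal in-memory illustration, not the project's implementation: the post does not say where the similarity is computed or how tag matches are weighted, so the `IndexedImage` shape, the `rankImages` name, and the additive `tagBoost` weight are all assumptions.

```typescript
// Hypothetical record for an indexed image.
interface IndexedImage {
  id: string;
  tags: string[];
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score each image by similarity plus a small bonus per matching tag
// (tagBoost is an assumed weight), then sort best-first.
function rankImages(
  queryEmbedding: number[],
  queryTags: string[],
  images: IndexedImage[],
  tagBoost = 0.05
) {
  const wanted = queryTags.map((t) => t.toLowerCase());
  return images
    .map((img) => {
      const similarity = cosineSimilarity(queryEmbedding, img.embedding);
      const tagMatches = img.tags.filter(
        (t) => wanted.indexOf(t.toLowerCase()) !== -1
      ).length;
      return { id: img.id, similarity, tagMatches, score: similarity + tagBoost * tagMatches };
    })
    .sort((x, y) => y.score - x.score);
}
```

With a small additive boost, tag matches mainly break ties between images whose descriptions are similarly close to the query; a larger weight would let tags override the embedding similarity.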

Features

Technology Stack

Technology & Languages

Project Repository URL

https://github.com/dubscode/photorag

Deployed Endpoint URL

https://rag.photomuse.ai/

Project Video

https://youtu.be/JvHKW363nwo?si=oLjJcrkQbMuQOVXe

Team Members

dubscode

multispark commented 1 month ago

Hello @dubscode, thank you for participating in RAG Hack!

The team is working hard to distribute badges. Please have each team member fill out this form: aka.ms/raghack/badge-dist

Thank you!

dubscode commented 1 month ago

Thank you very much!