UMass-Rescue / ImageSearch_CLIP

A system that creates CLIP embedding vectors for a large corpus of images, enabling efficient retrieval of images from natural language text queries.

Research Similarity Metrics for Embedding Comparison #9

Closed sravanigona closed 2 weeks ago

sravanigona commented 2 weeks ago

Description: Research different similarity metrics (e.g., cosine similarity, Euclidean distance) for comparing image and text embeddings. Analyze their strengths, weaknesses, and suitability for the project. Select the most appropriate similarity metric for comparing embeddings in the image search context.

Outcome: A document summarizing the research on similarity metrics, including recommendations for the most appropriate metric to use for comparing embeddings in the project.

sravanigona commented 2 weeks ago

After researching various similarity metrics (including cosine similarity and Euclidean distance), cosine similarity has been determined to be the most appropriate metric for comparing image and text embeddings in this project.

Cosine similarity measures the cosine of the angle between two vectors, making it well suited to high-dimensional embedding spaces where the direction of a vector matters more than its magnitude. It is a natural fit for comparing text and image embeddings produced by the CLIP model, since it is insensitive to differences in vector magnitude and scores only how well the two embeddings align.
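As a rough sketch (not the project's actual implementation), the comparison could look like the Python below. It assumes 512-dimensional embeddings already produced by a CLIP image/text encoder; the random vectors here are placeholders for real embeddings.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder data: one text-query embedding and a corpus of image embeddings
# (e.g., 512-dimensional vectors from a CLIP encoder).
rng = np.random.default_rng(0)
text_embedding = rng.standard_normal(512)
image_embeddings = rng.standard_normal((1000, 512))

# Vectorized ranking of the whole corpus: L2-normalize both sides,
# then a single matrix-vector product gives all cosine scores at once.
image_unit = image_embeddings / np.linalg.norm(image_embeddings, axis=1, keepdims=True)
text_unit = text_embedding / np.linalg.norm(text_embedding)
scores = image_unit @ text_unit            # shape (1000,), each value in [-1, 1]
top_k = np.argsort(scores)[::-1][:5]       # indices of the 5 best-matching images
```

Because only vector direction matters, normalizing the embeddings once up front lets cosine similarity reduce to a plain dot product, which keeps ranking a large corpus cheap.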

For a detailed analysis and comparison of the metrics, please refer to the research document.