run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: How to put BGEM3Index embeddings to mongo atlas vector store #15871

Open takan55 opened 1 week ago

takan55 commented 1 week ago

Question

How can I store BGEM3Index embeddings in a MongoDB Atlas vector store? If possible, code samples would be very much appreciated. Thanks.

dosubot[bot] commented 1 week ago

Hey @takan55! 👋 I'm here to help you with any coding conundrums you may have. Let's tackle this challenge together!

To store BGEM3Index embeddings in a MongoDB Atlas vector store, you can follow these steps. Below is a code sample that demonstrates how to achieve this:

  1. Install Required Libraries: Ensure you have the necessary libraries installed:

    pip install pymongo
    pip install llama-index
    pip install llama-index-vector-stores-mongodb
  2. Connect to MongoDB Atlas: Use pymongo to connect to your MongoDB Atlas cluster:

    import pymongo
    from pymongo.errors import ConnectionFailure

    def get_mongo_client(mongo_uri):
        """Connect to MongoDB and verify that the server is reachable."""
        client = pymongo.MongoClient(mongo_uri)
        try:
            # MongoClient connects lazily, so ping to force a round trip
            client.admin.command("ping")
        except ConnectionFailure as e:
            print(f"Connection failed: {e}")
            raise  # Do not continue with an unusable client
        print("Connection to MongoDB successful")
        return client

    mongo_uri = "your_mongodb_uri_here"
    mongo_client = get_mongo_client(mongo_uri)

    DB_NAME = "your_database_name"
    COLLECTION_NAME = "your_collection_name"

    db = mongo_client[DB_NAME]
    collection = db[COLLECTION_NAME]
  3. Store BGEM3Index Embeddings: Assuming you have each embedding as a list of floating-point numbers, you can insert it directly with pymongo:

    # Example embedding
    embedding = [
       0.017273761332035065, 0.0003562009078450501, 0.008221239782869816,
       -0.017780985683202744, -0.04745353385806084, -0.008362134918570518,
       # ... (rest of the embedding values)
    ]
    
    # Document to insert
    document = {
       "document_id": "unique_document_id",
       "embedding": embedding,
       "metadata": {
           "title": "Example Document",
           "content": "This is an example document."
       }
    }
    
    # Insert the document into the collection
    collection.insert_one(document)
    print("Document inserted successfully")
  4. Querying the Vector Store: To retrieve similar embeddings, run a $vectorSearch aggregation stage. This requires an Atlas Vector Search index on the "embedding" field; the index name below is a placeholder:

    query_embedding = [
       0.017273761332035065, 0.0003562009078450501, 0.008221239782869816,
       # ... (rest of the query embedding values)
    ]

    # Example query to find similar embeddings
    results = collection.aggregate([
        {
            "$vectorSearch": {
                "index": "your_vector_index_name",
                "path": "embedding",
                "queryVector": query_embedding,
                "numCandidates": 100,  # Candidates considered before ranking
                "limit": 5,  # Number of results returned
            }
        }
    ])

    for result in results:
       print(result)
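
Similarity search on Atlas is usually expressed as a `$vectorSearch` aggregation stage backed by a vector search index. The following is a minimal sketch of two pure-Python helpers that build the index definition and the query pipeline. The helper names, the index name `vector_index`, and the projected fields are hypothetical, and the 1024-dimension default assumes the dense BGE-M3 vectors; adjust these to your setup.

```python
from typing import Any, Sequence

# Assumption: dense BGE-M3 vectors have 1024 dimensions.
BGE_M3_DENSE_DIM = 1024


def build_index_definition(
    path: str = "embedding",
    num_dimensions: int = BGE_M3_DENSE_DIM,
    similarity: str = "cosine",
) -> dict[str, Any]:
    """Return an Atlas Vector Search index definition for one vector field."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
            }
        ]
    }


def build_vector_search_pipeline(
    query_vector: Sequence[float],
    index_name: str = "vector_index",  # hypothetical index name
    path: str = "embedding",
    limit: int = 5,
    num_candidates: int = 100,
) -> list[dict[str, Any]]:
    """Return an aggregation pipeline that runs $vectorSearch and adds a score."""
    if len(query_vector) == 0:
        raise ValueError("query_vector must not be empty")
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": [float(x) for x in query_vector],
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            # Keep the stored fields from step 3 and expose the similarity score.
            "$project": {
                "_id": 0,
                "document_id": 1,
                "metadata": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

The pipeline can then be passed to `collection.aggregate(...)`. The index itself has to exist before the query returns results; it can be created from the definition above in the Atlas UI or, with a recent pymongo, through a search index model of type `vectorSearch`.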

This code demonstrates how to connect to MongoDB Atlas, store BGEM3Index embeddings, and query the vector store for similar embeddings. Make sure to replace placeholders like "your_mongodb_uri_here", "your_database_name", and "your_collection_name" with your actual MongoDB Atlas details [1][2].
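
When inserting many embeddings at once, it can help to shape the documents first. Embedding models often return NumPy arrays, and pymongo cannot encode NumPy scalar types, so the sketch below converts every value to a plain Python float. The helper name `to_mongo_documents` is hypothetical, and the document layout simply mirrors the example document above.

```python
from typing import Any, Iterable, Sequence


def to_mongo_documents(
    doc_ids: Sequence[str],
    texts: Sequence[str],
    embeddings: Sequence[Iterable[Any]],
) -> list[dict[str, Any]]:
    """Pair ids, texts, and vectors into BSON-friendly documents."""
    if not (len(doc_ids) == len(texts) == len(embeddings)):
        raise ValueError("doc_ids, texts, and embeddings must have equal length")
    documents = []
    for doc_id, text, vector in zip(doc_ids, texts, embeddings):
        documents.append(
            {
                "document_id": doc_id,
                # float() turns NumPy scalars (or ints) into plain Python floats
                "embedding": [float(x) for x in vector],
                "metadata": {"content": text},
            }
        )
    return documents
```

The resulting list can be written in a single call with `collection.insert_many(documents)`.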
