run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: How to add new SQLTableSchema to an existing ChromaDB embedding? #13855

Open Chakradhar448 opened 1 month ago

Chakradhar448 commented 1 month ago

Question

Below is the code used to create the embedding:

"""
from llama_index.core import SQLDatabase
from llama_index.core.objects import (
    SQLTableNodeMapping,
    ObjectIndex,
    SQLTableSchema,
)
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext, VectorStoreIndex

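sql_database = SQLDatabase(engine)
# Mapping used below to turn each SQLTableSchema into a node
table_node_mapping = SQLTableNodeMapping(sql_database)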

table_schema_objs = [
    SQLTableSchema(table_name=t.table_name, context_str=t.table_summary)
    for t in table_infos
] 

print("Creating chromadb", chroma_path)
# create and save chroma
db = chromadb.PersistentClient(path=chroma_path)

chroma_collection = db.get_or_create_collection("table_schema")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
object_index = ObjectIndex.from_objects(
    table_schema_objs,
    table_node_mapping,
    storage_context=storage_context)

obj_retriever = object_index.as_retriever(similarity_top_k=3)

"""

Once this embedding has been created, how can we add more table schemas to it? Please help.

dosubot[bot] commented 1 month ago

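To add a new SQLTableSchema to an existing ChromaDB embedding, convert it to a node with your SQLTableNodeMapping and insert that node through a VectorStoreIndex built on the same ChromaVectorStore. Passing the schema straight to the add method of the ChromaVectorStore class will not work, because add expects nodes that already carry embeddings. Here is an example based on your provided code:

from llama_index.core import VectorStoreIndex
from llama_index.core.objects import SQLTableSchema
from llama_index.vector_stores.chroma import ChromaVectorStore

# Assuming chroma_collection and table_node_mapping are the objects from your snippet
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
# Uses the configured embedding model; it should match the one used originally
index = VectorStoreIndex.from_vector_store(vector_store)

# Define your new SQLTableSchema. It takes a table name and an optional context string;
# the table must exist in the database behind sql_database, which supplies the column details
new_sql_table_schema = SQLTableSchema(
    table_name="new_table",
    context_str="Short summary of what new_table contains",
)

# Convert the schema to a node, then embed it and add it to the existing collection
new_node = table_node_mapping.to_node(new_sql_table_schema)
index.insert_nodes([new_node])

This code snippet demonstrates how to create a new SQLTableSchema and add it to an existing ChromaDB embedding by inserting its node through an index on the same vector store [1].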
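
When schemas are added incrementally, it helps to be able to find the same entry again later. The following is a minimal sketch, not part of LlamaIndex or Chroma, of deriving a deterministic ID from the table name. The helper name `stable_node_id` and the namespace value are assumptions made up for illustration.

```python
import uuid

# Hypothetical namespace for this application. Any fixed UUID works,
# as long as it never changes between runs.
TABLE_SCHEMA_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "table_schema")


def stable_node_id(table_name: str) -> str:
    """Return the same ID string every time for a given table name."""
    return str(uuid.uuid5(TABLE_SCHEMA_NAMESPACE, table_name))


if __name__ == "__main__":
    # Running this twice prints the same value both times.
    print(stable_node_id("new_table"))
```

An ID like this could be assigned to a node before it is inserted, so that a later update can target the same entry. Whether that fits depends on how your index assigns and uses IDs.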

To continue talking to Dosu, mention @dosu.

Chakradhar448 commented 1 month ago

Hi, thanks for the response.

Could you also help with the case where only a column changes in an existing table? How do I update the existing embedding for that table schema?

dosubot[bot] commented 1 month ago

To update the existing embedding if there is a change in just the column of an existing SQLTableSchema, you would need to regenerate the TextNode for the updated schema and then update the embedding in the ChromaVectorStore. Here is how you can do it:

  1. Regenerate the TextNode for the updated schema:

    from llama_index.core.objects import SQLTableNodeMapping, SQLTableSchema
    
    # Assuming you have an instance of SQLTableNodeMapping and an updated SQLTableSchema
    sql_table_node_mapping = SQLTableNodeMapping(sql_database)
    updated_schema = SQLTableSchema(table_name="your_table_name", context_str="new_context")
    
    # Generate the updated TextNode
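    updated_node = sql_table_node_mapping.to_node(updated_schema)

  2. Update the embedding in the ChromaVectorStore:

    from llama_index.core import VectorStoreIndex
    from llama_index.vector_stores.chroma import ChromaVectorStore
    
    # Assuming you have an existing ChromaVectorStore instance
    vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
    
    # Step 1: Delete the existing node. The regenerated node gets a new node_id, so match
    # on the table name instead (assuming it is stored under the "name" metadata key)
    chroma_collection.delete(where={"name": "your_table_name"})
    
    # Step 2: Embed and add the updated node through an index on the same store
    index = VectorStoreIndex.from_vector_store(vector_store)
    index.insert_nodes([updated_node])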

This approach ensures that the old embedding is removed and the new embedding is added to the collection, reflecting the updated schema changes [1][2].
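
The delete-then-add pattern described above can be modeled with a small in-memory stand-in. This sketch is not the Chroma or LlamaIndex API. `Entry`, `InMemoryStore`, and `replace_table_entry` are hypothetical names, and the `"name"` metadata key is an assumption. It shows why the delete should match on a stable attribute such as the table name: each regenerated entry gets a fresh random ID.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Stand-in for a stored node: text plus flat metadata."""
    text: str
    metadata: dict
    # A new random ID is generated for every entry, as with regenerated nodes.
    entry_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class InMemoryStore:
    """Toy store keyed by entry ID (illustration only)."""

    def __init__(self) -> None:
        self.entries = {}  # entry_id -> Entry

    def add(self, entry: Entry) -> None:
        self.entries[entry.entry_id] = entry

    def delete_where(self, key: str, value: str) -> int:
        """Delete every entry whose metadata[key] equals value; return the count."""
        stale = [i for i, e in self.entries.items() if e.metadata.get(key) == value]
        for i in stale:
            del self.entries[i]
        return len(stale)


def replace_table_entry(store: InMemoryStore, table_name: str, text: str) -> None:
    """Delete-then-add, matching on the table name rather than the entry ID."""
    store.delete_where("name", table_name)
    store.add(Entry(text=text, metadata={"name": table_name}))


store = InMemoryStore()
replace_table_entry(store, "orders", "Schema of table orders: id, total")
# A column was added, so the entry is regenerated and replaced.
replace_table_entry(store, "orders", "Schema of table orders: id, total, currency")
```

After both calls the store holds a single entry for `orders`, carrying the updated text. Deleting by the new entry's own ID would have matched nothing and left the stale entry in place.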
