run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: AzureAiSearch and Text2SQL vector index use #11918

Closed PrasenjitGiri closed 4 months ago

PrasenjitGiri commented 8 months ago

Question

[Question]: I've not found a way to use AzureAISearch with Text2SQL. All the examples only show an in-memory index, but I want to save my index to Azure AI Search and use that in Text2SQL.

dosubot[bot] commented 8 months ago

Hey @PrasenjitGiri, great to see you diving into another challenge! Always a pleasure to have you around.

To use AzureAiSearch with Text2SQL and save your index to AzureAI, you need to follow these steps:

  1. Initialization: Instantiate the AzureAISearchVectorStore class with the necessary parameters, including the Azure Search client (search_or_index_client), field keys (id_field_key, chunk_field_key, embedding_field_key, metadata_string_field_key, doc_id_field_key), and the index_name. Ensure that the index_management parameter is set to IndexManagement.CREATE_IF_NOT_EXISTS if you want the index to be created if it does not already exist.

  2. Index Creation: If the index does not exist and index_management is set to create it, define your index schema by specifying the fields and their types. This is implicitly handled in the provided code when calling self._index_client.create_index(index) within the initialization process if the index does not exist.

  3. Document Preparation: Each node you index (which should contain the content, embedding, and any metadata) is converted by the store's internal _create_index_document method into a dictionary formatted for Azure Search, including serialized metadata. Because the method is private, you normally do not call it yourself.

  4. Batch Upload: Azure Search supports batch operations for efficiency. The store accumulates the prepared documents and uploads them through the merge_or_upload_documents method of the Azure Search client (self._search_client). The provided code handles batches of 10 documents automatically and uploads any remaining documents after all nodes have been processed.

  5. Querying: To query the index, use the query method of the AzureAISearchVectorStore class. This method allows for different types of queries (e.g., sparse, hybrid, semantic hybrid) based on the query mode and supports filtering using OData syntax.

  6. Deletion: To delete documents from the index, use the delete method, specifying the ref_doc_id of the documents you want to remove.
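Steps 3 and 4 can be illustrated without any Azure dependency. The following is a minimal sketch, not the library's actual implementation: the function names, the node dictionary layout, and the `upload` callable are all hypothetical, and the field names simply mirror the keys listed in step 1.

```python
import json
from typing import Any, Callable, Dict, Iterable, List

BATCH_SIZE = 10  # batch size mentioned in step 4


def create_index_document(node: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for the store's document preparation step."""
    return {
        "id": node["id"],
        "chunk": node["text"],
        "embedding": node["embedding"],
        # Metadata is serialized to a JSON string, as described in step 3.
        "metadata": json.dumps(node.get("metadata", {})),
        "doc_id": node["ref_doc_id"],
    }


def upload_in_batches(
    nodes: Iterable[Dict[str, Any]],
    upload: Callable[[List[Dict[str, Any]]], Any],
    batch_size: int = BATCH_SIZE,
) -> int:
    """Prepare each node and hand the documents to `upload` in batches.

    Returns the number of documents passed to `upload`.
    """
    batch: List[Dict[str, Any]] = []
    sent = 0
    for node in nodes:
        batch.append(create_index_document(node))
        if len(batch) >= batch_size:
            upload(batch)
            sent += len(batch)
            batch = []
    # Flush whatever is left after the last full batch.
    if batch:
        upload(batch)
        sent += len(batch)
    return sent
```

With a real azure.search.documents SearchClient, the `upload` argument could be something like `lambda docs: search_client.merge_or_upload_documents(documents=docs)`; in normal use the vector store performs this work for you.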

Here's a simplified example of how you might use the AzureAISearchVectorStore class to index and query documents:

# Assumes llama-index >= 0.10 with the llama-index-vector-stores-azureaisearch package installed
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from llama_index.vector_stores.azureaisearch import (
    AzureAISearchVectorStore,
    IndexManagement,
)

# Initialize the Azure Search index client.
# SearchIndexClient is scoped to the service, so it takes no index_name;
# creating an index requires an admin key.
search_index_client = SearchIndexClient(
    endpoint="https://<your-service-name>.search.windows.net",
    credential=AzureKeyCredential("<your-admin-key>"),
)

# Instantiate the AzureAISearchVectorStore
vector_store = AzureAISearchVectorStore(
    search_or_index_client=search_index_client,
    id_field_key="id",
    chunk_field_key="chunk",
    embedding_field_key="embedding",
    metadata_string_field_key="metadata",
    doc_id_field_key="doc_id",
    index_name="<your-index-name>",
    index_management=IndexManagement.CREATE_IF_NOT_EXISTS,
)

# Prepare and upload nodes (not shown: building nodes with embeddings).
# The store prepares and batches the documents internally.
# vector_store.add(nodes)

# Query the index (your_query is a VectorStoreQuery)
# results = vector_store.query(your_query)

This example is a high-level overview. You'll need to adapt it to your specific requirements, including setting up the Azure Search service, defining the index schema, preparing your documents, and handling responses from queries.

You can find more details in the LlamaIndex repository.

PrasenjitGiri commented 8 months ago

This is not helping. What I'm looking for is how to use AzureAISearch with Text2SQL; the given examples do not allow me to plug in anything other than the in-memory DB.
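
One possible way to wire this up, offered as a sketch under assumptions rather than a confirmed solution: the table-retrieval flavor of Text2SQL builds an ObjectIndex over table schemas, and extra keyword arguments to ObjectIndex.from_objects are forwarded to the underlying index class, so a storage_context backed by an AzureAISearchVectorStore can replace the in-memory store. The helper name build_text2sql_engine and its parameters are hypothetical; the sketch assumes llama-index >= 0.10, an already constructed SQLDatabase and vector store, and a configured embedding model.

```python
from typing import Any, Sequence


def build_text2sql_engine(
    sql_database: Any, table_names: Sequence[str], vector_store: Any
) -> Any:
    """Hypothetical helper: Text2SQL query engine whose table index lives in `vector_store`."""
    # Imports are local so this snippet loads even where llama-index is not installed.
    from llama_index.core import StorageContext, VectorStoreIndex
    from llama_index.core.indices.struct_store.sql_query import (
        SQLTableRetrieverQueryEngine,
    )
    from llama_index.core.objects import (
        ObjectIndex,
        SQLTableNodeMapping,
        SQLTableSchema,
    )

    # Route index storage to the external vector store instead of memory.
    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    table_node_mapping = SQLTableNodeMapping(sql_database)
    table_schema_objs = [SQLTableSchema(table_name=name) for name in table_names]

    # Keyword arguments beyond the known ones are passed on to VectorStoreIndex.
    obj_index = ObjectIndex.from_objects(
        table_schema_objs,
        object_mapping=table_node_mapping,
        index_cls=VectorStoreIndex,
        storage_context=storage_context,
    )

    return SQLTableRetrieverQueryEngine(
        sql_database, obj_index.as_retriever(similarity_top_k=1)
    )
```

Calling build_text2sql_engine(sql_database, ["city_stats"], vector_store).query("...") would then embed the table schemas, store them in the Azure AI Search index, and retrieve the relevant table at query time; the table name here is only a placeholder.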