Closed: mballav closed this issue 3 months ago
To use the `gpt-4o` model for Retrieval-Augmented Generation (RAG) on PDF/txt documents, you need to use the `AzureOpenAIMultiModal` class. The `AzureOpenAI` class does not support the `gpt-4o` model, which is why you are encountering the error.

Here is an example of how to configure and use the `AzureOpenAIMultiModal` class for your task:
**1. Set Environment Variables**: Ensure the following environment variables are set:

- `OPENAI_API_VERSION`: `2023-05-15`
- `AZURE_OPENAI_ENDPOINT`: your Azure endpoint URL, e.g., `https://YOUR_RESOURCE_NAME.openai.azure.com/`
- `AZURE_OPENAI_API_KEY`: your Azure API key
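If you prefer to set these from Python rather than your shell, a minimal sketch (all values are placeholders):

```python
import os

# Placeholders: substitute your real API version, resource name, and key.
os.environ["OPENAI_API_VERSION"] = "2023-05-15"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://YOUR_RESOURCE_NAME.openai.azure.com/"
os.environ["AZURE_OPENAI_API_KEY"] = "your_api_key"
```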
**2. Initialize the `AzureOpenAIMultiModal` Class**: Use the following code to initialize the `AzureOpenAIMultiModal` class:
```python
from llama_index.legacy.multi_modal_llms.azure_openai import AzureOpenAIMultiModal

azure_openai_multi_modal = AzureOpenAIMultiModal(
    model="gpt-4o",
    engine="your_deployment_name",  # Replace with your deployment name
    azure_endpoint="https://YOUR_RESOURCE_NAME.openai.azure.com/",  # Replace with your endpoint URL
    api_key="your_api_key",  # Replace with your API key
    api_version="2023-05-15",
)
```
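As a quick sanity check that the client is wired up (a sketch: it assumes your `gpt-4o` deployment is live, and the empty image list makes this a plain text completion):

```python
# Text-only completion through the multi-modal client; the prompt is arbitrary.
resp = azure_openai_multi_modal.complete(
    prompt="Say hello.",
    image_documents=[],
)
print(resp.text)
```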
**3. Build and Query Multi-Modal Vector Stores**: Use the `MultiModalVectorStoreIndex` and `SimpleMultiModalQueryEngine` to build and query multi-modal vector stores. Here is an example:
```python
from llama_index.core.indices import MultiModalVectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.core import SimpleDirectoryReader, StorageContext
import qdrant_client

# Create a local Qdrant vector store with separate text and image collections
client = qdrant_client.QdrantClient(path="qdrant_mm_db")
text_store = QdrantVectorStore(client=client, collection_name="text_collection")
image_store = QdrantVectorStore(client=client, collection_name="image_collection")
storage_context = StorageContext.from_defaults(
    vector_store=text_store, image_store=image_store
)

# Load text and image documents from a local folder
documents = SimpleDirectoryReader("./data_folder/").load_data()

# Create the MultiModal index
index = MultiModalVectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

# Use the MultiModal retriever and query engine
from llama_index.core import PromptTemplate
from llama_index.core.query_engine import SimpleMultiModalQueryEngine

query_str = "Tell me more about the Porsche"

retriever_engine = index.as_retriever(similarity_top_k=3, image_similarity_top_k=3)
retrieval_results = retriever_engine.retrieve(query_str)  # retrieve on the query string, not the response

qa_tmpl_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)
qa_tmpl = PromptTemplate(qa_tmpl_str)

query_engine = index.as_query_engine(
    multi_modal_llm=azure_openai_multi_modal, text_qa_template=qa_tmpl
)
response = query_engine.query(query_str)
```
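To inspect what came back, you can print the answer and the retrieved sources (a small sketch; `file_name` is the metadata key `SimpleDirectoryReader` typically attaches, so treat it as an assumption):

```python
# Print the generated answer, then each supporting chunk's file and score.
print(str(response))
for n in response.source_nodes:
    print(n.node.metadata.get("file_name"), n.score)
```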
This setup will allow you to use the `gpt-4o` model for RAG on PDF/txt documents with Azure OpenAI [1][2][3][4].
@dosu If I were to create a `MultiModalVectorStoreIndex` from nodes, how would I do that?
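(For reference, a sketch the thread doesn't confirm: `MultiModalVectorStoreIndex` also accepts nodes directly, so assuming `nodes` comes from a node parser, the from-nodes path would look roughly like this:)

```python
from llama_index.core.indices import MultiModalVectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

# Hypothetical: split the previously loaded documents into nodes, then
# build the index from those nodes instead of calling from_documents().
nodes = SentenceSplitter().get_nodes_from_documents(documents)
index = MultiModalVectorStoreIndex(nodes=nodes, storage_context=storage_context)
```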
@mballav it's supported on the LLM, but you probably need to update:

```
pip install -U llama-index-llms-azure-openai llama-index-llms-openai
```
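(Put differently, after upgrading, the plain `AzureOpenAI` class should accept `gpt-4o` as well; a sketch with placeholder credentials, not a confirmed snippet from the thread:)

```python
from llama_index.llms.azure_openai import AzureOpenAI

# Placeholders: substitute your deployment name, endpoint URL, and key.
llm = AzureOpenAI(
    model="gpt-4o",
    engine="your_deployment_name",
    azure_endpoint="https://YOUR_RESOURCE_NAME.openai.azure.com/",
    api_key="your_api_key",
    api_version="2023-05-15",
)
```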
Thanks a lot, @logan-markewich! You saved my day.
Hi @dosu, I ran `pip install -U llama-index-llms-azure-openai llama-index-llms-openai`, and this line:

```python
index = MultiModalVectorStoreIndex.from_documents(documents, storage_context=storage_context)
```

gives me an `AssertionError`:

```
llama_index/legacy/vector_stores/qdrant.py:188, in QdrantVectorStore._build_points(self, nodes)
    180 sparse_indices, sparse_vectors = self._sparse_doc_fn(
    181     [
    182         node.get_content(metadata_mode=MetadataMode.EMBED)
    183         for node in node_batch
    184     ],
    185 )
    187 for i, node in enumerate(node_batch):
--> 188     assert isinstance(node, BaseNode)
    189     node_ids.append(node.node_id)
    191 if self.enable_hybrid:

AssertionError:
```
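One detail worth noting in that traceback: the failing frame lives under `llama_index/legacy/`, while the index-building code above imports from `llama_index.core`, so mixing legacy and non-legacy objects is a plausible cause of the `isinstance(node, BaseNode)` failure. A hedged sketch of keeping every import on the non-legacy packages:

```python
# All imports from the core/plugin packages (none from llama_index.legacy),
# so the index and the Qdrant store agree on the same BaseNode class.
from llama_index.core.indices import MultiModalVectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.multi_modal_llms.azure_openai import AzureOpenAIMultiModal
```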
**Question**
I am trying to use `gpt-4o` as my model for RAG on PDF/txt documents. Could someone please provide an example of how I can do that?

Do I need to use the `AzureOpenAIMultiModal` class or `AzureOpenAI`?

When I use `AzureOpenAI`, it complains about the model not being supported. Here is my code:

And here is the error message: