Open StarVolcano opened 4 months ago
Hey there, @StarVolcano! I'm here to help you with any bugs, questions, or contributions you have for the repository. Let me know how I can assist you while you wait for a human maintainer.
To fix the `KeyError: 'context_str'` in the `_get_response_with_images()` function, you need to ensure that the `context_str` key is supplied when formatting the `_image_qa_template`. The current implementation of `_get_response_with_images()` passes only `query_str` to the format call, while the template itself contains a `{context_str}` placeholder, which is what causes the `KeyError`.

Here is the corrected version of the `_get_response_with_images()` function:
```python
def _get_response_with_images(
    self,
    prompt_str: str,
    image_nodes: List[ImageNode],
) -> RESPONSE_TYPE:
    context_str = "\n\n".join([r.get_content() for r in image_nodes])
    fmt_prompt = self._image_qa_template.format(
        context_str=context_str, query_str=prompt_str
    )
    llm_response = self._multi_modal_llm.complete(
        prompt=fmt_prompt,
        image_documents=[image_node.node for image_node in image_nodes],
    )
    return Response(
        response=str(llm_response),
        source_nodes=image_nodes,
        metadata={"image_nodes": image_nodes},
    )
```
This ensures that `context_str` is included in the format call, preventing the `KeyError`. Additionally, make sure that the `image_qa_template` is set up to accept both `context_str` and `query_str` as parameters [1][2].
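As a standalone illustration of that last point, the placeholders a template expects can be inspected with the standard library's `string.Formatter`. The template wording below is made up for illustration; it is not llama_index's built-in default:

```python
import string

# Hypothetical QA template declaring both placeholders; the wording is
# illustrative only, not llama_index's default _image_qa_template.
IMAGE_QA_TMPL = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the images and the context, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)

# Collect the placeholder names the template expects.
fields = {
    name for _, name, _, _ in string.Formatter().parse(IMAGE_QA_TMPL) if name
}
print(fields)  # both 'context_str' and 'query_str' are present
```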
To continue talking to Dosu, mention @dosu.
Bug Description
Got a `KeyError: 'context_str'` when executing `query_engine.image_query(image_path, query_str)`. I think the problem is that in the `_get_response_with_images()` function of the `SimpleMultiModalQueryEngine` class in `llama_index/core/query_engine/multi_modal.py`, `context_str` is not provided:

```python
fmt_prompt = self._image_qa_template.format(
    query_str=prompt_str,
)
```
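The failure mode can be reproduced with plain `str.format`, independent of llama_index: formatting a template that contains `{context_str}` while supplying only `query_str` raises exactly this `KeyError`. The template text below is a stand-in for illustration:

```python
# Stand-in template containing both placeholders, like the engine's
# _image_qa_template (the wording here is made up for illustration).
tmpl = "Context: {context_str}\nQuery: {query_str}\nAnswer: "

# Supplying only query_str, as the buggy format call does:
try:
    tmpl.format(query_str="What is the main object in the picture?")
    missing = None
except KeyError as exc:
    missing = exc.args[0]

print(missing)  # -> 'context_str'
```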
Version
0.10.50
Steps to Reproduce
```python
from llama_index.core.indices.multi_modal.base import (
    MultiModalVectorStoreIndex,
)
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.core import SimpleDirectoryReader, StorageContext
from llama_index.embeddings.clip import ClipEmbedding

import qdrant_client

client = qdrant_client.QdrantClient(path="qdrant_mm_wiki")

text_store = QdrantVectorStore(client=client, collection_name="text_collection")
image_store = QdrantVectorStore(client=client, collection_name="image_collection")
storage_context = StorageContext.from_defaults(
    vector_store=text_store, image_store=image_store
)

from llama_index.core import Settings

Settings.embed_model = ClipEmbedding()
image_embed_model = ClipEmbedding()

documents = SimpleDirectoryReader("./mixed_wiki", recursive=True).load_data()

from llama_index.core.node_parser import SentenceSplitter

Settings.text_splitter = SentenceSplitter(chunk_size=60, chunk_overlap=5)

index = MultiModalVectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    transformations=[SentenceSplitter(chunk_size=60, chunk_overlap=5)],
    image_embed_model=image_embed_model,
)

from llama_index.multi_modal_llms.ollama import OllamaMultiModal

mm_model = OllamaMultiModal(model="llava:13b")

import numpy as np
from llama_index.core.prompts import PromptTemplate
from llama_index.core.query_engine import SimpleMultiModalQueryEngine

qa_tmpl_str = (
    "Given the images provided, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)
qa_tmpl = PromptTemplate(qa_tmpl_str)

query_str = "What is the main object in the picture?"

# query and response
query_engine = index.as_query_engine(llm=mm_model, text_qa_template=qa_tmpl)
response = query_engine.image_query("./mixed_wiki/1.jpg", query_str)
```
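For reference, the joined-context formatting that the corrected `_get_response_with_images()` performs can be sketched with plain strings. `FakeNode` and the template text below are hypothetical stand-ins for `ImageNode` and the engine's template:

```python
# Hypothetical stand-in for ImageNode: only get_content() is needed here.
class FakeNode:
    def __init__(self, text: str) -> None:
        self._text = text

    def get_content(self) -> str:
        return self._text


image_nodes = [FakeNode("caption: a volcano"), FakeNode("caption: a starry sky")]

# Join each retrieved node's text into context_str, as the corrected
# function does, then fill both placeholders of the template.
context_str = "\n\n".join(n.get_content() for n in image_nodes)
tmpl = "Context:\n{context_str}\nQuery: {query_str}\nAnswer: "
fmt_prompt = tmpl.format(
    context_str=context_str,
    query_str="What is the main object in the picture?",
)
print(fmt_prompt)
```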
Relevant Logs/Tracebacks
No response