run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
35.24k stars 4.95k forks

[Bug]: Query id error while running elasticsearch with same index name #14070

Open LikhithRishi opened 2 months ago

LikhithRishi commented 2 months ago

Bug Description

I am getting an error when running Elasticsearch with the same index name. When I enter a query to generate a response, I get:

Query id 40b9d01f-b211-413d-b3d4-a799eff700d6 not found in either retriever_dict or query_engine_dict.

How can I resolve this error?

As a temporary workaround, changing the index name avoids the error. How can I resolve the error without changing the index name?

Version

llama-index==0.10.12

Steps to Reproduce

```python
from dense_pack2.base import DenseXRetrievalPack as dp2
from llama_index.vector_stores.elasticsearch import ElasticsearchStore
from llama_index.core.node_parser import SentenceSplitter

vector_store = ElasticsearchStore(
    index_name="dense_index_56",
    es_url="http://localhost:9200",
)

retriever_chunk = dp2(
    documents,
    proposition_llm=global_variable.embed_model,
    query_llm=global_variable.llm,
    text_splitter=SentenceSplitter(chunk_size=1024),
    vector_store=vector_store,
)
```

Relevant Logs/Tracebacks

No response

dosubot[bot] commented 2 months ago

To resolve the error "Query id 40b9d01f-b211-413d-b3d4-a799eff700d6 not found in either retriever_dict or query_engine_dict" in Elasticsearch without changing the index name, you need to ensure that the query ID is correctly registered in either the retriever_dict or query_engine_dict. This error typically indicates that the query ID is not recognized by the system.

Here is a code snippet to help you ensure that the RetrieverQueryEngine is correctly set up:

from llama_index.vector_stores.elasticsearch import ElasticsearchStore
from dense_pack2.base import DenseXRetrievalPack as dp2
from llama_index.core import QueryBundle
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.query_engine import RetrieverQueryEngine

# Initialize ElasticsearchStore
vector_store = ElasticsearchStore(
    index_name="dense_index_56",
    es_url="http://localhost:9200",
)

# Initialize DenseXRetrievalPack
retriever_chunk = dp2(
    documents,
    proposition_llm=global_variable.embed_model,
    query_llm=global_variable.llm,
    text_splitter=SentenceSplitter(chunk_size=1024),
    vector_store=vector_store
)

# Ensure the retriever and query engine are properly instantiated
retriever = retriever_chunk.retriever
response_synthesizer = retriever_chunk.query_engine.response_synthesizer

# Instantiate the RetrieverQueryEngine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer
)

# Register the query ID in the appropriate dictionary
query_id = "40b9d01f-b211-413d-b3d4-a799eff700d6"
retriever_dict = {query_id: retriever}
query_engine_dict = {query_id: query_engine}

# Now you can use the query engine to handle queries
query_bundle = QueryBundle(query_str="your query here")
response = query_engine.query(query_bundle)
print(response)

Ensure that the query_id is correctly added to the retriever_dict or query_engine_dict and that the RetrieverQueryEngine is properly instantiated with the necessary components. This should resolve the error without changing the index name [1].

To continue talking to Dosu, mention @dosu.

logan-markewich commented 2 months ago

@LikhithRishi ah, this proof-of-concept llama pack wasn't quite built with external vector stores in mind

We need to store all_nodes in the docstore (and save and load the docstore if we want to rebuild later)

A quick change to the source code would be like this (to allow passing in the vector store and storing nodes in the docstore)

        self.vector_index = VectorStoreIndex(
            all_nodes, 
            service_context=service_context,
            storage_context=StorageContext.from_defaults(vector_store=vector_store),
            show_progress=True,
            store_nodes_override=True,
        )

contributions certainly welcome, or you can download/edit the code and run locally
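For context on why the lookup fails (the class and variable names below are illustrative, not the pack's actual internals): the pack wires up a recursive retriever whose retriever_dict and query_engine_dict are built from the node ids generated in the *current* run. When the Elasticsearch index already holds nodes from an earlier run, retrieval returns those older ids, which the freshly built mapping cannot resolve. A minimal plain-Python sketch of that failure mode:

```python
# Illustrative sketch (plain Python, no llama-index required) of how a
# recursive retriever resolves ids returned by the vector store against
# an in-memory mapping built at construction time.

class RecursiveLookup:
    def __init__(self, retriever_dict, query_engine_dict):
        self.retriever_dict = retriever_dict
        self.query_engine_dict = query_engine_dict

    def resolve(self, query_id):
        # Mirrors the error in this issue: an id coming back from the
        # vector store must exist in one of the two dicts.
        if query_id in self.retriever_dict:
            return self.retriever_dict[query_id]
        if query_id in self.query_engine_dict:
            return self.query_engine_dict[query_id]
        raise ValueError(
            f"Query id {query_id} not found in either retriever_dict "
            "or query_engine_dict."
        )

# First run: ids generated this session are registered and resolve fine.
lookup = RecursiveLookup({"node-a": "retriever-a"}, {})
assert lookup.resolve("node-a") == "retriever-a"

# Second run against the same Elasticsearch index: the store still returns
# "node-a" from the previous session, but the new pack instance only knows
# the ids it just generated, so the lookup fails.
fresh_lookup = RecursiveLookup({"node-b": "retriever-b"}, {})
try:
    fresh_lookup.resolve("node-a")
except ValueError as exc:
    print(exc)
```

This is why changing the index name "fixes" it: an empty index means every id returned was generated in the current run.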

LikhithRishi commented 2 months ago

> @LikhithRishi ah, this proof-of-concept llama pack wasn't quite built with external vector stores in mind
>
> We need to store all_nodes in the docstore (and save and load the docstore if we want to rebuild later)
>
> A quick change to the source code would be like this (to allow passing in the vector store and storing nodes in the docstore)
>
>     self.vector_index = VectorStoreIndex(
>         all_nodes,
>         service_context=service_context,
>         storage_context=StorageContext.from_defaults(vector_store=vector_store),
>         show_progress=True,
>         store_nodes_override=True,
>     )
>
> contributions certainly welcome, or you can download/edit the code and run locally

Hi @logan-markewich, we tried this, but it is not working. Below are the code lines we are using:

```python
storage_context = StorageContext.from_defaults(vector_store=vector_store)

self.vector_index = VectorStoreIndex(
    all_nodes,
    service_context=service_context,
    show_progress=True,
    storage_context=storage_context,
    store_nodes_override=True,
)
```
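A likely reason the change above is still not enough on its own: `store_nodes_override=True` only puts nodes into the docstore for the current run. On a rebuild, the pack regenerates nodes with fresh ids unless the first run's docstore is persisted and reloaded, so the ids already sitting in Elasticsearch still never match. The shape of the fix, sketched here with a plain JSON file standing in for the real docstore persistence API (`persist_node_map`/`load_node_map` are hypothetical helper names, not llama-index functions):

```python
import json
import os
import tempfile

def persist_node_map(node_map, path):
    # Hypothetical helper: save the id -> node mapping created on the
    # first ingestion run (the docstore's persist step plays this role).
    with open(path, "w") as f:
        json.dump(node_map, f)

def load_node_map(path):
    # On later runs, load the saved mapping instead of re-ingesting,
    # so the ids already stored in Elasticsearch still resolve.
    with open(path) as f:
        return json.load(f)

path = os.path.join(tempfile.mkdtemp(), "docstore.json")
first_run_nodes = {"node-1": "proposition text for this node"}
persist_node_map(first_run_nodes, path)

# A later run reloads the same id mapping rather than regenerating nodes.
assert load_node_map(path) == first_run_nodes
```

In other words, rebuilding against a pre-populated index requires both the vector store and the saved docstore from the original run; rebuilding the pack from raw documents each time will always produce ids the store does not contain.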