run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai

[Question]: query_engine.query("query") returns empty response... #11771

Closed Sridevi17j closed 4 months ago

Sridevi17j commented 8 months ago

Question

Please find my entire code here

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data1").load_data()

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
Settings.llm = None

index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("what is this document about")
print(response)

Below is the output:

Query: what is this document about
Answer:

If I set Settings.llm to some Hugging Face LLM, then it returns "Empty Response" or "My Response", so none of them worked. Could someone please advise what I am doing wrong here? I have uploaded a PDF file to query it, and I also tried with simple text files, but nothing worked.

dosubot[bot] commented 8 months ago

Hey @Sridevi17j, great to see you back! Hope you're doing well. 😊

Based on the code you've shared and the context provided, it seems like you're trying to query a document using the LlamaIndex framework. However, you're encountering an issue where the query_engine.query("query") function returns an empty response.

There are a few things you could try to resolve this issue:

  1. Ensure the service context is passed back when loading the index: When you're loading the index, make sure to pass the service context. This was a suggestion from a similar issue in the LlamaIndex repository. Here's how you can do it:
new_index = load_index_from_storage(storage_context, service_context=service_context)
  2. Correct the use of the Document class: Another suggestion from the same issue was to use the text= keyword argument instead of page_content= when using the Document class. Here's how you can do it:
from llama_index.core import Document
docs = [Document(text=doc.page_content, metadata={"source": doc.metadata["source"]}) for doc in documents]
  3. Check the response.source_nodes: If the response is still empty, check response.source_nodes to see whether the retrieved nodes are correct (see the sketch after this list). This helps you determine whether the problem lies in retrieval or in the LLM setup. You might also want to consider decreasing the chunk size to 1024 and setting context_window=3800 if you're using llama2.
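For example, a minimal way to inspect the retrieved nodes (a sketch assuming query_engine was built as in your snippet; the 200-character preview is only for readability):

response = query_engine.query("what is this document about")
for node_with_score in response.source_nodes:
    # Each entry pairs a retrieved node with its similarity score
    print(node_with_score.score, node_with_score.node.get_content()[:200])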

Remember, the VectorStoreIndex.from_documents(documents) function is used to load a set of documents and build an index from them. This function takes documents, splits them into chunks, parses these chunks into Node objects, and then builds an index from these nodes.
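Roughly speaking, the same pipeline can be written out explicitly (a sketch; the chunk_size value is only illustrative):

from llama_index.core import VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

# Split the documents into chunks and parse them into Node objects
nodes = SentenceSplitter(chunk_size=1024).get_nodes_from_documents(documents)

# Build the index directly from the nodes
index = VectorStoreIndex(nodes)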

The query_engine.query("query") function, on the other hand, is used to perform a query operation on the knowledge graph. This function orchestrates the process of querying the graph store, retrieving nodes that match the query criteria, synthesizing a final response based on the retrieved nodes, and logging the final response.

I hope this helps! If you're still encountering issues, please provide more details about the documents you're trying to index and the exact error messages you're seeing. This will help us better understand the problem and provide a more accurate solution.

Sridevi17j commented 7 months ago

Hi @dosu-bot, I added the code below:

service_context = ServiceContext.from_defaults(llm=None, embed_model=embed_model)

and I get the deprecation warning below (ServiceContext is deprecated, please use llama_index.settings.Settings instead. -- Deprecated since version 0.10.0):

service_context = ServiceContext.from_defaults(llm=None, embed_model=embed_model)

so I am using it like below:

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
Settings.llm = None

Also, for StorageContext, I could not understand how to use it or what parameters need to be passed in. If you could explain the parameters to pass, that would be great. Please provide your inputs.

dosubot[bot] commented 7 months ago

Based on the information you've provided, it seems you're trying to use StorageContext in the LlamaIndex framework. However, without the actual implementation details or constructor signature of StorageContext, it's challenging to specify the exact parameters you should pass.

Here's a general example based on the typical usage pattern (for the default, local setup, StorageContext.from_defaults is the usual entry point):

from llama_index.core import StorageContext

# Build a storage context with the default in-memory stores
# (document store, index store, vector store, graph store)
storage_context = StorageContext.from_defaults()

# Use storage_context as needed, e.g. when building or loading an index

For the exact parameters, you should refer to the StorageContext class definition or documentation. Since ServiceContext is deprecated, note that its replacement is the global Settings object (for the LLM, embedding model, chunk size, and so on); StorageContext only manages the stores themselves.

As for the purpose of the StorageContext class in the LlamaIndex framework, it serves as a management layer for the different kinds of storage the framework uses (document store, index store, vector stores, and graph store), exposed behind a single unified interface for adding and retrieving data. It also supports serializing and deserializing its state to and from dictionaries, which is what makes it possible to persist the storage context across sessions or transfer it between components.
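For example, a minimal persist-and-reload round trip might look like this (a sketch assuming the default local JSON stores and that documents has already been loaded):

from llama_index.core import StorageContext, VectorStoreIndex, load_index_from_storage

# Build the index and write its stores (docstore.json, index_store.json, ...) under "test"
index = VectorStoreIndex.from_documents(documents)
index.storage_context.persist(persist_dir="test")

# Later: rebuild the storage context from disk and reload the index
storage_context = StorageContext.from_defaults(persist_dir="test")
new_index = load_index_from_storage(storage_context)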

I hope this helps! If you have any more questions or need further clarification, feel free to ask.

Sridevi17j commented 7 months ago

@dosu-bot, I saved the index like below:

index.storage_context.persist(persist_dir="test")

and I reloaded the index like below:

docstore = "test/docstore.json"
index_store = "test/index_store.json"
vector_stores = "test/default_vector_store.json"
graph_store = "test/graph_store.json"

service_context = ServiceContext.from_defaults(llm=None, embed_model=embed_model)
storage_context = StorageContext(docstore=docstore, index_store=index_store, vector_stores=vector_stores, graph_store=graph_store)

new_index = load_index_from_storage(storage_context, service_context=service_context)

I get the error below:

AttributeError                            Traceback (most recent call last)
Cell In[27], line 12
      9 service_context = ServiceContext.from_defaults(llm=None, embed_model=embed_model)
     10 storage_context = StorageContext(docstore=docstore, index_store=index_store, vector_stores=vector_stores, graph_store=graph_store)
---> 12 new_index = load_index_from_storage(storage_context, service_context=service_context)

File ~\PycharmProjects\Test1.venv\Lib\site-packages\llama_index\core\indices\loading.py:33, in load_index_from_storage(storage_context, index_id, **kwargs)
     30 else:
     31     index_ids = [index_id]
---> 33 indices = load_indices_from_storage(storage_context, index_ids=index_ids, **kwargs)
     35 if len(indices) == 0:
     36     raise ValueError(
     37         "No index in storage context, check if you specified the right persist_dir."
     38     )

File ~\PycharmProjects\Test1.venv\Lib\site-packages\llama_index\core\indices\loading.py:64, in load_indices_from_storage(storage_context, index_ids, **kwargs)
     62 if index_ids is None:
     63     logger.info("Loading all indices.")
---> 64     index_structs = storage_context.index_store.index_structs()
     65 else:
     66     logger.info(f"Loading indices with ids: {index_ids}")

AttributeError: 'str' object has no attribute 'index_structs'

Please provide your inputs.

dosubot[bot] commented 7 months ago

Based on the error message you provided, it seems like the index_store parameter you're passing to the StorageContext constructor is a string, but the expected type is BaseIndexStore. This is likely the cause of the AttributeError you're encountering, as the string object doesn't have an index_structs attribute.

To resolve this issue, you need to create an instance of BaseIndexStore (or one of its subclasses) and pass that as the index_store parameter when creating a StorageContext. Here's an example of how you might do this:

from llama_index.core.storage.index_store.simple_index_store import SimpleIndexStore

# Load the persisted index store from its JSON file (rather than passing the path string)
index_store_instance = SimpleIndexStore.from_persist_path("test/index_store.json")

# Pass the instance to the StorageContext constructor
storage_context = StorageContext(docstore=docstore, index_store=index_store_instance, vector_stores=vector_stores, graph_store=graph_store)

This should resolve that particular AttributeError, though note that docstore, vector_stores, and graph_store must likewise be store instances rather than path strings. If you're still encountering issues after making this change, please provide more details about your setup and I'll be happy to help further.

For more information, you can refer to the StorageContext constructor in the LlamaIndex repository.
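If every store was written by index.storage_context.persist(persist_dir="test"), a simpler route (a sketch assuming the default local JSON stores) is to let from_defaults rebuild all of the stores from that directory instead of constructing each one by hand:

from llama_index.core import StorageContext, load_index_from_storage

# Rebuild the docstore, index store, vector store and graph store from the persist directory
storage_context = StorageContext.from_defaults(persist_dir="test")
new_index = load_index_from_storage(storage_context)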

Also, there was a similar issue reported in the LlamaIndex repository (issue #3426). The solution there involved manually fixing the index_store.json file by changing the string to an actual object and removing the escaped double quotes. However, this seems to be a different issue than what you're experiencing, as your error is occurring when trying to load the index from storage, not when trying to persist it.

Sridevi17j commented 7 months ago

@dosu-bot, in the given links I saw one line, and I used it here to give StorageContext the persist directory path. It did not throw an error, but it still gives an empty response. Check it out:

service_context = ServiceContext.from_defaults(llm=None, embed_model=embed_model)

storage_context = StorageContext.from_defaults(persist_dir="test")

new_index = load_index_from_storage(storage_context, service_context=service_context)
query_engine = new_index.as_query_engine()
response = query_engine.query("what is this document about")
print(response)

Query: what is this document about
Answer:

dosubot[bot] commented 7 months ago

Based on the code you've shared and the context provided, there are a few potential issues that could be causing the query_engine.query("query") function to return an empty response. Here are some suggestions:

  1. Service Context: Ensure that the service context is passed back when loading the index. This was suggested in a similar issue [Question]: Empty Response after query_engine.query. Your code seems to be doing this correctly, but it's worth double-checking.

  2. Document Class Usage: Another suggestion from the same issue is to correct the use of the Document class by using the text= keyword argument instead of page_content=. Here's the suggested code snippet:

from llama_index.core import Document
docs = [Document(text=doc.page_content, metadata={"source": doc.metadata["source"]}) for doc in documents]

  3. LLM Setup: If the response is still empty, check response.source_nodes to see if the retrieved nodes are correct and investigate whether there is an issue with the LLM setup. Consider decreasing the chunk size to 1024 and setting context_window=3800 if using llama2.

  4. Sub-questions and Synthesized Response: Check the verbose output of the sub-questions generated by the BaseQuestionGenerator to ensure they are valid and meaningful. Also check the verbose output of the synthesized response from the BaseSynthesizer. This was suggested in a similar issue [Bug]: Subquestion Query Engine is not returning back the Result.

  5. Input Files: If you're using specific files as your documents, make sure to use the input_files argument with SimpleDirectoryReader. This was suggested in a similar issue [Question]: When I used colab, the query didn't answer. Here's the suggested code snippet:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader(input_files=["test2.txt"]).load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
query_engine.query("what is my name?")

Please try these suggestions and let me know if any of them resolve your issue. If not, could you provide more information about your setup, such as the version of LlamaIndex you're using, the type and format of the documents you're trying to query, and any error messages you're seeing? This will help me better understand the problem and provide more targeted assistance.
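For instance, one quick way to report the installed version (assuming the 0.10+ package layout):

import llama_index.core
print(llama_index.core.__version__)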

Sridevi17j commented 7 months ago

@dosu-bot, I have followed all your advice. After using a custom LLM, instead of an empty response I am getting "My response", which is the dummy response set in our LLM. See my code below:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader(input_files=["data1/test.txt"]).load_data()

index = VectorStoreIndex.from_documents(documents)
index.storage_context.persist(persist_dir="test")

from typing import Optional, List, Mapping, Any
from llama_index.core.llms import (
    CustomLLM,
    CompletionResponse,
    CompletionResponseGen,
    LLMMetadata,
)
from llama_index.core.llms.callbacks import llm_completion_callback


class OurLLM(CustomLLM):
    context_window: int = 3900
    num_output: int = 256
    model_name: str = "facebook/opt-iml-1.3b"
    model_name: str = "C:/Users/sride/git/lama_test/llama/llama-2-70b-chat"
    dummy_response: str = "My response"

    @property
    def metadata(self) -> LLMMetadata:
        """Get LLM metadata."""
        return LLMMetadata(
            context_window=self.context_window,
            num_output=self.num_output,
            model_name=self.model_name,
        )

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        return CompletionResponse(text=self.dummy_response)

    @llm_completion_callback()
    def stream_complete(
        self, prompt: str, **kwargs: Any
    ) -> CompletionResponseGen:
        response = ""
        for token in self.dummy_response:
            response += token
            yield CompletionResponse(text=response, delta=token)


from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
Settings.chunk_size = 2048
Settings.context_window = 3900
embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
Settings.llm = OurLLM()

from llama_index.core.indices.loading import load_index_from_storage
from llama_index.core.storage.storage_context import StorageContext
from llama_index.core import ServiceContext
from llama_index.core.storage.index_store.simple_index_store import SimpleIndexStore

service_context = ServiceContext.from_defaults(chunk_size=2048, llm=OurLLM(), embed_model=embed_model)

storage_context = StorageContext.from_defaults(persist_dir="test")

new_index = load_index_from_storage(storage_context, service_context=service_context)
query_engine = new_index.as_query_engine()
response = query_engine.query("what is the name")
print(response)

My response

I tried setting things through Settings and also through ServiceContext; none of them worked. I tried setting the LLM to None and to the custom LLM, and I also changed chunk_size and context_window. I have tried every possible way, still no luck. Meanwhile, please also see my index_store.json below (note: the embeddings dict is empty):

{"index_store/data": {"ec2be150-8870-4f10-ae22-9690fbcdf08b": {"type": "vector_store", "data": "{\"index_id\": \"ec2be150-8870-4f10-ae22-9690fbcdf08b\", \"summary\": null, \"nodes_dict\": {\"22663d7f-cb1b-48e9-b7bc-3a90d5f56c15\": \"22663d7f-cb1b-48e9-b7bc-3a90d5f56c15\"}, \"doc_id_dict\": {}, \"embeddings_dict\": {}}"}}}

What am I doing wrong here?

dosubot[bot] commented 7 months ago

Based on the code you've provided, it seems like the query_engine.query("query") function is returning the dummy response "My Response" because of the implementation of the complete method in your OurLLM class. This method is currently set to return CompletionResponse(text=self.dummy_response), where self.dummy_response is "My response". This means that regardless of the input query, the complete method will always return "My response".

If you want the complete method to return a different response, you'll need to modify its implementation. For example, if you want it to return the input query, you could change it to:

@llm_completion_callback()
def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
    return CompletionResponse(text=prompt)

This will return the input query as the response.

Regarding the content of your index_store.json file, the doc_id_dict and embeddings_dict in it are empty. This could be because VectorStoreIndex.from_documents(documents) was not able to fully process the documents from SimpleDirectoryReader(input_files=["data1/test.txt"]).load_data(). You might want to check the content and format of your test.txt file.

As for the similar issues in the LlamaIndex repository, it seems that some users have had success by checking the verbose output of the sub-questions generated by the BaseQuestionGenerator and the responses from each QueryEngineTool. You might want to try this approach to see if it helps identify the issue.

Lastly, you might want to consider switching from Llama2 to a different LLM like Zephyr 7b beta for more reliable structured outputs. If necessary, you could also adapt the setup for use with llama-cpp for CPU hardware, but keep in mind that it will be slower.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

mlkorra commented 7 months ago

@Sridevi17j have you tried using the chatgpt-3.5 api as LLM?

Sridevi17j commented 7 months ago

@mlkorra I'm trying to use Ollama, an open-source LLM.

Sridevi17j commented 7 months ago

I sorted this issue by printing the response as a stream.

The code below gives a response:

index = VectorStoreIndex.from_documents(documents)         
query_engine = index.as_query_engine(streaming=True)         
response_stream = query_engine.query("What is this document about?")         
response_stream.print_response_stream()
vonrafael commented 3 months ago

> I sorted this issue by printing the response as a stream.
>
> The code below gives a response:
>
> index = VectorStoreIndex.from_documents(documents)
> query_engine = index.as_query_engine(streaming=True)
> response_stream = query_engine.query("What is this document about?")
> response_stream.print_response_stream()

I'm sorry, but this didn't help!

lmaddox commented 4 days ago

Settings.llm = None looks strange to me. You said that trying with a dummy LLM does return a response, though? Have you tried setting the llm to not-None?

from llama_index.llms.ollama import Ollama

Settings.llm = Ollama(model=..., base_url=..., request_timeout=..., json_mode=False)

I'm pretty sure the embeddings need to be compatible with your LLM of choice.

from llama_index.embeddings.ollama import OllamaEmbedding

Settings.embed_model = OllamaEmbedding(...)

If you're not setting them globally, then you're gonna wanna pass them as args to quite nearly every constructor or factory.

VectorStoreIndex(..., embed_model=...,) # n.b., I'm not seeing an embed_model or llm option for the from_documents factory

as_query_engine does accept an llm parameter.

Adding streaming or async will just complicate things until you get that much working. So far as I can tell, you're using simple storage backends. That's good. Adding, e.g., postgres or redis will require some attention to the embedding dimensions.
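Putting that advice together, a minimal end-to-end sketch might look like this (the model names and the local Ollama URL are assumptions, and the llama-index-llms-ollama and llama-index-embeddings-ollama packages must be installed):

from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding

# Global defaults: a local Ollama LLM plus a matching Ollama embedding model
Settings.llm = Ollama(model="llama3", base_url="http://localhost:11434", request_timeout=120.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text", base_url="http://localhost:11434")

documents = SimpleDirectoryReader(input_files=["data1/test.txt"]).load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
print(query_engine.query("What is this document about?"))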