Closed: Tomly-Hola-Lee closed this issue 4 months ago
🤖
Hello again @Tomly-Hola-Lee! It's good to see you diving further into the LlamaIndex library. Let's see what we can do about your question.
Based on the information you've provided and the context from the LlamaIndex repository, it seems like the issue might be related to how the vector index data is being stored and retrieved in Firestore.
The `FirestoreKVStore` class in the LlamaIndex framework is responsible for storing and retrieving data from Firestore. It uses methods like `put`, `aput`, `put_all`, and `aput_all` to store data, and `get`, `aget`, `get_all`, and `aget_all` to retrieve data. These methods use the `replace_field_name_set` and `replace_field_name_get` functions to replace certain field names before storing and after retrieving the data.
If the vector index data contains reserved field names such as `__data__` or `__type__`, they are replaced with `data` and `type` when stored and swapped back when retrieved (Firestore reserves field names wrapped in double underscores). If these field names are not replaced correctly in both directions, retrieving the data could result in an empty response.
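The function names `replace_field_name_set` and `replace_field_name_get` come from `firestore_kvstore.py`; the sketch below is a hypothetical simplification of that swap (the replacement dictionaries and the flat, non-recursive handling are assumptions, not the library's exact implementation):

```python
# Hypothetical simplification of the field-name swap in
# firestore_kvstore.py: Firestore reserves field names wrapped in
# double underscores, so reserved keys are renamed on write and
# restored on read.
FIELD_NAME_REPLACE_SET = {"__data__": "data", "__type__": "type"}
FIELD_NAME_REPLACE_GET = {"data": "__data__", "type": "__type__"}


def replace_field_name_set(val: dict) -> dict:
    """Rename reserved field names before writing to Firestore."""
    return {FIELD_NAME_REPLACE_SET.get(k, k): v for k, v in val.items()}


def replace_field_name_get(val: dict) -> dict:
    """Restore the original field names after reading from Firestore."""
    return {FIELD_NAME_REPLACE_GET.get(k, k): v for k, v in val.items()}


doc = {"__type__": "vector_store", "__data__": "{\"embedding_dict\": {}}"}
stored = replace_field_name_set(doc)       # keys become "type" / "data"
restored = replace_field_name_get(stored)  # round-trips back to doc
```

If the round trip is not symmetric, a record written under one key layout will not be found under the other, which is one way a read can come back empty.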
Here are some steps you can take to troubleshoot this issue:
1. Check if the vector index data is being stored correctly in Firestore. You can do this by inspecting the Firestore database directly.
2. Verify that the stored documents contain the expected field names (`data` and `type`). If they do not, you may need to modify the data or the `replace_field_name_set` and `replace_field_name_get` functions to handle the correct field names.
3. Review the code that uses the `FirestoreKVStore` class to store and retrieve the vector index data. Ensure that it uses the correct collection name and key when calling the `put` and `get` methods.
4. If the issue persists, you may need to debug the `FirestoreKVStore` class to ensure that the `put`, `get`, `replace_field_name_set`, and `replace_field_name_get` methods are working correctly.

You can find more information about the `FirestoreKVStore` class in the `firestore_kvstore.py` file in the LlamaIndex repository.
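The put/get contract behind those steps can be exercised without touching Firestore at all. The `InMemoryKVStore` class below is illustrative only (it is not part of LlamaIndex); its method names follow the KV-store interface described above:

```python
class InMemoryKVStore:
    """Illustrative in-memory stand-in for a key-value store such as
    FirestoreKVStore, keyed by (collection, key)."""

    def __init__(self):
        self._collections = {}

    def put(self, key: str, val: dict, collection: str = "kvstore") -> None:
        # Store a copy so later mutations don't leak into the store.
        self._collections.setdefault(collection, {})[key] = dict(val)

    def get(self, key: str, collection: str = "kvstore"):
        # Returns None on a miss, mirroring a failed kvstore lookup.
        return self._collections.get(collection, {}).get(key)

    def get_all(self, collection: str = "kvstore") -> dict:
        return dict(self._collections.get(collection, {}))


store = InMemoryKVStore()
store.put("28ddc079", {"type": "vector_store", "data": "{}"},
          collection="index_store")

# A mismatched collection name is one way to end up with an empty result:
missing = store.get("28ddc079", collection="docstore")   # lookup miss
found = store.get("28ddc079", collection="index_store")  # the stored dict
```

If your code writes under one collection name and reads under another, the read silently returns nothing, which would look exactly like an empty query response downstream.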
As for the `VectorStoreIndex.from_documents` function, I wasn't able to find specific information about it in the LlamaIndex repository. It would be helpful if you could provide more details about the data you're passing to this function and the output you're expecting. This could help in identifying any potential issues with how this function is being used.
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
@dosu-bot I notice everything works fine until I use
`vector_index = load_index_from_storage(storage_context=storage_context, index_id=vector_id)`
It does return
`<llama_index.indices.vector_store.base.VectorStoreIndex object at 0x7fd4a9bab1f0>`
However, it still gives an empty response.
If I am not separating the query part and not using
`load_index_from_storage(storage_context=storage_context, index_id=vector_id)`
it gives me a response. Is there anything I missed?
Can you try using https://docs.llamaindex.ai/en/stable/understanding/tracing_and_debugging/tracing_and_debugging.html to debug which part yields empty results?
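Following that guide, verbose logging can be switched on with the standard library before running the query (a sketch of the guide's snippet; the explicit `setLevel` call is an addition here in case logging was already configured):

```python
import logging
import sys

# Route DEBUG-level logs (embedding requests, retrieved nodes, LLM
# calls) to stdout, as the tracing guide suggests.
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
# basicConfig is a no-op if a handler already exists, so set the
# level explicitly as well.
logging.getLogger().setLevel(logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
```

Note that adding a second `StreamHandler` on top of `basicConfig` prints each record twice, which is why the pasted logs in this thread show every line duplicated.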
@hatianzhang
Can you try using https://docs.llamaindex.ai/en/stable/understanding/tracing_and_debugging/tracing_and_debugging.html to debug which part yields empty results?
Thanks for the suggestion. I have enabled debug mode, and this is the response when I don't recreate the storage context, i.e. when I query right after indexing:
```python
vector_index = VectorStoreIndex(nodes, storage_context=storage_context)
query_engine = vector_index.as_query_engine()
vector_response = query_engine.query("Who is the author?")
print(vector_response)
```
The debug panel shows:
```
DEBUG:openai._base_client:Request options: {'method': 'post', 'url': '/embeddings', 'files': None, 'post_parser': <function Embeddings.create.
Top 2 nodes:
[Node 0f9ec477-71c6-4d02-a12a-facf05a1ef45] [Similarity score: 0.773803] All that seemed left for philosophy were edge cases that people in other fields felt could safely...
[Node 73561cee-5ff5-49d7-9c4f-c048bd65d0ad] [Similarity score: 0.773635] I remember taking the boys to the coast on a sunny day in 2015 and figuring out how to deal with ...
DEBUG:openai._base_client:Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'json_data': {'messages': [{'role': 'system', 'content': "You are an expert Q&A system that is trusted around the world.\nAlways answer the query using the provided context information, and not prior knowledge.\nSome rules to follow:\n1. Never directly reference the given context in your answer.\n2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines."}, {'role': 'user', 'content':...nformation and not prior knowledge, answer the query.\nQuery: Who is the author?\nAnswer: '}], 'model': 'gpt-3.5-turbo', 'stream': False, 'temperature': 0.1}}
DEBUG:httpx._client:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
DEBUG:openai._base_client:HTTP Request: POST https://api.openai.com/v1/chat/completions "200 OK"
The author of the text is Paul Graham.
```
However, if I restore the storage context with the vector id as below:
```python
kvstore = FirestoreKVStore()
storage_context = StorageContext.from_defaults(
    docstore=FirestoreDocumentStore(kvstore),
    index_store=FirestoreIndexStore(kvstore)
)

vector_id = '28ddc079-bc01-4947-a543-455dbdb8700a'
vector_index = load_index_from_storage(
    vector_id=vector_id, storage_context=storage_context
)
query_engine = vector_index.as_query_engine()
vector_response = query_engine.query("Who is the author?")
print(vector_response)
```
It gives the following:
```
DEBUG:google.auth._default:Checking None for explicit credentials as part of auth process...
DEBUG:google.auth._default:Checking Cloud SDK credentials as part of auth process...
DEBUG:google.auth._default:Checking None for explicit credentials as part of auth process...
DEBUG:google.auth._default:Checking Cloud SDK credentials as part of auth process...
DEBUG:google.auth._default:Checking None for explicit credentials as part of auth process...
DEBUG:google.auth._default:Checking Cloud SDK credentials as part of auth process...
DEBUG:google.auth._default:Checking None for explicit credentials as part of auth process...
DEBUG:google.auth._default:Checking Cloud SDK credentials as part of auth process...
INFO:llama_index.indices.loading:Loading all indices.
DEBUG:google.auth.transport.requests:Making request: POST https://oauth2.googleapis.com/token
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): oauth2.googleapis.com:443
DEBUG:urllib3.connectionpool:https://oauth2.googleapis.com:443 "POST /token HTTP/1.1" 200 None
DEBUG:openai._base_client:Request options: {'method': 'post', 'url': '/embeddings', 'files': None, 'post_parser': <function Embeddings.create.
Top 0 nodes:
Empty Response
```
May I know what is causing the problem?
`index_id=vector_id`
@hatianzhang Hi, thanks for the suggestion.
Unfortunately I have tried with `index_id=vector_id` and even `vector_id=vector_id`, but they didn't work either. I shall get back to you with the debug log.
I printed `vector_id` and it gives me this: `28ddc079-bc01-4947-a543-455dbdb8700a`
and I use it with
```python
vector_id = '28ddc079-bc01-4947-a543-455dbdb8700a'
vector_index = load_index_from_storage(
    vector_id=vector_id, storage_context=storage_context
)
```
How do I know if it contains enough relevant information?
@hatianzhang Sorry for the late reply.
I have fixed it to `index_id=vector_id` as below:
```python
vector_id = '0999d828-4f66-46d4-a77c-68c4ce8aba6c'
vector_index = load_index_from_storage(
    storage_context=storage_context, index_id=vector_id
)

chatgpt = OpenAI(temperature=0, model="gpt-3.5-turbo")
service_context_chatgpt = ServiceContext.from_defaults(
    llm=chatgpt, chunk_size=1024
)

query_engine = vector_index.as_query_engine()
vector_response = query_engine.query("Who is the author?")
display_response(vector_response)
```
And it still gives "Empty Response". The debug log is below:
```
INFO:llama_index.indices.loading:Loading indices with ids: ['0999d828-4f66-46d4-a77c-68c4ce8aba6c']
DEBUG:google.auth.transport.requests:Making request: POST https://oauth2.googleapis.com/token
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): oauth2.googleapis.com:443
DEBUG:urllib3.connectionpool:https://oauth2.googleapis.com:443 "POST /token HTTP/1.1" 200 None
DEBUG:openai._base_client:Request options: {'method': 'post', 'url': '/embeddings', 'files': None, 'post_parser': <function Embeddings.create.
Top 0 nodes:
Final Response: Empty Response
```
I have checked the index in Firestore and it looks like this:
May I know if it contains enough relevant information?
Hi, @Tomly-Hola-Lee,
I'm helping the LlamaIndex team manage their backlog and am marking this issue as stale.
It looks like you're experiencing issues with using vector index with Firestore, and you've tried using Summary index and keyword index, which worked. There have been detailed troubleshooting steps provided, and you've shared debug logs and further details about your attempts to resolve the issue. The current status of the issue is unresolved.
Could you please confirm if this issue is still relevant to the latest version of the LlamaIndex repository? If it is, please let the LlamaIndex team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you!
Question Validation
Question
I am following the FirestoreDemo with vector index (https://github.com/run-llama/llama_index/blob/main/docs/examples/docstore/FirestoreDemo.ipynb). I have touched up my code as below