langchain-ai / langchain


Error running vector search in Azure Cognitive Search - The 'value' property of the vector query can't be null or an empty array. #9297

Closed. dolphinsting closed this issue 1 year ago

dolphinsting commented 1 year ago

System Info

When I run a vector search in Azure Cognitive Search using AzureSearch, it fails with "The 'value' property of the vector query can't be null or an empty array." (full error below). My code hasn't changed since last week, when it worked. I have version 11.4.0b6 of azure-search-documents installed. I suspect that Cognitive Search has changed its signature or implementation and the LangChain integration hasn't been updated to match.

Someone else has reported the same error message but the workaround doesn't work for me (https://github.com/langchain-ai/langchain/issues/7841).

The following is a simplified version of the code which I got from the samples at https://python.langchain.com/docs/integrations/vectorstores/azuresearch. This also fails with the same error.

```python
import os
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores.azuresearch import AzureSearch

os.environ["AZURE_COGNITIVE_SEARCH_SERVICE_NAME"] = ''
os.environ["AZURE_COGNITIVE_SEARCH_INDEX_NAME"] = ''
os.environ["AZURE_COGNITIVE_SEARCH_API_KEY"] = ''
os.environ["OPENAI_API_KEY"] = ''
model = 'text-embedding-ada-002'

embeddings: OpenAIEmbeddings = OpenAIEmbeddings(deployment=model, chunk_size=1)
index_name: str = "website"
vector_store: AzureSearch = AzureSearch(
    azure_search_endpoint='https://.search.windows.net',
    azure_search_key='',
    index_name=index_name,
    embedding_function=embeddings.embed_query,
)

docs = vector_store.hybrid_search(query="Test query", k=3)
print(docs[0].page_content)
```

The full error is:

Exception has occurred: HttpResponseError
(InvalidRequestParameter) The 'value' property of the vector query can't be null or an empty array. Make sure to enclose the vector within a "value" property: '{"vector": { "value": [ ] } }'
Parameter name: vector
Code: InvalidRequestParameter
Message: The 'value' property of the vector query can't be null or an empty array. Make sure to enclose the vector within a "value" property: '{"vector": { "value": [ ] } }'
Parameter name: vector
Exception Details: (InvalidVectorQuery) The 'value' property of the vector query can't be null or an empty array. Make sure to enclose the vector within a "value" property: '{"vector": { "value": [ ] } }'
Code: InvalidVectorQuery
Message: The 'value' property of the vector query can't be null or an empty array. Make sure to enclose the vector within a "value" property: '{"vector": { "value": [ ] } }'
  File "C:\Work\AI Search\SearchCognitiveSearchScratch.py", line 22, in <module>
    docs = vector_store.hybrid_search(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
azure.core.exceptions.HttpResponseError: (InvalidRequestParameter) The 'value' property of the vector query can't be null or an empty array. Make sure to enclose the vector within a "value" property: '{"vector": { "value": [ ] } }'
Parameter name: vector
Code: InvalidRequestParameter
Message: The 'value' property of the vector query can't be null or an empty array. Make sure to enclose the vector within a "value" property: '{"vector": { "value": [ ] } }'
Parameter name: vector
Exception Details: (InvalidVectorQuery) The 'value' property of the vector query can't be null or an empty array. Make sure to enclose the vector within a "value" property: '{"vector": { "value": [ ] } }'
Code: InvalidVectorQuery
Message: The 'value' property of the vector query can't be null or an empty array. Make sure to enclose the vector within a "value" property: '{"vector": { "value": [ ] } }'

Who can help?

No response

Reproduction

Run:

pip install azure-search-documents==11.4.0b6
pip install azure-identity

Then run the script shown under System Info above, replacing the empty strings with your keys, service name, index name and service endpoint.

Expected behavior

Return documents from the index.

dosubot[bot] commented 1 year ago

🤖

Hello,

Thank you for providing detailed information about the issue you're facing. The error message you're seeing suggests that the vector being passed to Azure Cognitive Search is either null or an empty array. This could be due to the embedding_function not returning a valid vector for the query text.

In your code, the embedding_function is set to embeddings.embed_query in the AzureSearch constructor. If the embeddings.embed_query function fails to generate a valid vector (i.e., it returns null or an empty array), this could lead to the error you're seeing.

Here are some steps you can take to troubleshoot this issue:

  1. Check the output of `embeddings.embed_query` for several query texts to see whether it returns valid vectors. For example, call `embeddings.embed_query("Test query")` and print the result.

  2. If `embeddings.embed_query` is not returning valid vectors, investigate why. For example, if the model it uses to generate the vector has been updated or retrained, its output may have changed.

  3. If `embeddings.embed_query` is returning valid vectors, check the rest of your code for anything that could turn the vector into null or an empty array before it is passed to Azure Cognitive Search.
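As a rough illustration of step 1, the check could look something like the sketch below. `check_embedding` is a hypothetical helper, not part of LangChain, and `fake_embed_query` is only a stand-in so the snippet runs without an API key; it assumes the embedding function takes a string and returns a sequence of floats.

```python
import math
from typing import Callable, List, Optional, Sequence


def check_embedding(
    embed_fn: Callable[[str], Optional[Sequence[float]]], text: str
) -> List[float]:
    """Call embed_fn(text) and raise ValueError if the result is not a usable vector.

    Hypothetical helper for illustration only.
    """
    vector = embed_fn(text)
    # A null or empty vector is exactly what the service error complains about.
    if vector is None or len(vector) == 0:
        raise ValueError(f"embedding for {text!r} is null or empty")
    values = [float(x) for x in vector]
    # NaN or infinite components would also make the vector unusable.
    if not all(math.isfinite(x) for x in values):
        raise ValueError(f"embedding for {text!r} contains NaN or infinite values")
    return values


def fake_embed_query(text: str) -> List[float]:
    """Stand-in for embeddings.embed_query (assumption: same call shape)."""
    return [0.1, 0.2, 0.3]


vector = check_embedding(fake_embed_query, "Test query")
print(len(vector), vector[:3])
```

With a real client, `embeddings.embed_query` would be passed in place of `fake_embed_query`. If this check passes, the embedding step is probably not the cause and the problem lies further along, between the vector store wrapper and the search service.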

If you're still facing the issue after trying these steps, please provide more information about your environment (e.g., the version of LangChain you're using, the version of the libraries that embeddings.embed_query depends on) and any changes that might have occurred recently (e.g., updates to the data, retraining of the model).

I hope this helps! Let me know if you have any other questions.

Best, Dosu


dolphinsting commented 1 year ago

`embeddings.embed_query` did return a result. Ultimately I fixed it by upgrading my version of langchain. I'm glad it's fixed, but it makes me worry that this could happen without warning in production.
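One possible way to make this kind of break easier to spot (a sketch under assumptions, not an official recommendation) is to log the installed versions of the two packages at startup and flag pre-release builds such as 11.4.0b6, so they can be pinned exactly. `installed_version` and `looks_like_prerelease` are hypothetical helpers, and the regex is only a rough heuristic for pre-release suffixes (a, b, rc, dev).

```python
import re
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def looks_like_prerelease(version: str) -> bool:
    """Rough heuristic: does the version end in an a/b/rc/dev suffix?"""
    return re.search(r"\d(?:a|b|rc|\.?dev)\d*$", version) is not None


for package in ("langchain", "azure-search-documents"):
    version = installed_version(package)
    if version is None:
        print(f"{package}: not installed")
    elif looks_like_prerelease(version):
        print(f"{package}=={version} (pre-release, consider pinning it exactly)")
    else:
        print(f"{package}=={version}")
```

Recording these versions alongside each deployment at least shows which combination was running when a search call started failing.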