langchain-ai / langchain

šŸ¦œšŸ”— Build context-aware reasoning applications
https://python.langchain.com
MIT License

AzureSearch giving error during creation of index (NameError: name 'VectorSearch' is not defined. Did you mean: 'vector_search'? ) #17725

Closed Farid-Ullah closed 4 months ago

Farid-Ullah commented 8 months ago

Checked other resources

Example Code

```python
import logging
import os

# ... (imports of AzureOpenAIEmbeddings and AzureSearch, plus the helper
# functions used below, are elided in the original report)

logging.info("Loading documents from Azure Blob Storage.")
docs = load_documents_from_blob()

logging.info("Splitting the loaded documents into chunks.")
splits = partition_text_into_chunks(docs)

embeddings = AzureOpenAIEmbeddings(
    azure_deployment=os.environ["EMBEDDING_MODEL_DEPLOYMENT"],
    openai_api_version="2023-05-15",
)

logging.info("Connecting to Azure Cognitive Search...")
vector_store: AzureSearch = AzureSearch(
    azure_search_endpoint=azure_search_endpoint,
    azure_search_key=azure_search_key,
    index_name=index_name,
    embedding_function=embeddings.embed_query,
)

logging.info("Indexing the split documents into Azure Cognitive Search for documents.")
vector_store.add_documents(documents=splits)
```

Error Message and Stack Trace (if applicable)

Error:

INFO:root:Loading documents from Azure Blob Storage.
INFO:root:Preparing to load data from Azure Blob Storage.
INFO:pikepdf._core:pikepdf C++ to Python logger bridge initialized
INFO:root:Successfully loaded 39 documents from Azure Blob Storage.
INFO:root:Splitting the loaded documents into chunks.
INFO:root:Initializing text splitter with chunk size of 1000 and overlap of 100 characters.
WARNING:langchain.text_splitter:Created a chunk of size 1370, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1235, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 2133, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 6548, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1901, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 5381, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 2180, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1978, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 3180, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 3180, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 3180, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 6581, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 2482, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1266, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1266, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1424, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1353, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1264, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1782, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1285, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1317, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 6141, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1719, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 6119, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1025, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 3017, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1080, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1140, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1365, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1166, which is longer than the specified 1000
WARNING:langchain.text_splitter:Created a chunk of size 1006, which is longer than the specified 1000
INFO:root:Successfully split 39 documents.
INFO:root:Initializing the Azure Cognitive Search model for documents.
INFO:root:Initializing embeddings...
INFO:root:Connecting to Azure Cognitive Search...
INFO:httpx:HTTP Request: POST https://mobiz-gpt-4-deployment.openai.azure.com//openai/deployments/ada-002/embeddings?api-version=2023-05-15 "HTTP/1.1 200 OK"
Traceback (most recent call last):
  File "/home/ayaz/Desktop/dev/env_sqllatest/lib/python3.10/site-packages/langchain_community/vectorstores/azuresearch.py", line 111, in _get_search_client
    index_client.get_index(name=index_name)
  File "/home/user/Desktop/dev/env_sqllatest/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/home/user/Desktop/dev/env_sqllatest/lib/python3.10/site-packages/azure/search/documents/indexes/_search_index_client.py", line 144, in get_index
    result = self._client.indexes.get(name, **kwargs)
  File "/home/user/Desktop/dev/env_sqllatest/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/home/user/Desktop/dev/env_sqllatest/lib/python3.10/site-packages/azure/search/documents/indexes/_generated/operations/_indexes_operations.py", line 864, in get
    map_error(status_code=response.status_code, response=response, error_map=error_map)
  File "/home/user/Desktop/dev/env_sqllatest/lib/python3.10/site-packages/azure/core/exceptions.py", line 164, in map_error
    raise error
azure.core.exceptions.ResourceNotFoundError: () No index with the name 'apollo-knowledge-base' was found in the service 'knowledge-bot-basic-15'.
Code: 
Message: No index with the name 'apollo-knowledge-base' was found in the service 'knowledge-bot-basic-15'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/Desktop/dev/ALL DEV CODE/hybrid-sql-agent/backend/main.py", line 3, in <module>
    from api.routes import hybrid_agent
  File "/home/user/Desktop/dev/ALL DEV CODE/hybrid-sql-agent/backend/api/routes/hybrid_agent.py", line 7, in <module>
    from ai.main import apollo_conversation_chain
  File "/home/user/Desktop/dev/ALL DEV CODE/hybrid-sql-agent/backend/ai/main.py", line 28, in <module>
    acs_documents, acs_fewshots = process_and_index_data_to_azure()
  File "/home/user/Desktop/dev/ALL DEV CODE/hybrid-sql-agent/backend/ai/documents_processing.py", line 486, in process_and_index_data_to_azure
    acs_documents = configure_azure_search_for_documents()
  File "/home/user/Desktop/dev/ALL DEV CODE/hybrid-sql-agent/backend/ai/documents_processing.py", line 406, in configure_azure_search_for_documents
    vector_store: AzureSearch = AzureSearch(
  File "/home/user/Desktop/dev/env_sqllatest/lib/python3.10/site-packages/langchain_community/vectorstores/azuresearch.py", line 268, in __init__
    self.client = _get_search_client(
  File "/home/user/Desktop/dev/env_sqllatest/lib/python3.10/site-packages/langchain_community/vectorstores/azuresearch.py", line 144, in _get_search_client
    vector_search = VectorSearch(
NameError: name 'VectorSearch' is not defined. Did you mean: 'vector_search'?

Description

My env spec: I am using the latest versions of LangChain and the Azure SDKs. Everything works, but when I try to create the index it gives me this error.

langchain==0.1.7
langchain-community==0.0.20
langchain-core==0.1.23
langchain-openai==0.0.6

azure-common==1.1.28
azure-core==1.30.0
azure-identity==1.15.0
azure-search-documents==11.4.0
azure-storage-blob==12.19.0

fastapi==0.109.2
uvicorn==0.27.1
python-dotenv==1.0.1
pandas==2.2.0

unstructured==0.12.4
python-docx==1.1.0
unstructured[pdf]

System Info

System Information

OS: Linux
OS Version: #18~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Feb 7 11:40:03 UTC 2
Python Version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]

Package Information

langchain_core: 0.1.23
langchain: 0.1.7
langchain_community: 0.0.20
langsmith: 0.0.87
langchain_openai: 0.0.6

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph
langserve

pip freeze output: aiohttp==3.9.3 aiosignal==1.3.1 annotated-types==0.6.0 antlr4-python3-runtime==4.9.3 anyio==4.3.0 async-timeout==4.0.3 attrs==23.2.0 azure-common==1.1.28 azure-core==1.30.0 azure-identity==1.15.0 azure-search-documents==11.4.0 azure-storage-blob==12.19.0 backoff==2.2.1 beautifulsoup4==4.12.3 certifi==2024.2.2 cffi==1.16.0 chardet==5.2.0 charset-normalizer==3.3.2 click==8.1.7 coloredlogs==15.0.1 contourpy==1.2.0 cryptography==42.0.3 cycler==0.12.1 dataclasses-json==0.6.4 dataclasses-json-speakeasy==0.5.11 Deprecated==1.2.14 distro==1.9.0 effdet==0.4.1 emoji==2.10.1 exceptiongroup==1.2.0 fastapi==0.109.2 filelock==3.13.1 filetype==1.2.0 flatbuffers==23.5.26 fonttools==4.49.0 frozenlist==1.4.1 fsspec==2024.2.0 greenlet==3.0.3 h11==0.14.0 httpcore==1.0.3 httpx==0.26.0 huggingface-hub==0.20.3 humanfriendly==10.0 idna==3.6 iopath==0.1.10 isodate==0.6.1 Jinja2==3.1.3 joblib==1.3.2 jsonpatch==1.33 jsonpath-python==1.0.6 jsonpointer==2.4 kiwisolver==1.4.5 langchain==0.1.7 langchain-community==0.0.20 langchain-core==0.1.23 langchain-openai==0.0.6 langdetect==1.0.9 langsmith==0.0.87 layoutparser==0.3.4 lxml==5.1.0 MarkupSafe==2.1.5 marshmallow==3.20.2 matplotlib==3.8.3 mpmath==1.3.0 msal==1.26.0 msal-extensions==1.1.0 multidict==6.0.5 mypy-extensions==1.0.0 networkx==3.2.1 nltk==3.8.1 numpy==1.26.4 nvidia-cublas-cu12==12.1.3.1 nvidia-cuda-cupti-cu12==12.1.105 nvidia-cuda-nvrtc-cu12==12.1.105 nvidia-cuda-runtime-cu12==12.1.105 nvidia-cudnn-cu12==8.9.2.26 nvidia-cufft-cu12==11.0.2.54 nvidia-curand-cu12==10.3.2.106 nvidia-cusolver-cu12==11.4.5.107 nvidia-cusparse-cu12==12.1.0.106 nvidia-nccl-cu12==2.19.3 nvidia-nvjitlink-cu12==12.3.101 nvidia-nvtx-cu12==12.1.105 omegaconf==2.3.0 onnx==1.15.0 onnxruntime==1.15.1 openai==1.12.0 opencv-python==4.9.0.80 packaging==23.2 pandas==2.2.0 pdf2image==1.17.0 pdfminer.six==20221105 pdfplumber==0.10.4 pikepdf==8.13.0 pillow==10.2.0 pillow_heif==0.15.0 portalocker==2.8.2 protobuf==4.25.3 pycocotools==2.0.7 pycparser==2.21 
pydantic==2.6.1 pydantic_core==2.16.2 PyJWT==2.8.0 pyparsing==3.1.1 pypdf==4.0.2 pypdfium2==4.27.0 pytesseract==0.3.10 python-dateutil==2.8.2 python-docx==1.1.0 python-dotenv==1.0.1 python-iso639==2024.2.7 python-magic==0.4.27 python-multipart==0.0.9 pytz==2024.1 PyYAML==6.0.1 rapidfuzz==3.6.1 regex==2023.12.25 requests==2.31.0 safetensors==0.4.2 scipy==1.12.0 six==1.16.0 sniffio==1.3.0 soupsieve==2.5 SQLAlchemy==2.0.27 starlette==0.36.3 sympy==1.12 tabulate==0.9.0 tenacity==8.2.3 tiktoken==0.6.0 timm==0.9.12 tokenizers==0.15.2 torch==2.2.0 torchvision==0.17.0 tqdm==4.66.2 transformers==4.37.2 triton==2.2.0 typing-inspect==0.9.0 typing_extensions==4.9.0 tzdata==2024.1 unstructured==0.12.4 unstructured-client==0.18.0 unstructured-inference==0.7.23 unstructured.pytesseract==0.3.12 urllib3==2.2.1 uvicorn==0.27.1 wrapt==1.16.0 yarl==1.9.4

dosubot[bot] commented 8 months ago

šŸ¤–

Hey @Farid-Ullah! Great to see you diving into the depths of LangChain and AzureSearch again. Hope this puzzle turns out to be more fun than frustrating! šŸš€

Based on the issues you've encountered and the context provided, it seems like you're facing a couple of problems related to the VectorSearch class and the vector_search_configuration parameter.

The NameError for VectorSearch occurs because the VectorSearch class is only imported when TYPE_CHECKING is True. This is a common pattern for avoiding circular imports in Python, but it means the VectorSearch class is not actually available at runtime. To fix this issue, move the import statement for VectorSearch outside of the if TYPE_CHECKING: block. Here's how you can do it:

```python
from azure.search.documents.indexes.models import VectorSearch

if TYPE_CHECKING:
    from azure.search.documents import SearchClient
    from azure.search.documents.indexes.models import (
        CorsOptions,
        ScoringProfile,
        SearchField,
        SemanticConfiguration,
    )
```

This way, VectorSearch will be imported regardless of whether type checking is being performed or not, and the NameError should no longer occur.
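
The pitfall can be reproduced with a minimal, self-contained sketch (using decimal.Decimal as a stand-in; this is not LangChain code):

```python
# Minimal illustration of why a name imported only under TYPE_CHECKING
# raises NameError at runtime.
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Visible to type checkers, but never executed at runtime.
    from decimal import Decimal

def make_value():
    # TYPE_CHECKING is False at runtime, so Decimal was never imported.
    return Decimal("1.5")

try:
    make_value()
except NameError as exc:
    print(type(exc).__name__)  # NameError
```

Moving the import out of the if TYPE_CHECKING: block makes the name available at runtime, which is exactly the fix proposed above for VectorSearch.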

Regarding the InvalidRequestParameter error, it seems like there was a change in the Azure SDK where the vector_search_configuration parameter was renamed to vector_search_profile_name. You should update your code to use vector_search_profile_name="myHnswProfile" instead of vector_search_configuration to specify the algorithm to use when searching the vector field.

These solutions were suggested in the following issues and pull requests:

Please try these solutions and let us know if you encounter any further issues.

Sources

#### About Dosu

This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a šŸ‘ if this is helpful and šŸ‘Ž if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.

Farid-Ullah commented 8 months ago

@dosubot I appreciate your effort, but I need a maintainer to look into this issue.

dosubot[bot] commented 8 months ago

šŸ¤–

I understand your concern, but as an AI language model, I don't have the ability to directly involve other team members. However, I can certainly help you with the issue you're facing. Please feel free to describe the problem, and I'll do my best to assist you.
