microsoft / promptflow

Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
https://microsoft.github.io/promptflow/
MIT License
8.66k stars 771 forks

[BUG] AI Search connection in `mlindex_content` not detected #3313

Closed adamdougal closed 1 day ago

adamdougal commented 1 month ago

Describe the bug I have a Multi-Round Q&A on Your Data chat flow which queries Azure AI Search and passes the results, along with the chat history and question, to Azure OpenAI. After exporting the flow with the intention of deploying it to Azure App Service, I am unable to successfully run the built flow (either as an executable or as a Docker image). I believe this is because the Azure AI Search connection is being ignored.

When running pf flow serve the flow runs correctly and I am able to get responses.

I have tried this with prompt flow versions 1.9.0, 1.10.0 and 1.11.0.

How To Reproduce the bug Steps to reproduce the behavior:

  1. Create a Multi-Round Q&A on your data chat flow.
  2. Export the flow locally
  3. Install the requirements pip install -r requirements.txt
  4. Create the AI Search and OpenAI connection files
  5. Add the connections to prompt flow pf connection create -f <file>
  6. Build the flow pf flow build --source . --output dist-docker --format docker
  7. Inspect the dist-docker/connections directory and observe that the AI Search connection is missing

Expected behavior The AI Search connection is picked up and a connection file is created.
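The check in the last repro step can be scripted by diffing the connections the flow names against the files `pf flow build` actually emitted. This is a hedged sketch: the `dist-docker/connections/*.yaml` layout is assumed from the build output above, and the helper name is mine.

```python
# Hypothetical helper: report flow connections that `pf flow build` did not
# export into the build output's connections/ folder.
from pathlib import Path

def missing_connections(dist_dir: str, expected: set[str]) -> set[str]:
    """Return the expected connection names that have no YAML file under
    <dist_dir>/connections (assumes one <name>.yaml file per connection)."""
    found = {p.stem for p in Path(dist_dir, "connections").glob("*.yaml")}
    return expected - found

# e.g. missing_connections("dist-docker", {"openai_connection", "aisearch_connection"})
# would report {"aisearch_connection"} when only the OpenAI file was written.
```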


Additional context Example files:

A shortened and redacted example prompt flow

```yaml
id: bring_your_own_data_chat_qna
name: Bring Your Own Data Chat QnA
inputs:
  chat_history:
    type: list
    default:
    - inputs:
        chat_input: Hi
      outputs:
        chat_output: Hello! How can I assist you today?
    - inputs:
        chat_input: What is Azure compute instance?
      outputs:
        chat_output: An Azure Machine Learning compute instance is a fully managed
          cloud-based workstation for data scientists. It provides a pre-configured
          and managed development environment in the cloud for machine learning.
          Compute instances can also be used as a compute target for training and
          inferencing for development and testing purposes. They have a job queue,
          run jobs securely in a virtual network environment, and can run multiple
          small jobs in parallel. Additionally, compute instances support
          single-node multi-GPU distributed training jobs.
    is_chat_input: false
    is_chat_history: true
  chat_input:
    type: string
    default: How can I create one using azureml sdk V2?
    is_chat_input: true
outputs:
  chat_output:
    type: string
    reference: ${chat_with_context.output}
    is_chat_output: true
nodes:
- name: index_lookup
  type: python
  source:
    type: package
    tool: promptflow_vectordb.tool.common_index_lookup.search
  inputs:
    mlindex_content: >
      embeddings:
        api_base: https://.openai.azure.com/
        api_type: azure
        api_version: '2024-02-01'
        batch_size: '1'
        connection:
          id: /subscriptions//resourceGroups//providers/Microsoft.MachineLearningServices/workspaces//connections/openai_connection
        connection_type: workspace_connection
        deployment: text-embedding-ada-002
        dimension: 1536
        kind: open_ai
        model: text-embedding-ada-002
        schema_version: '2'
      index:
        api_version: '2023-11-01'
        connection:
          id: /subscriptions//resourceGroups//providers/Microsoft.MachineLearningServices/workspaces//connections/aisearch_connection
        connection_type: workspace_connection
        endpoint: https://.search.windows.net
        engine: azure-sdk
        field_mapping:
          content: content
          embedding: content_vector
          metadata: metadata
        index:
        kind: acs
        semantic_configuration_name: default
    queries: ${inputs.chat_input}
    query_type: Hybrid (vector + keyword)
    top_k: 2
  use_variants: false
- name: chat_with_context
  type: llm
  source:
    type: code
    path: chat_with_context.jinja2
  inputs:
    deployment_name: gpt-35-turbo-16k
    temperature: 0
    top_p: 1
    max_tokens: 1000
    presence_penalty: 0
    frequency_penalty: 0
    prompt_text: ${Prompt_variants.output}
  provider: AzureOpenAI
  connection: openai_connection
  api: chat
  module: promptflow.tools.aoai
  use_variants: false
node_variants: {}
environment:
  python_requirements_txt: requirements.txt
```
AI Search Connection

```yaml
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/CognitiveSearchConnection.schema.json
name: aisearch_connection
type: cognitive_search
api_key: ${env:AISEARCH_CONNECTION_API_KEY}
api_base: "https://.search.windows.net"
api_version: "2023-03-15-preview"
```
adamdougal commented 1 month ago

I've been playing around with this test with my flow.dag.yml, and it looks like where it falls down is flow.py:get_connection_names(), which does not inspect the mlindex_content.

I'm not sure if the code has changed to break this, if the generated flow.dag.yml interface has changed or perhaps this has never worked? Or maybe I'm doing something wrong!
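If get_connection_names() only looks at node-level connection fields, the two connections embedded in the mlindex_content YAML string would indeed be invisible to it. A minimal sketch of the traversal that appears to be missing; the function name is hypothetical, and the input is shown pre-parsed into a dict (the real value is a YAML string) to keep the example dependency-free:

```python
# Hypothetical sketch: pull the connection names referenced inside an
# index_lookup node's mlindex_content. Workspace connection ARM ids
# end with /connections/<name>.

def connection_names_from_mlindex(mlindex: dict) -> set[str]:
    """Collect connection names from the embeddings and index sections."""
    names = set()
    for section in ("embeddings", "index"):
        conn_id = mlindex.get(section, {}).get("connection", {}).get("id", "")
        if "/connections/" in conn_id:
            names.add(conn_id.rsplit("/connections/", 1)[1])
    return names

mlindex = {
    "embeddings": {"connection": {"id": ".../connections/openai_connection"}},
    "index": {"connection": {"id": ".../connections/aisearch_connection"}},
}
print(sorted(connection_names_from_mlindex(mlindex)))
# → ['aisearch_connection', 'openai_connection']
```

If the builder merged a set like this into the result of get_connection_names(), the AI Search connection file would presumably be exported alongside the OpenAI one.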

adamdougal commented 1 month ago

Looks like there has been a change to this recently https://learn.microsoft.com/en-us/azure/machine-learning/prompt-flow/tools-reference/index-lookup-tool?view=azureml-api-2#how-to-migrate-from-legacy-tools-to-the-index-lookup-tool.

pgr-lopes commented 1 month ago

Pretty sure you hit the same problem I did: https://github.com/microsoft/promptflow/issues/2876

You have to set the account metadata manually; for whatever reason the activity does not take the config.json file into consideration:

az login
az account set --subscription 
az configure --defaults group= workspace=

adamdougal commented 1 month ago

I've hardcoded a connection in the flow.dag.yml which has got me further.

I am now getting a response of:

{"error":{"code":"UserError","message":"Execution failure in 'index_lookup': (Exception) Exception occured in search_function_construction."}}

I have also deployed this to an AML endpoint using the deploy button in AML and get the same error there.

As it stands, from what I can tell, using Azure AI Search with Prompt Flow is currently unusable unless invoked from AML Studio.

brynn-code commented 1 month ago

The response doesn't show the full error reason; you can find the error details in the App Service container logs. If the error is caused by a missing connection, I can explain more about how connections work when deploying to App Service.

When deploying to Azure App Service, promptflow uses connections locally (here "locally" means the connection metadata stored in the local SQLite database). I took a look at your flow; there are 2 connections, the AI Search one and the OpenAI one. If you want to use Azure AI connections stored in the Azure AI project, please set the connection provider config so promptflow fetches Azure AI connections (you may need to add a command to the container startup script; also remember to grant the App Service the Reader role on your AI project so it can access the connection keys). Refer to here for the connection config.

brynn-code commented 1 month ago

Usually we don't assume the user is deploying to App Service with Azure AI connections, as that causes many problems (permissions, service principal roles, and so on), so by default we guide the user to set up connections again locally for their App Service by setting App Service environment variables. The local setup guide is available in the following documentation: https://microsoft.github.io/promptflow/cloud/azureai/deploy-to-azure-appservice.html#view-and-test-the-web-app

adamdougal commented 1 month ago

Heya, thanks for your response! Unfortunately, even after setting the connection.provider=local setting, it does not pick up the AI Search connection unless I manually add it to the flow.dag.yaml.

Regarding the Exception occured in search_function_construction error: I get this both locally and when the flow is deployed as an AML endpoint. Here is the exception from the logs:

[2024-05-21 07:14:14,056][flowinvoker][ERROR] - Flow run failed with error: {'message': "Execution failure in 'index_lookup': (Exception) Exception occured in search_function_construction.", 'messageFormat': "Execution failure in '{node_name}'.", 'messageParameters': {'node_name': 'index_lookup'}, 'referenceCode': 'Tool/promptflow_vectordb.tool.common_index_lookup', 'code': 'UserError', 'innerError': {'code': 'ToolExecutionError', 'innerError': None}, 'additionalInfo': [{'type': 'ToolExecutionErrorDetails', 'info': {'type': 'Exception', 'message': 'Exception occured in search_function_construction.', 'traceback': 'Traceback (most recent call last):
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/utils/profiling.py", line 18, in measure_execution_time
    yield
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup.py", line 54, in _get_search_func
    search_func = build_search_func(index, top_k, query_type)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup_extensions/utils.py", line 37, in build_search_func
    store = index.as_langchain_vectorstore()
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/azureml/rag/mlindex.py", line 212, in as_langchain_vectorstore
    return azuresearch.AzureSearch(
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/langchain_community/vectorstores/azuresearch.py", line 268, in __init__
    self.client = _get_search_client(
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/langchain_community/vectorstores/azuresearch.py", line 84, in _get_search_client
    from azure.search.documents.indexes.models import (
ImportError: cannot import name \'ExhaustiveKnnAlgorithmConfiguration\' from \'azure.search.documents.indexes.models\' (/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/azure/search/documents/indexes/models/__init__.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/tracing/_trace.py", line 470, in wrapped
    output = func(*args, **kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/core/logging/utils.py", line 98, in wrapper
    res = func(*args, **kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup.py", line 125, in search
    search_func = _get_search_func(mlindex_content, top_k, query_type)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup.py", line 54, in _get_search_func
    search_func = build_search_func(index, top_k, query_type)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/utils/profiling.py", line 21, in measure_execution_time
    raise Exception(error_msg) from e
Exception: Exception occured in search_function_construction.
', 'filename': '/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/utils/profiling.py', 'lineno': 21, 'name': 'measure_execution_time'}}], 'debugInfo': {'type': 'ToolExecutionError', 'message': "Execution failure in 'index_lookup': (Exception) Exception occured in search_function_construction.", 'stackTrace': '
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/flow_executor.py", line 1008, in _exec
    output, aggregation_inputs = self._exec_inner_with_trace(
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/flow_executor.py", line 913, in _exec_inner_with_trace
    output, nodes_outputs = self._traverse_nodes(inputs, context)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/flow_executor.py", line 1189, in _traverse_nodes
    nodes_outputs, bypassed_nodes = self._submit_to_scheduler(context, inputs, batch_nodes)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/flow_executor.py", line 1244, in _submit_to_scheduler
    return scheduler.execute(self._line_timeout_sec)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/_flow_nodes_scheduler.py", line 131, in execute
    raise e
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/_flow_nodes_scheduler.py", line 113, in execute
    self._dag_manager.complete_nodes(self._collect_outputs(completed_futures))
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/_flow_nodes_scheduler.py", line 160, in _collect_outputs
    each_node_result = each_future.result()
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/_flow_nodes_scheduler.py", line 181, in _exec_single_node_in_thread
    result = context.invoke_tool(node, f, kwargs=kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/_core/flow_execution_context.py", line 90, in invoke_tool
    result = self._invoke_tool_inner(node, f, kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/_core/flow_execution_context.py", line 206, in _invoke_tool_inner
    raise ToolExecutionError(node_name=node_name, module=module) from e
', 'innerException': {'type': 'Exception', 'message': 'Exception occured in search_function_construction.', 'stackTrace': '
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/_core/flow_execution_context.py", line 182, in _invoke_tool_inner
    return f(**kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/tracing/_trace.py", line 470, in wrapped
    output = func(*args, **kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/core/logging/utils.py", line 98, in wrapper
    res = func(*args, **kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup.py", line 125, in search
    search_func = _get_search_func(mlindex_content, top_k, query_type)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup.py", line 54, in _get_search_func
    search_func = build_search_func(index, top_k, query_type)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/utils/profiling.py", line 21, in measure_execution_time
    raise Exception(error_msg) from e
', 'innerException': {'type': 'ImportError', 'message': "cannot import name 'ExhaustiveKnnAlgorithmConfiguration' from 'azure.search.documents.indexes.models' (/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/azure/search/documents/indexes/models/__init__.py)", 'stackTrace': 'Traceback (most recent call last):
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/utils/profiling.py", line 18, in measure_execution_time
    yield
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup.py", line 54, in _get_search_func
    search_func = build_search_func(index, top_k, query_type)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup_extensions/utils.py", line 37, in build_search_func
    store = index.as_langchain_vectorstore()
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/azureml/rag/mlindex.py", line 212, in as_langchain_vectorstore
    return azuresearch.AzureSearch(
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/langchain_community/vectorstores/azuresearch.py", line 268, in __init__
    self.client = _get_search_client(
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/langchain_community/vectorstores/azuresearch.py", line 84, in _get_search_client
    from azure.search.documents.indexes.models import (
', 'innerException': None}}}}
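The innermost error in the trace is an ImportError: langchain_community's azuresearch module imports ExhaustiveKnnAlgorithmConfiguration, a model that only exists in newer azure-search-documents releases (the >= 11.4.0 bound below is my assumption, not something stated in the logs). A hedged probe for checking whether a given environment is affected:

```python
# Probe whether the installed azure-search-documents exposes the model that
# langchain_community's azuresearch module tries to import. If it does not,
# the index_lookup node fails exactly as in the trace above.

def supports_exhaustive_knn() -> bool:
    try:
        from azure.search.documents.indexes.models import (  # noqa: F401
            ExhaustiveKnnAlgorithmConfiguration,
        )
        return True
    except ImportError:
        return False

if not supports_exhaustive_knn():
    # Assumed remedy: pin a newer SDK in the flow's requirements.txt,
    # e.g. azure-search-documents>=11.4.0
    print("azure-search-documents is missing or too old for langchain_community")
```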
github-actions[bot] commented 1 week ago

Hi, we're sending this friendly reminder because we haven't heard back from you in 30 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 7 days of this comment, the issue will be automatically closed. Thank you!