microsoft / promptflow

Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
https://microsoft.github.io/promptflow/
MIT License
9.49k stars 869 forks

[BUG] Failed to upload run to Azure AI Project #3773

Open xquyvu opened 1 month ago

xquyvu commented 1 month ago

**Describe the bug**
Failed to upload run to Azure AI Project

**How To Reproduce the bug**

I can reliably reproduce this with the following snippet:

import os

import dotenv
from promptflow.core import AzureOpenAIModelConfiguration
from promptflow.evals.evaluate import evaluate
from promptflow.evals.evaluators import (
    CoherenceEvaluator,
    GroundednessEvaluator,
    RelevanceEvaluator,
)

dotenv.load_dotenv()

# Initialize Azure OpenAI Connection with your environment variables
model_config = AzureOpenAIModelConfiguration(
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    azure_deployment=os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
    api_version=os.environ.get("AZURE_OPENAI_API_VERSION"),
    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
)

azure_ai_project = {
    "subscription_id": "REDACTED",
    "resource_group_name": "REDACTED",
    "project_name": "REDACTED",
}

# Initialize the evaluators
relevance_eval = RelevanceEvaluator(model_config)
coherence_eval = CoherenceEvaluator(model_config)
groundedness_eval = GroundednessEvaluator(model_config)

# Run the evaluators on the data file
result = evaluate(
    data='./evaluation/data.jsonl',
    evaluators={
        "relevance": relevance_eval,
        "coherence": coherence_eval,
        "groundedness": groundedness_eval,
    },
    azure_ai_project=azure_ai_project,
)

Here is the data.jsonl file:

{"question": "Which tent is the most waterproof?", "context": "From the our product list, the alpine explorer tent is the most waterproof. The Adventure Dining Table has higher weight.", "answer": "The Alpine Explorer Tent is the most waterproof."}
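(Not part of the repro, but useful to rule out a data problem: each line of the file can be validated with the standard library before running the evaluation. A minimal sketch; `load_jsonl` is an illustrative helper, not a promptflow API.)

```python
import json

def load_jsonl(path):
    """Parse a .jsonl file, raising with the offending line number on bad JSON."""
    rows = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                rows.append(json.loads(line))
            except json.JSONDecodeError as e:
                raise ValueError(f"{path}:{lineno} is not valid JSON: {e}") from e
    return rows
```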

Running with these 3 evaluators produces the following stack trace:

Traceback (most recent call last):
  File "<REDACTED>\.venv\Lib\site-packages\promptflow\azure\operations\_async_run_uploader.py", line 153, in upload
    results = await asyncio.gather(*tasks)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\promptflow\azure\operations\_async_run_uploader.py", line 251, in _upload_logs
    await self._upload_local_file_to_blob(local_file, remote_file, target_datastore=CloudDatastore.ARTIFACT)
  File "<REDACTED>\.venv\Lib\site-packages\promptflow\azure\operations\_async_run_uploader.py", line 427, in _upload_local_file_to_blob
    await self._upload_single_blob(blob_client, local_file, target_datastore)
  File "<REDACTED>\.venv\Lib\site-packages\promptflow\azure\operations\_async_run_uploader.py", line 439, in _upload_single_blob
    await blob_client.upload_blob(f, overwrite=self.overwrite)
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\tracing\decorator_async.py", line 94, in wrapper_use_tracer
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\storage\blob\aio\_blob_client_async.py", line 588, in upload_blob
    return cast(Dict[str, Any], await upload_block_blob(**options))
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\storage\blob\aio\_upload_helpers.py", line 85, in upload_block_blob
    response = cast(Dict[str, Any], await client.upload(
                                    ^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\tracing\decorator_async.py", line 94, in wrapper_use_tracer
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\storage\blob\_generated\aio\operations\_block_blob_operations.py", line 243, in upload
    pipeline_response: PipelineResponse = await self._client._pipeline.run(  # pylint: disable=protected-access
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 219, in run
    return await first_node.send(pipeline_request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 3 more times]
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\policies\_authentication_async.py", line 95, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\policies\_redirect_async.py", line 73, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\storage\blob\_shared\policies_async.py", line 114, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\storage\blob\_shared\policies_async.py", line 66, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 104, in send
    await self._sender.send(request.http_request, **request.context.options),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\storage\blob\_shared\base_client_async.py", line 268, in send
    return await self._transport.send(request, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\transport\_aiohttp.py", line 303, in send
    result = await self.session.request(  # type: ignore
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\aiohttp\client.py", line 657, in _request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\aiohttp\connector.py", line 564, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\aiohttp\connector.py", line 975, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\aiohttp\connector.py", line 1285, in _create_direct_connection
    sslcontext = await self._get_ssl_context(req)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\aiohttp\connector.py", line 1028, in _get_ssl_context
    return await self._make_or_get_ssl_context(True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\aiohttp\connector.py", line 1033, in _make_or_get_ssl_context
    return await self._made_ssl_context[verified]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Task <Task pending name='Task-47' coro=<AsyncRunUploader._upload_logs() running at <REDACTED>\.venv\Lib\site-packages\promptflow\azure\operations\_async_run_uploader.py:251> cb=[gather.<locals>._done_callback() at REDACTED\miniconda3\Lib\asyncio\tasks.py:764]> got Future <Future pending> attached to a different loop
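For context, the "attached to a different loop" RuntimeError is a general asyncio failure mode, not specific to promptflow: it occurs whenever an awaitable created on one event loop is awaited by a task running on another. A minimal sketch of the same error class (unrelated to the uploader internals):

```python
import asyncio

# Create a Future bound to one event loop...
async def make_future():
    return asyncio.get_running_loop().create_future()

fut = asyncio.run(make_future())  # loop #1 is closed once this returns

# ...then await it from a task on a different loop.
async def wait_on_it():
    await fut

caught = None
try:
    asyncio.run(wait_on_it())  # loop #2
except RuntimeError as e:
    caught = e

print(caught)  # a "... got Future ... attached to a different loop" message
```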

**Expected behavior**
No error raised, and the results can be viewed in Azure AI Studio.

**Screenshots**
N/A

**Running Information** (please complete the following information):

Executable '.venv\Scripts\python.exe' Python (Windows) 3.11.9 | packaged by Anaconda, Inc. | (main, Apr 19 2024, 16:40:41) [MSC v.1916 64 bit (AMD64)]

 - Operating System: Windows 11
 - Python Version using `python --version`: 3.11.9

**Additional context**

After this error is raised, trying to run the evaluation again in the REPL throws the following error:
```
ValueError: Failed to load data from ./evaluation/data.jsonl. Please validate it is a valid jsonl data. Error: Expected object or value.
```

After restarting the shell, the ValueError goes away (but then I get hit with the original error above again).

luigiw commented 1 month ago

The error indicates that the upload to the storage account failed. I'm wondering if this happens consistently?

The second error is a known issue: promptflow changes the current working directory (cwd) when an error occurs, so relative paths stop resolving.
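Until that is fixed, one possible client-side workaround (a sketch, not an official promptflow API) is to capture and restore the working directory around the `evaluate(...)` call:

```python
import contextlib
import os

@contextlib.contextmanager
def preserve_cwd():
    """Restore the current working directory even if the wrapped call raises,
    so relative paths like './evaluation/data.jsonl' keep resolving on a retry
    in the same REPL session."""
    original = os.getcwd()
    try:
        yield
    finally:
        os.chdir(original)

# Usage (with the evaluate(...) call from the repro above):
# with preserve_cwd():
#     result = evaluate(data='./evaluation/data.jsonl', evaluators=..., azure_ai_project=...)
```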

github-actions[bot] commented 3 weeks ago

Hi, we're sending this friendly reminder because we haven't heard back from you in 30 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 7 days of this comment, the issue will be automatically closed. Thank you!

xquyvu commented 3 weeks ago

@luigiw Hi, yes, this happens every time. In fact, I have never gotten this to work.