microsoft / promptflow

Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
https://microsoft.github.io/promptflow/
MIT License

[BUG] Failed to upload run to Azure AI Project #3773

Open xquyvu opened 5 days ago

xquyvu commented 5 days ago

**Describe the bug**
Failed to upload run to Azure AI Project

**How To Reproduce the bug**
Steps to reproduce the behavior, and how frequently you can experience the bug:

I can reliably reproduce this with the following snippet:

```python
import os

import dotenv
from promptflow.core import AzureOpenAIModelConfiguration
from promptflow.evals.evaluate import evaluate
from promptflow.evals.evaluators import (
    CoherenceEvaluator,
    GroundednessEvaluator,
    RelevanceEvaluator,
)

dotenv.load_dotenv()

# Initialize the Azure OpenAI connection with your environment variables
model_config = AzureOpenAIModelConfiguration(
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    azure_deployment=os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
    api_version=os.environ.get("AZURE_OPENAI_API_VERSION"),
    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
)

azure_ai_project = {
    "subscription_id": "REDACTED",
    "resource_group_name": "REDACTED",
    "project_name": "REDACTED",
}

# Initializing the evaluators
relevance_eval = RelevanceEvaluator(model_config)
coherence_eval = CoherenceEvaluator(model_config)
groundedness_eval = GroundednessEvaluator(model_config)

# Running the three evaluators on a single input row and uploading the results
result = evaluate(
    data='./evaluation/data.jsonl',
    evaluators={
        "relevance": relevance_eval,
        "coherence": coherence_eval,
        "groundedness": groundedness_eval,
    },
    azure_ai_project=azure_ai_project,
)
```

Here is the `data.jsonl` file:

```json
{"question": "Which tent is the most waterproof?", "context": "From the our product list, the alpine explorer tent is the most waterproof. The Adventure Dining Table has higher weight.", "answer": "The Alpine Explorer Tent is the most waterproof."}
```

I have 3 evaluators, and running them produces the following stack trace:

```
Traceback (most recent call last):
  File "<REDACTED>\.venv\Lib\site-packages\promptflow\azure\operations\_async_run_uploader.py", line 153, in upload
    results = await asyncio.gather(*tasks)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\promptflow\azure\operations\_async_run_uploader.py", line 251, in _upload_logs
    await self._upload_local_file_to_blob(local_file, remote_file, target_datastore=CloudDatastore.ARTIFACT)
  File "<REDACTED>\.venv\Lib\site-packages\promptflow\azure\operations\_async_run_uploader.py", line 427, in _upload_local_file_to_blob
    await self._upload_single_blob(blob_client, local_file, target_datastore)
  File "<REDACTED>\.venv\Lib\site-packages\promptflow\azure\operations\_async_run_uploader.py", line 439, in _upload_single_blob
    await blob_client.upload_blob(f, overwrite=self.overwrite)
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\tracing\decorator_async.py", line 94, in wrapper_use_tracer
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\storage\blob\aio\_blob_client_async.py", line 588, in upload_blob
    return cast(Dict[str, Any], await upload_block_blob(**options))
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\storage\blob\aio\_upload_helpers.py", line 85, in upload_block_blob
    response = cast(Dict[str, Any], await client.upload(
                                    ^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\tracing\decorator_async.py", line 94, in wrapper_use_tracer
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\storage\blob\_generated\aio\operations\_block_blob_operations.py", line 243, in upload
    pipeline_response: PipelineResponse = await self._client._pipeline.run(  # pylint: disable=protected-access
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 219, in run
    return await first_node.send(pipeline_request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 3 more times]
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\policies\_authentication_async.py", line 95, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\policies\_redirect_async.py", line 73, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\storage\blob\_shared\policies_async.py", line 114, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\storage\blob\_shared\policies_async.py", line 66, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\_base_async.py", line 104, in send
    await self._sender.send(request.http_request, **request.context.options),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\storage\blob\_shared\base_client_async.py", line 268, in send
    return await self._transport.send(request, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\azure\core\pipeline\transport\_aiohttp.py", line 303, in send
    result = await self.session.request(  # type: ignore
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\aiohttp\client.py", line 657, in _request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\aiohttp\connector.py", line 564, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\aiohttp\connector.py", line 975, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\aiohttp\connector.py", line 1285, in _create_direct_connection
    sslcontext = await self._get_ssl_context(req)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\aiohttp\connector.py", line 1028, in _get_ssl_context
    return await self._make_or_get_ssl_context(True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<REDACTED>\.venv\Lib\site-packages\aiohttp\connector.py", line 1033, in _make_or_get_ssl_context
    return await self._made_ssl_context[verified]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Task <Task pending name='Task-47' coro=<AsyncRunUploader._upload_logs() running at <REDACTED>\.venv\Lib\site-packages\promptflow\azure\operations\_async_run_uploader.py:251> cb=[gather.<locals>._done_callback() at REDACTED\miniconda3\Lib\asyncio\tasks.py:764]> got Future <Future pending> attached to a different loop
```
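
For context, this RuntimeError generally means a Future created on one event loop is awaited from another. The last frame is aiohttp's `_made_ssl_context` cache, so my guess (an assumption, I have not traced the promptflow code) is that the uploader ends up on a different event loop than the one on which aiohttp cached that Future. A minimal sketch, independent of promptflow, that reproduces the same error:

```python
import asyncio

cache = {}

async def create_and_cache():
    # Runs on loop #1: create a Future bound to this loop and cache it,
    # much like aiohttp caches its SSL context in a per-connector Future.
    cache["fut"] = asyncio.get_running_loop().create_future()

async def await_cached():
    # Runs on loop #2: awaiting a Future that belongs to loop #1 fails with
    # "... got Future <Future pending> attached to a different loop".
    await cache["fut"]

asyncio.run(create_and_cache())  # first event loop creates and caches the Future
asyncio.run(await_cached())      # second event loop awaits it -> RuntimeError
```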

**Expected behavior**
No error is raised, and the results can be viewed in Azure AI Studio.

**Screenshots**
N/A

**Running Information (please complete the following information):**

 - Executable: `.venv\Scripts\python.exe`, Python 3.11.9 (Windows) | packaged by Anaconda, Inc. | (main, Apr 19 2024, 16:40:41) [MSC v.1916 64 bit (AMD64)]
 - Operating System: Windows 11
 - Python Version using `python --version`: 3.11.9

**Additional context**

After this error is raised, trying to run the evaluation again in the REPL throws the following error:
```python
ValueError: Failed to load data from ./evaluation/data.jsonl. Please validate it is a valid jsonl data. Error: Expected object or value.
```

After restarting the shell, the ValueError goes away (but then I get hit with the error above again).

luigiw commented 3 days ago

The error indicates the upload to the storage account failed. I am wondering if this happens consistently?

The second error is a known issue: promptflow changes the cwd when it hits an error, which breaks the relative data path on the next run.
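
If the cwd change is the culprit, a possible workaround until it is fixed (a sketch only, reusing `relevance_eval` and `azure_ai_project` from the repro snippet above) is to pass an absolute data path and restore the cwd after the call:

```python
import os
from pathlib import Path

from promptflow.evals.evaluate import evaluate

# Resolve the data path up front so a later cwd change cannot break it.
data_path = str(Path("./evaluation/data.jsonl").resolve())

cwd_before = os.getcwd()
try:
    result = evaluate(
        data=data_path,
        evaluators={"relevance": relevance_eval},  # defined in the repro snippet
        azure_ai_project=azure_ai_project,         # defined in the repro snippet
    )
finally:
    # Undo any working-directory change left behind by a failed run.
    os.chdir(cwd_before)
```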