Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.
MIT License

Do not register named data asset when AzureML component fails #32578

Open grzjab opened 10 months ago

grzjab commented 10 months ago

Is your feature request related to a problem? Please describe.
In the current AzureML setup, named pipeline outputs are still registered as data assets when the pipeline step that produces them fails. Named data assets should not be registered when the job fails.

Describe the solution you'd like
I would like an option to register a named data asset only if the pipeline component finished successfully.

Additional context
Assume I have a component that can fail during execution:

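from pathlib import Path
import argparse
import pandas as pd

parser = argparse.ArgumentParser()
parser.add_argument("--output_data", dest="output_data", type=str)
# parse args
args = parser.parse_args()

# simulate a failure before any output is written
raise Exception("simulated component failure")

# save data (never reached; df is placeholder data, the original snippet did not define it)
df = pd.DataFrame({"value": [1, 2, 3]})
df.to_csv(Path(args.output_data) / "out.csv")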

With the following component configuration:

$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
name: fail_step
display_name: Fail
version: 1
type: command
outputs:
  output_data:
    type: uri_folder
code: ./src
environment: azureml:AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest
command: >-
  python fail.py 
  --output_data ${{outputs.output_data}}

When I execute the pipeline, the test-dataset-failed data asset will be registered even though it is empty:

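from azure.ai.ml import MLClient, load_component
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential

# assumption: workspace details are read from a local config.json
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

fail = load_component("fail.yml")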

@pipeline()
def failing_pipeline():
    fail_step = fail()

    return {
        "output_data": fail_step.outputs.output_data,
    }

pipeline_job = failing_pipeline()

# change the output mode and register the output under a name
pipeline_job.outputs.output_data.mode = "upload"
pipeline_job.outputs.output_data.name = "test-dataset-failed"

# submit job to workspace
pipeline_job = ml_client.jobs.create_or_update(
    pipeline_job, experiment_name="pipeline_test_fail"
)
pipeline_job

The test-dataset-failed data asset was registered as version 1 even though it is empty.
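
Until such an option exists, one possible workaround is to leave the pipeline output unnamed and register the data asset manually once the job has completed. The sketch below is only an illustration under assumptions: `register_output_if_completed`, `job_output_uri` and `make_asset` are hypothetical names, the job status string "Completed" and the `azureml://jobs/<job_name>/outputs/<output_name>/paths/` URI form are assumed to apply to this pipeline output, and the client is assumed to expose `jobs.get` and `data.create_or_update` as `MLClient` does.

```python
from typing import Any, Callable, Optional


def job_output_uri(job_name: str, output_name: str) -> str:
    # Assumed URI form for a job output; verify it against your workspace.
    return f"azureml://jobs/{job_name}/outputs/{output_name}/paths/"


def register_output_if_completed(
    ml_client: Any,
    job_name: str,
    output_name: str,
    make_asset: Callable[[str], Any],
) -> Optional[Any]:
    """Register a job output as a data asset only if the job completed.

    `make_asset` (hypothetical helper) builds the asset entity from the
    output URI, so this function does not depend on a specific SDK class.
    Returns the registered asset, or None when nothing was registered.
    """
    job = ml_client.jobs.get(job_name)
    if job.status != "Completed":
        # Failed, canceled, or still running: skip registration.
        return None
    asset = make_asset(job_output_uri(job_name, output_name))
    return ml_client.data.create_or_update(asset)
```

With the real SDK, `make_asset` could be something like `lambda path: Data(name="test-dataset", path=path, type=AssetTypes.URI_FOLDER)`, called after `ml_client.jobs.stream(pipeline_job.name)` has returned. This moves the registration decision to the client side, so it does not replace the requested service-side option.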

github-actions[bot] commented 10 months ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github @Azure/azure-ml-sdk.

catalinaperalta commented 10 months ago

Thanks for reaching out @grzjab! Looping in @azureml-github to investigate.