This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.
Do not register named data asset when AzureML component fails #32578
Is your feature request related to a problem? Please describe.
In the current AzureML setup, when a pipeline step fails, its named outputs are still registered as data assets. Named data assets should not be registered when the job fails.
Describe the solution you'd like
I would like an option to register a named data asset only if the pipeline component finishes successfully.
Additional context
Assume I have a component that can fail during the execution:
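from pathlib import Path
import argparse
import pandas as pd

parser = argparse.ArgumentParser()
parser.add_argument("--output_data", dest="output_data", type=str)
# parse args
args = parser.parse_args()

# simulate a failure before any output is written
raise Exception("simulated component failure")

# save data (never reached because of the exception above)
df = pd.DataFrame()  # placeholder: the original snippet did not define df
df.to_csv(Path(args.output_data) / "out.csv")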
With the following component configuration:
When I execute the pipeline, the test-dataset-failed data asset will be registered even though it is empty:
The test-dataset-failed data asset was registered as empty with version = 1
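Until such an option exists, one possible workaround is to leave the component output unnamed, so nothing is registered automatically, and to register the data asset from the submitting script only after the job has succeeded. The sketch below shows only the gating logic. The helper name `register_if_completed` is hypothetical, and it assumes a successful job reports the status string "Completed". The commented wiring with `ml_client`, `job_name`, and `output_uri` is an untested assumption about how this could be connected to azure-ai-ml, not a confirmed recipe.

```python
from typing import Callable, Optional

# Assumption: "Completed" is the terminal status of a successful job.
SUCCESS_STATUS = "Completed"


def register_if_completed(status: str, register: Callable[[], object]) -> Optional[object]:
    """Call `register` only when the job finished successfully."""
    if status != SUCCESS_STATUS:
        # Failed or canceled job: skip registration so no empty asset is created.
        return None
    return register()


# Hypothetical wiring with azure-ai-ml (assumed, not executed here):
#   job = ml_client.jobs.get(job_name)
#   register_if_completed(
#       job.status,
#       lambda: ml_client.data.create_or_update(
#           Data(name="test-dataset", type="uri_folder", path=output_uri)
#       ),
#   )
```

Passing the registration step as a callable keeps the check independent of the SDK, so it can be exercised without a workspace.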