Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.
MIT License

Dataset lineage in AML #32014

Open ChrisMolanus opened 1 year ago

ChrisMolanus commented 1 year ago

It is not clear how to use the Python API to get the AML run that produced a dataset. This data exists, since it can be viewed in the UI.

Ideally this lineage could be traced all the way back to the first job and dataset in the DAG of jobs.

This would be used to trace the root cause of anomalies or errors back to the earliest dataset where they should have been detected.
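To illustrate the kind of traversal being asked for, here is a minimal sketch. It does not use any Azure ML API. It assumes the lineage information (which job produced each dataset, and which datasets each job consumed) has somehow been obtained as plain dictionaries. All names below (`produced_by`, `job_inputs`, `trace_lineage`, and the dataset and job names) are hypothetical.

```python
from collections import deque

# Hypothetical, hand-built lineage records (NOT returned by any SDK call).
# dataset name -> name of the job that produced it
produced_by = {
    "features_v1": "job_featurize",
    "clean_v1": "job_clean",
}
# job name -> names of the datasets it consumed
job_inputs = {
    "job_featurize": ["clean_v1"],
    "job_clean": ["raw_v1"],
}


def trace_lineage(dataset, produced_by, job_inputs):
    """Walk upstream from `dataset`, breadth-first.

    Returns a list of (dataset, producing_job, input_dataset) triples,
    ordered from the starting dataset back towards the root datasets.
    """
    steps = []
    seen = {dataset}
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        job = produced_by.get(current)
        if job is None:
            continue  # no known producer: treat as a root dataset
        for parent in job_inputs.get(job, []):
            steps.append((current, job, parent))
            if parent not in seen:  # visit each dataset only once
                seen.add(parent)
                queue.append(parent)
    return steps


lineage = trace_lineage("features_v1", produced_by, job_inputs)
for child, job, parent in lineage:
    print(f"{child} <- {job} <- {parent}")
```

With lineage exposed through the Python API, a walk like this could stop at the first upstream dataset that already shows the anomaly.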

github-actions[bot] commented 1 year ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github @Azure/azure-ml-sdk.

hugoaponte commented 1 year ago

Thanks for sharing this feedback.

To make sure I follow, your feature request is to provide a method in the Python AzureML SDK to retrieve the run that produced a data asset.

If so, you are right: this data is available in the AzureML Studio portal.

ChrisMolanus commented 1 year ago

That would be awesome!

hugoaponte commented 1 year ago

Got it. I filed a feature request which should be considered in our next planning cycle.

To set the expectation, it may take some time to be delivered.

Again, thanks a lot for the feedback.

ChrisMolanus commented 10 months ago

So........?