Azure ML Pipeline with V2 SDK

shishirdash24 commented 1 year ago

Microsoft previously suggested that we learn Azure ML Pipelines (using SDK V1) and then use them to build our models. The training path is at https://learn.microsoft.com/en-us/training/paths/build-ai-solutions-with-azure-ml-service/.

The process that we followed:

1. Azure Data Factory generates a new file every month for all the models and publishes it to Blob storage.
2. We created and published an Azure ML Pipeline for the models. It gets the data from Blob storage, preprocesses it, trains the model, registers the model, computes feature importance, and checks for data drift.
3. We then used the Azure ML Pipeline ID in a DevOps build pipeline so that the ML pipeline can be triggered through the build pipeline. (The build pipeline is triggered by an Azure Logic App whenever a new file is published to the Blob container.)
4. After the DevOps build pipeline completes, a release pipeline deploys the model to ACI and AKS.

Now we have been advised to use SDK V2 for all our model training and other processes. Please suggest how we can perform all of the above steps using SDK V2. The Microsoft documentation is too incomplete to answer this. As SDK V1 is now legacy, we are bound to move our code to V2, but the SDK V2 examples do not cover our scenario.

gcoyle83 commented 1 year ago

Yes, it would be great if all of the examples in this repo were updated to include how to do these things with the new SDK.

diondrapeck commented 1 year ago

Hi @shishirdash24 and @gcoyle83, examples for the new SDK can be found in the azureml-examples repository. Specifically, here's the link to the folder with pipeline notebook examples: https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines.
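For orientation, the steps described at the top of this thread could be expressed in V2 as a pipeline job spec and submitted from a DevOps step with the CLI. What follows is only a minimal sketch, assuming the `az ml` CLI extension is available on the build agent. It builds a Python dict that mirrors the shape of a V2 pipeline YAML file and the argument list for `az ml job create`. The compute name, environment name, script names, datastore path, and `pipeline.yml` are hypothetical placeholders, and only the preprocess and train steps are shown.

```python
import json

# Hypothetical names: replace with the compute and environment in your workspace.
COMPUTE = "azureml:cpu-cluster"
ENVIRONMENT = "azureml:training-env@latest"


def binding(scope: str, name: str) -> str:
    """Build a V2 binding expression such as ${{inputs.raw_data}}."""
    return "${{" + scope + "." + name + "}}"


def command_step(script: str, inputs: dict, output_name: str) -> dict:
    """Describe one command step that runs a (hypothetical) script from ./src."""
    args = [f"--{name} {binding('inputs', name)}" for name in inputs]
    args.append(f"--{output_name} {binding('outputs', output_name)}")
    return {
        "type": "command",
        "code": "./src",
        "command": f"python {script} " + " ".join(args),
        "environment": ENVIRONMENT,
        "inputs": dict(inputs),
        "outputs": {output_name: {"type": "uri_folder"}},
    }


def build_pipeline_spec(raw_data_uri: str) -> dict:
    """Return a dict shaped like a V2 pipeline job YAML with two chained steps."""
    return {
        "type": "pipeline",
        "display_name": "monthly-model-training",
        "settings": {"default_compute": COMPUTE},
        "inputs": {"raw_data": {"type": "uri_file", "path": raw_data_uri}},
        "jobs": {
            # The step reads the pipeline-level input.
            "preprocess": command_step(
                "preprocess.py",
                {"raw_data": binding("parent.inputs", "raw_data")},
                "prepared",
            ),
            # The step consumes the output of the preprocess step.
            "train": command_step(
                "train.py",
                {"prepared": binding("parent.jobs.preprocess.outputs", "prepared")},
                "model",
            ),
        },
    }


def az_submit_args(spec_file: str, resource_group: str, workspace: str) -> list:
    """Argument list for submitting the spec file with the Azure ML CLI."""
    return [
        "az", "ml", "job", "create",
        "--file", spec_file,
        "--resource-group", resource_group,
        "--workspace-name", workspace,
    ]


if __name__ == "__main__":
    spec = build_pipeline_spec(
        "azureml://datastores/workspaceblobstore/paths/monthly/data.csv"
    )
    print(json.dumps(spec, indent=2))
    print(" ".join(az_submit_args("pipeline.yml", "my-rg", "my-workspace")))
```

The list returned by `az_submit_args` could be passed to `subprocess.run` or copied into an Azure CLI task in the build pipeline. Whether submitting a job this way is an adequate substitute for a published V1 pipeline depends on the setup, so treat it as one option rather than the recommended path.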

shishirdash24 commented 1 year ago

Thank you for sharing all the examples, @diondrapeck. I went through many of them but couldn't find a single one that covers the following:

1. Getting the run context inside the job scripts (.py files), which lets a step connect to the workspace and the AmlCompute of the parent script. In the V1 examples, we were able to get the context of the parent script with the code below:

    from azureml.core import Run

    # Get context of the execution
    run = Run.get_context()

    # Find the connection to workspace
    workspace = run.experiment.workspace

2. Publishing a V2 pipeline. None of the examples shows how to do this. Our V1 ML pipelines are published and are used in a DevOps build pipeline to execute them automatically.
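On the first point, one possible V2 counterpart to `run.experiment.workspace` is to build an `MLClient` inside the job script. The sketch below assumes the job container exposes the workspace coordinates through the environment variables `AZUREML_ARM_SUBSCRIPTION`, `AZUREML_ARM_RESOURCEGROUP`, and `AZUREML_ARM_WORKSPACE_NAME`, and that the compute has an identity `DefaultAzureCredential` can use. Both are assumptions to verify in your own jobs, not a documented contract. `WorkspaceRef`, `workspace_ref_from_env`, and `get_ml_client` are hypothetical helper names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class WorkspaceRef:
    """Coordinates needed to address an Azure ML workspace."""
    subscription_id: str
    resource_group: str
    workspace_name: str


def workspace_ref_from_env(env: Optional[Mapping[str, str]] = None) -> WorkspaceRef:
    """Read the workspace coordinates from environment variables.

    Assumption: these variable names are set inside an Azure ML job container.
    A KeyError here means the script is running somewhere they are not set.
    """
    env = os.environ if env is None else env
    return WorkspaceRef(
        subscription_id=env["AZUREML_ARM_SUBSCRIPTION"],
        resource_group=env["AZUREML_ARM_RESOURCEGROUP"],
        workspace_name=env["AZUREML_ARM_WORKSPACE_NAME"],
    )


def get_ml_client(ref: WorkspaceRef):
    """Create an MLClient for the workspace the job is running in.

    The SDK is imported lazily so the helpers above work without it installed.
    Assumption: the compute's identity has access to the workspace.
    """
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    return MLClient(
        DefaultAzureCredential(),
        ref.subscription_id,
        ref.resource_group,
        ref.workspace_name,
    )
```

Inside a step script this would be used as `ml_client = get_ml_client(workspace_ref_from_env())`, after which the client can look up workspace assets such as registered models.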

Azure / MachineLearningNotebooks

Azure ML Pipeline with V2 SDK #1882
