Azure / MachineLearningNotebooks

Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft
https://docs.microsoft.com/azure/machine-learning/service/
MIT License

Azure ML Pipeline with V2 SDK #1882

Open shishirdash24 opened 1 year ago

shishirdash24 commented 1 year ago

Previously, Microsoft suggested that we learn Azure ML Pipelines (using SDK V1) and then use them to build our models. The training link is https://learn.microsoft.com/en-us/training/paths/build-ai-solutions-with-azure-ml-service/.

The process that we followed:

  1. Azure Data Factory generates a new file every month for all the models and publishes it to Blob storage.
  2. We created and published an Azure ML Pipeline for the models, which gets the data from Blob storage, preprocesses it, trains the model, registers the model, computes feature importance, and checks for data drift.
  3. We then used the Azure ML Pipeline ID in a DevOps build pipeline so that the ML Pipeline can be triggered through the build pipeline. (The build pipeline is triggered by an Azure Logic App whenever a new file is published to the Blob container.)
  4. After the DevOps build pipeline completes, a release pipeline deploys the model to ACI and AKS.

We have now been advised to use SDK V2 for all our model training and other processes. Please suggest how we can perform all of the above steps using SDK V2. The Microsoft documentation is too incomplete to answer this. As SDK V1 is now legacy, we have to move our code to V2, but the SDK V2 examples do not cover these scenarios.
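
For step 3, one possible V2-era replacement for calling a published pipeline by ID is to have the DevOps build step submit the pipeline job definition through the Azure CLI `ml` extension (`az ml job create`). This is an assumption about a workable approach, not something confirmed in this thread. The sketch below only builds the command line; the file name `pipeline.yml`, the resource names, and the input name `raw_data` are hypothetical placeholders.

```python
import shlex
from typing import Dict, List, Optional


def build_job_submit_command(
    pipeline_yaml: str,
    resource_group: str,
    workspace_name: str,
    overrides: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Build the argv for `az ml job create` (Azure CLI ml extension, v2)."""
    cmd = [
        "az", "ml", "job", "create",
        "--file", pipeline_yaml,
        "--resource-group", resource_group,
        "--workspace-name", workspace_name,
    ]
    # Each override becomes a `--set path=value` pair, for example to point
    # a pipeline input at the file that Data Factory just published.
    for key, value in (overrides or {}).items():
        cmd += ["--set", f"{key}={value}"]
    return cmd


if __name__ == "__main__":
    # Hypothetical names: replace with your own YAML file and resources.
    argv = build_job_submit_command(
        "pipeline.yml",
        "my-resource-group",
        "my-workspace",
        {"inputs.raw_data.path": "azureml://datastores/blob/paths/latest.csv"},
    )
    # Print a shell-safe command that a build step could run.
    print(shlex.join(argv))
```

A build step could run the printed command (or pass the list to `subprocess.run`) after signing in with `az login` or a service connection. Whether this fits depends on how your Logic App and build pipeline are wired together.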

gcoyle83 commented 1 year ago

Yes, it would be great if all of the examples in this repo were updated to include how to do these things with the new SDK.

diondrapeck commented 1 year ago

Hi @shishirdash24 and @gcoyle83, examples for the new SDK can be found in the azureml-examples repository. Specifically, here's the link to the folder with pipeline notebook examples: https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines.

shishirdash24 commented 1 year ago

Thank you for sharing all the examples, @diondrapeck. I went through many of them but couldn't find a single one that covers the following:

  1. Getting the run context inside the job scripts (.py files), which allows a step to connect to the AmlCompute of the parent script. In the V1 examples, we could get the parent script's context with the code below:

    from azureml.core import Run

    # Get context of the execution
    run = Run.get_context()

    # Find the connection to workspace
    workspace = run.experiment.workspace

  2. None of the examples shows how to publish a V2 pipeline. Our V1 ML pipelines are published and are used in a DevOps build pipeline to run them automatically.
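
For the first point, a minimal sketch of one way a V2 job script might locate its workspace without `Run.get_context()`: reading the workspace coordinates from environment variables. The variable names below are an assumption about what the Azure ML job runtime sets; verify them in your own job logs before relying on them. `WorkspaceCoordinates` and `workspace_from_env` are hypothetical helper names.

```python
import os
from typing import Mapping, NamedTuple, Optional


class WorkspaceCoordinates(NamedTuple):
    subscription_id: str
    resource_group: str
    workspace_name: str


# Assumed names of the variables set inside an Azure ML job; check your logs.
_ENV_KEYS = (
    "AZUREML_ARM_SUBSCRIPTION",
    "AZUREML_ARM_RESOURCEGROUP",
    "AZUREML_ARM_WORKSPACE_NAME",
)


def workspace_from_env(
    env: Optional[Mapping[str, str]] = None,
) -> WorkspaceCoordinates:
    """Read workspace coordinates from the job's environment variables."""
    env = os.environ if env is None else env
    missing = [key for key in _ENV_KEYS if not env.get(key)]
    if missing:
        # Fail loudly when run outside a job or if the names differ.
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return WorkspaceCoordinates(*(env[key] for key in _ENV_KEYS))
```

Inside a job, the returned values could then be passed to `MLClient` from `azure.ai.ml` (as `subscription_id`, `resource_group_name`, and `workspace_name`) together with a credential that the compute is allowed to use. That part is left out here because it depends on how identity is configured for your compute.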