Azure / MachineLearningNotebooks

Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft
https://docs.microsoft.com/azure/machine-learning/service/
MIT License

How to submit Dataset Input as a Parameter to AZ ML CLI run submit-pipeline command? #1420

Open anirbansaha96 opened 3 years ago

anirbansaha96 commented 3 years ago

Refers to: https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-showcasing-dataset-and-pipelineparameter.ipynb

To pass parameters to a pipeline with the az ml run submit-pipeline CLI command, we use the syntax:

az ml run submit-pipeline --datapaths [DataPATHS Name=datastore/datapath] --experiment-name [Experiment_Name] --parameters [String_parameters Name=Value] --pipeline-id [ID] --resource-group [RGP] --subscription-id [SUB_ID] --workspace-name [AML_WS_NAME]

This submits datapaths and string parameters with the pipeline. How do we submit Dataset references using the az ml run submit-pipeline command?

For example, in the documentation notebook aml-pipelines-showcasing-dataset-and-pipelineparameter, a Dataset reference is submitted through the SDK like this:

from azureml.core import Dataset  # experiment and pipeline are defined earlier in the notebook
iris_tabular_ds = Dataset.Tabular.from_delimited_files('link/iris.csv')
pipeline_run_with_params = experiment.submit(pipeline, pipeline_parameters={'tabular_ds_param': iris_tabular_ds})

Using a REST call, the syntax is:

response = requests.post(rest_endpoint, 
                         headers=aad_token, 
                         json={"ExperimentName": "MyRestPipeline",
                               "RunSource": "SDK",
                               "DataSetDefinitionValueAssignments": { "tabular_ds_param": {"SavedDataSetReference": {"Id": iris_tabular_ds.id}}}
                              }
                        )
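For a pipeline with more than one dataset parameter, the request body above could be generated by a small helper. This is only a sketch: build_submit_body is a hypothetical name, not part of the Azure ML SDK, and it assumes every parameter is assigned a registered dataset by ID, using the same keys as the request above.

```python
import json


def build_submit_body(experiment_name, dataset_ids):
    """Build the JSON body for a published-pipeline REST submission.

    dataset_ids maps pipeline parameter names to saved dataset IDs.
    The key names mirror the request shown above; the helper itself is
    hypothetical and not part of the Azure ML SDK.
    """
    assignments = {
        param: {"SavedDataSetReference": {"Id": dataset_id}}
        for param, dataset_id in dataset_ids.items()
    }
    return {
        "ExperimentName": experiment_name,
        "RunSource": "SDK",
        "DataSetDefinitionValueAssignments": assignments,
    }


# Placeholder ID; in the notebook this would be iris_tabular_ds.id.
body = build_submit_body("MyRestPipeline", {"tabular_ds_param": "<dataset-id>"})
print(json.dumps(body, indent=2))
```

The returned dict can be passed as the json argument of requests.post in place of the literal shown above.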

What is the syntax to achieve this using az ml cli?

I tried the following, but it does not work:

az ml run submit-pipeline --datapaths tabular_ds_param=[datastore]/[registered-dataset] --experiment-name [exp-name]-exp --pipeline-id [pipeline-id] --resource-group $(AML_RG) --subscription-id $(AML_SUB_ID) --workspace-name $(AML_WS)

anirbansaha96 commented 3 years ago

The workaround I use is to call the pipeline's REST endpoint directly with curl:

curl -X POST [Pipeline_REST_Endpoint] -H "Authorization: Bearer $(az account get-access-token --query accessToken -o tsv)" -H "Content-Type: application/json" --data-binary @- <<DATA
{
  "ExperimentName": "[Experiment_NAME]",
  "RunSource": "SDK",
  "DataSetDefinitionValueAssignments": {
    "tabular_ds_param": {
      "SavedDataSetReference": {"Id": "[Dataset_ID]"}
    }
  }
}
DATA
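The same workaround could also be scripted in Python. The sketch below uses only the standard library and assumes the Azure CLI is installed and logged in. get_access_token and build_pipeline_request are hypothetical helper names, and the endpoint and dataset ID are placeholders to fill in.

```python
import json
import subprocess
import urllib.request


def get_access_token():
    """Fetch an AAD token with the same az command the curl call uses."""
    result = subprocess.run(
        ["az", "account", "get-access-token",
         "--query", "accessToken", "-o", "tsv"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_pipeline_request(endpoint, token, experiment_name, param_name, dataset_id):
    """Build (but do not send) the POST request for the pipeline endpoint."""
    body = {
        "ExperimentName": experiment_name,
        "RunSource": "SDK",
        "DataSetDefinitionValueAssignments": {
            param_name: {"SavedDataSetReference": {"Id": dataset_id}}
        },
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Usage (needs a logged-in az CLI and a real endpoint and dataset ID):
# request = build_pipeline_request(
#     "[Pipeline_REST_Endpoint]", get_access_token(),
#     "[Experiment_NAME]", "tabular_ds_param", "[Dataset_ID]")
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Keeping request construction separate from sending makes it possible to inspect the headers and body before anything is submitted to the workspace.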