Training Module: Train models in Azure Machine Learning with the CLI (v2)
Unit: Create Azure Machine Learning resources with the CLI (v2)
Section: Create a dataset asset
Hi, I'm creating this issue here because the original page in MS Learn does not allow me to submit feedback more than once. As I already submitted feedback on one typo, here is the new feedback.
Versions:
azure-cli: 2.59.0
extension ml: 2.25.1
1. What appears to be an obsolete command
The command az ml dataset list throws this error: 'dataset' is misspelled or not recognized by the system. Did you mean 'datastore' ?. The command appears to be obsolete: to work, it should say data instead of dataset, as in az ml data list. The same goes for the command above it, az ml dataset create --file data-local-path.yml.
2. What appears to be an obsolete YAML file for data asset properties
On the same page as in the point above, the YAML that specifies the dataset asset properties didn't work because of two errors:
(yellow in image) Wrong parameter name in the YAML
When I ran the command az ml data create --file data-local-path.yml I got the error:
(x) path:
Missing data for required field.
(x) local_path:
Unknown field.
So it looks like `path` has replaced `local_path`.
(blue in image) Wrong value for the path parameter: Having saved a CSV named "customer-churn.csv" in the same directory as data-local-path.yml, I ran az ml data create --file data-local-path.yml and got the error:
(x) File path does not match asset type uri_folder: /mnt/batch/.../customer-churn.csv
So it seems the asset is expected to be a URI folder, not a path to the data file itself. I fixed that by putting customer-churn.csv in a folder named "datasets" and then specifying just that folder in data-local-path.yml (i.e., path: datasets). Re-running the az ml data create command then worked fine.
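An alternative that might also work, though it was not tested in this report, is to keep pointing at the CSV itself and declare the asset type explicitly. The sketch below assumes the CLI v2 data asset YAML accepts a type field with the value uri_file. The file name data-local-file.yml and the asset name customer-churn-file are hypothetical, and the registration command is only printed because running it needs a logged-in Azure CLI with a default workspace configured.

```shell
#!/usr/bin/env sh
set -eu

# Hypothetical YAML file: same fields as data-local-path.yml, plus an explicit
# asset type so that a path to a single file is (presumably) accepted.
cat > data-local-file.yml <<'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/asset.schema.json
name: customer-churn-file
version: 1
type: uri_file
path: customer-churn.csv
description: Dataset pointing to customer churn CSV on local computer. Data will be uploaded to default datastore
EOF

# Printed rather than executed: the real command requires Azure credentials.
echo "az ml data create --file data-local-file.yml"
```

If the type field is left out, the error above suggests the asset is treated as uri_folder, which is why pointing path at a folder fixed it.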
Conclusion
In summary, the YAML specs and the command that work are:
The local data file inside a folder:
datasets/
└── customer-churn.csv
The YAML file with the path field and the value specifying the path to a folder (not a file):
# data-local-path.yml
$schema: https://azuremlschemas.azureedge.net/latest/asset.schema.json
name: customer-churn-data
version: 1
path: datasets
description: Dataset pointing to customer churn CSV on local computer. Data will be uploaded to default datastore
The corrected Azure CLI command with data instead of dataset:
az ml data create --file data-local-path.yml
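The working setup summarized above can be reproduced with a short script. This is a minimal sketch, not the module's official steps: it works in a scratch directory, uses a two-row placeholder CSV (hypothetical content standing in for the real customer-churn.csv), and only prints the final az command, since running it requires a logged-in Azure CLI with the ml extension and a default workspace.

```shell
#!/usr/bin/env sh
set -eu

# Work in a scratch directory so nothing in the current folder is touched.
workdir="$(mktemp -d)"
cd "$workdir"

# Placeholder data (hypothetical columns); replace with the real customer-churn.csv.
printf 'customer_id,churned\n1,0\n2,1\n' > customer-churn.csv

# Put the data file in its own folder, since the asset is treated as uri_folder.
mkdir -p datasets
mv customer-churn.csv datasets/

# Write the YAML using `path` (not `local_path`), pointing at the folder.
cat > data-local-path.yml <<'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/asset.schema.json
name: customer-churn-data
version: 1
path: datasets
description: Dataset pointing to customer churn CSV on local computer. Data will be uploaded to default datastore
EOF

# Printed rather than executed: the real command requires Azure credentials.
echo "az ml data create --file data-local-path.yml"
```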