Azure / azure-cli


'az ml model download' cli command does not work as expected #28302

Closed ArPharazon closed 4 months ago

ArPharazon commented 5 months ago

Describe the bug

The 'az ml model download' cli command does not download any files to the local machine.

The Azure CLI command logger output seems to indicate that the download has started, but there are no 'success' or completion messages in the output.

Example: (fill in vars as needed)

az ml model download \
  --name "${MODEL_NAME}" \
  --version "${MODEL_VERSION}" \
  --download-path "${MODEL_PATH}" \
  --resource-group "${RESOURCE_GROUP}" \
  --workspace-name "${WORKSPACE_NAME}" \
  --subscription "${SUBSCRIPTION}" \
  --debug

Related command

az ml model download

Errors

There does not seem to be any error message. The command output indicates that it has worked, but the files do not get downloaded.

$ az ml model download \
>         --name "${AML_MODEL_NAME}" \
>         --version "${AML_MODEL_VERSION}" \
>         --download-path "./models" \
>         --resource-group "${AML_RESOURCE_GROUP}" \
>         --workspace-name "${AML_WORKSPACE_NAME}" \
>         --subscription "${AML_SUBSCRIPTION_ID}"
Downloading the model azureml/XXXXXXX/mlflow_model_artifacts/ at ./models\YYYYYY
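Because the command reports success even when nothing is written, a CI job cannot rely on the exit status alone. A minimal sketch of a guard for a pipeline step follows; the function name `require_downloaded_files` is hypothetical and not part of the Azure CLI.

```shell
# Hypothetical helper (not part of the Azure CLI): fail when a download
# directory contains no regular files, since 'az ml model download' can
# exit 0 without writing anything.
require_downloaded_files() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "download directory '$dir' does not exist" >&2
    return 1
  fi
  # Look for at least one regular file anywhere under the directory.
  if [ -z "$(find "$dir" -type f | head -n 1)" ]; then
    echo "no files were downloaded into '$dir'" >&2
    return 1
  fi
  return 0
}
```

In a pipeline this would run right after the download step, for example `require_downloaded_files ./models || exit 1`.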

Issue script & Debug output

DEBUG: urllib3.connectionpool: Starting new HTTPS connection (1): REDACTED.blob.core.windows.net:443
DEBUG: urllib3.connectionpool: https://REDACTED.blob.core.windows.net:443 "HEAD /azureml/azureml/quiet_iron_dbx6z8qc3x/mlflow_model_artifacts/ HTTP/1.1" 404 0
DEBUG: urllib3.connectionpool: https://REDACTED.blob.core.windows.net:443 "GET /azureml?restype=container&comp=list&prefix=azureml%2Fquiet_iron_dbx6z8qc3x%2Fmlflow_model_artifacts%2F&delimiter=%2F HTTP/1.1" 200 None
Downloading the model azureml/quiet_iron_dbx6z8qc3x/mlflow_model_artifacts/ at ./models\MODELNAME

DEBUG: urllib3.connectionpool: https://REDACTED.blob.core.windows.net:443 "GET /azureml?restype=container&comp=list&prefix=azureml%2Fquiet_iron_dbx6z8qc3x%2Fmlflow_model_artifacts%2F&include=metadata HTTP/1.1" 200 None
DEBUG: cli.knack.cli: Event: CommandInvoker.OnTransformResult [<function _resource_group_transform at 0x0000017425FA9760>, <function _x509_from_base64_to_hex_transform at 0x0000017425FA9800>]
DEBUG: cli.knack.cli: Event: CommandInvoker.OnFilterResult []
DEBUG: cli.knack.cli: Event: Cli.SuccessfulExecute []
DEBUG: cli.knack.cli: Event: Cli.PostExecute [<function AzCliLogging.deinit_cmd_metadata_logging at 0x0000017425F6E480>]
INFO: az_command_data_logger: exit code: 0
INFO: cli.__main__: Command ran in 5.135 seconds (init: 0.310, invoke: 4.825)
INFO: telemetry.main: Begin splitting cli events and extra events, total events: 1
INFO: telemetry.client: Accumulated 0 events. Flush the clients.
INFO: telemetry.main: Finish splitting cli events and extra events, cli events: 1
INFO: telemetry.save: Save telemetry record of length 3417 in cache
INFO: telemetry.main: Begin creating telemetry upload process.
INFO: telemetry.process: Creating upload process: "C:\Program Files\Microsoft SDKs\Azure\CLI2\python.exe C:\Program Files\Microsoft SDKs\Azure\CLI2\Lib\site-packages\azure\cli\telemetry\__init__.pyc C:\Users\USER\.azure"
INFO: telemetry.process: Return from creating process
INFO: telemetry.main: Finish creating telemetry upload process.
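In the debug output above, the CLI issues a HEAD request for the artifact path (which returns 404, presumably because the path is a folder-style prefix rather than a single blob) and then lists the `azureml` container with a URL-encoded `prefix` parameter. To make the listing request easier to read, the prefix can be percent-decoded. A small bash sketch; `urldecode` is a hypothetical helper name, and it relies on bash-specific parameter expansion and `printf '%b'`.

```shell
# Hypothetical helper: percent-decode a URL query value (bash only).
urldecode() {
  local encoded="${1//+/ }"          # '+' means space in query strings
  printf '%b\n' "${encoded//%/\\x}"  # turn %2F into \x2F and let printf expand it
}

# Prefix taken from the listing request in the debug log.
urldecode 'azureml%2Fquiet_iron_dbx6z8qc3x%2Fmlflow_model_artifacts%2F'
```

For the prefix in the log this prints `azureml/quiet_iron_dbx6z8qc3x/mlflow_model_artifacts/`, which is the same path shown in the "Downloading the model" message.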

Expected behavior

The model artefact files should be downloaded from the AML workspace to the local machine.

Environment Summary

{
  "azure-cli": "2.56.0",
  "azure-cli-core": "2.56.0",
  "azure-cli-telemetry": "1.1.0",
  "extensions": {
    "account": "0.2.5",
    "ai-examples": "0.2.5",
    "arcdata": "1.5.9",
    "azure-devops": "0.26.0",
    "connectedk8s": "1.6.2",
    "databricks": "0.10.2",
    "k8s-extension": "1.5.3",
    "ml": "2.22.0"
  }
}

Additional context

Azure CLI command logger output:

CMD-LOG-LINE-BEGIN 38656 | 2024-02-05 22:09:48,650 | INFO | az_command_data_logger | command args: ml model download --name {} --version {} -p {} --resource-group {} --workspace-name {} --subscription {} --debug
CMD-LOG-LINE-BEGIN 38656 | 2024-02-05 22:09:48,670 | INFO | az_command_data_logger | extension name: ml
CMD-LOG-LINE-BEGIN 38656 | 2024-02-05 22:09:48,670 | INFO | az_command_data_logger | extension version: 2.22.0
CMD-LOG-LINE-BEGIN 38656 | 2024-02-05 22:09:51,242 | INFO | az_command_data_logger | exit code: 0

yonzhan commented 5 months ago

Thank you for opening this issue, we will look into it.

microsoft-github-policy-service[bot] commented 5 months ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github.

banibrata-de commented 4 months ago

Could you share the approximate contents of https://REDACTED.blob.core.windows.net:/zureml/quiet_iron_dbx6z8qc/mlflow_model_artifacts ? It seems this is where the CLI is trying to download from, and we want to make sure the files are present there. Also, could you try downloading the model from the AML studio portal?

ArPharazon commented 4 months ago

We have noticed that the 'az ml model download' command only works when the model version is not linked to an experiment / job run.

In this screenshot below, the 'az ml model download' command works fine for versions 1-3, but it does not work with the rest:

[Screenshot: az-ml-model-download-issues]

Versions 1 to 3 have a 'Download all' button:

[Screenshot: download-button-is-present]

Versions 4 to 9 do not:

[Screenshot: download-button-is-missing]

We can of course navigate to the 'Artifacts' tab and manually download the individual files as needed, but this defeats our CI/CD process.
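To see how a version that downloads (1 to 3) differs from one that does not (4 to 9), the registered metadata of the two versions can be compared. The following is only a sketch: `compare_model_versions` is a hypothetical helper, and which fields actually differ between the versions is not known from this thread.

```shell
# Sketch: dump the metadata of two model versions with 'az ml model show'
# and diff them, so differences between a working and a failing version
# are easy to spot. 'compare_model_versions' is a hypothetical helper name.
compare_model_versions() {
  local name="$1" good="$2" bad="$3" rg="$4" ws="$5"
  local tmp v rc
  tmp="$(mktemp -d)" || return 1
  for v in "$good" "$bad"; do
    az ml model show --name "$name" --version "$v" \
      --resource-group "$rg" --workspace-name "$ws" \
      --output json > "$tmp/$v.json" || { rm -rf "$tmp"; return 1; }
  done
  diff "$tmp/$good.json" "$tmp/$bad.json"
  rc=$?
  rm -rf "$tmp"
  return "$rc"   # mirrors diff: 0 = identical, 1 = differences found
}
```

A call such as `compare_model_versions "${MODEL_NAME}" 3 4 "${RESOURCE_GROUP}" "${WORKSPACE_NAME}"` would print the lines that differ between versions 3 and 4.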

ArPharazon commented 4 months ago

> Could you share the approximate contents of https://REDACTED.blob.core.windows.net:/zureml/quiet_iron_dbx6z8qc/mlflow_model_artifacts ? It seems this is where the CLI is trying to download from, and we want to make sure the files are present there. Also, could you try downloading the model from the AML studio portal?

Yes, the files are present in the storage folder and they can be downloaded individually from the portal by navigating to the 'Artifacts' tab only.

[Screenshot]

ArPharazon commented 4 months ago

Any news?

banibrata-de commented 4 months ago

We have filed a bug to dig further into this; however, there is no timeline for it right now, sorry about that. In the meantime, we think users can at least still download the files from blob storage for some of the old models where the download option is not visible. If you think this is a big blocker that affects a large number of models in your CI/CD, I would encourage you to open an Azure customer support ticket so that we can prioritize it.
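The blob-storage workaround mentioned above could look roughly like the following sketch. It assumes that the artifacts live in the `azureml` container under the prefix shown in the debug log, and that the caller has data-plane read access to the workspace storage account. The function name and the storage account argument are placeholders, and this is not an official replacement for `az ml model download`.

```shell
# Sketch of a workaround, not an official fix: copy every blob under a
# prefix from the workspace storage account into a local directory.
# Arguments: storage account name, blob prefix, local destination.
download_model_artifacts() {
  local account="$1"
  local prefix="$2"   # e.g. azureml/<run_id>/mlflow_model_artifacts/
  local dest="$3"
  mkdir -p "$dest" || return 1
  az storage blob download-batch \
    --account-name "$account" \
    --auth-mode login \
    --source azureml \
    --pattern "${prefix}*" \
    --destination "$dest"
}
```

With the values from the debug log, a call would be `download_model_artifacts "<storage-account>" "azureml/quiet_iron_dbx6z8qc3x/mlflow_model_artifacts/" ./models`, where the storage account name has to be filled in by the reader.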

ArPharazon commented 4 months ago

I have raised an internal Microsoft support ticket about this problem. It is a blocker for our automated CI/CD.