Azure / azureml-examples

Official community-driven Azure Machine Learning examples, tested with GitHub Actions.
https://docs.microsoft.com/azure/machine-learning
MIT License

The Forecast TCN model deployment through the UI does not work. #2192

Open nick863 opened 1 year ago

nick863 commented 1 year ago

Operating System

Windows

Version Information

Recently we discovered a problem caused by an error in the DNN scoring script file. Please see the workaround in the Additional information section. The error is related to the following scoring-script lines:

result_sample = StandardPythonParameterType({
    'forecast': NumpyParameterType(0.0),
    'index': PandasParameterType(pd.DataFrame({}), enforce_shape=False)
})
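
The root cause is that `NumpyParameterType` requires its sample argument to already be a numpy ndarray, while the generated script passes the plain Python float `0.0`. The following is a minimal sketch of that validation (not the actual `inference_schema` source) to illustrate why the float fails:

```python
import numpy as np

def check_numpy_sample(sample):
    # Sketch of the check NumpyParameterType.__init__ performs on its
    # sample argument: anything that is not an ndarray is rejected, so a
    # plain Python float such as 0.0 raises immediately at import time.
    if not isinstance(sample, np.ndarray):
        raise Exception("Invalid sample input provided, "
                        "must provide a sample Numpy array.")
    return sample
```

`check_numpy_sample(0.0)` raises the exception seen in the stack trace, while `check_numpy_sample(np.array([0]))` passes, which is why the workaround swaps the float for `np.array([0])`.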

The deployment results in a stack trace similar to the one provided below:

  File "/azureml-envs/azureml-automl-dnn-forecasting-gpu/lib/python3.8/site-packages/azureml_inference_server_http/server/user_script.py", line 74, in load_script
    main_module_spec.loader.exec_module(user_module)
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/var/azureml-app/outputs/scoring_file_v_2_0_0.py", line 27, in <module>
    'forecast': NumpyParameterType(0.0),
  File "/azureml-envs/azureml-automl-dnn-forecasting-gpu/lib/python3.8/site-packages/inference_schema/parameter_types/numpy_parameter_type.py", line 33, in __init__
    raise Exception("Invalid sample input provided, must provide a sample Numpy array.")
Exception: Invalid sample input provided, must provide a sample Numpy array.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/azureml-envs/azureml-automl-dnn-forecasting-gpu/lib/python3.8/site-packages/azureml_inference_server_http/server/aml_blueprint.py", line 88, in setup
    self.user_script.load_script(config.app_root)
  File "/azureml-envs/azureml-automl-dnn-forecasting-gpu/lib/python3.8/site-packages/azureml_inference_server_http/server/user_script.py", line 76, in load_script
    raise UserScriptImportException(ex) from ex

The error has already been fixed; the issue is expected to be resolved about two weeks after the release of azureml-sdk v1.50.0.

Steps to reproduce

  1. In the AzureML workspace UI, open the Forecast TCN run, go to the Models section, select one of the models, and click Deploy.
  2. Select Managed online endpoint.
  3. Answer the necessary questions.
  4. Wait for deployment.

Expected behavior

A functional model is deployed.

Actual behavior

The deployment fails and the deployment log contains a stack trace similar to the one below:

  File "/azureml-envs/azureml-automl-dnn-forecasting-gpu/lib/python3.8/site-packages/azureml_inference_server_http/server/user_script.py", line 74, in load_script
    main_module_spec.loader.exec_module(user_module)
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/var/azureml-app/outputs/scoring_file_v_2_0_0.py", line 27, in <module>
    'forecast': NumpyParameterType(0.0),
  File "/azureml-envs/azureml-automl-dnn-forecasting-gpu/lib/python3.8/site-packages/inference_schema/parameter_types/numpy_parameter_type.py", line 33, in __init__
    raise Exception("Invalid sample input provided, must provide a sample Numpy array.")
Exception: Invalid sample input provided, must provide a sample Numpy array.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/azureml-envs/azureml-automl-dnn-forecasting-gpu/lib/python3.8/site-packages/azureml_inference_server_http/server/aml_blueprint.py", line 88, in setup
    self.user_script.load_script(config.app_root)
  File "/azureml-envs/azureml-automl-dnn-forecasting-gpu/lib/python3.8/site-packages/azureml_inference_server_http/server/user_script.py", line 76, in load_script
    raise UserScriptImportException(ex) from ex

Additional information

Workaround

  1. Download the model of interest from the list of ForecastTCN models. The downloaded archive contains three files: conda_env_v_1_0_0.yml, model.pt, and scoring_file_v_2_0_0.py.
  2. Register model.pt from the UI: Register > From local files.
  3. Select "Unspecified type" and select the model on the local file system where the files were extracted. Do not select the archive itself.
  4. Name the model and register it.
  5. Edit the conda_env_v_1_0_0.yml file and add the azureml-defaults package pinned to the same version as the other azureml packages; if azureml-defaults is already present, leave the file as is.
  6. At the environment tab create the environment using conda file:
    • Select Start from existing environment
    • Select Container registry image
    • Next to “docker pull”, set the image to mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
    • Upload the conda file
  7. Wait until the environment is built. You can monitor it by clicking "Build logs".
  8. Return to models tab.
  9. Select the new model and click "Deploy" > Real-time endpoint.
  10. Correct scoring_file_v_2_0_0.py so that instead of

    result_sample = StandardPythonParameterType({
        'forecast': NumpyParameterType(0.0),
        'index': PandasParameterType(pd.DataFrame({}), enforce_shape=False)
    })

    it contains:

    result_sample = StandardPythonParameterType({
        'forecast': NumpyParameterType(np.array([0])),
        'index': PandasParameterType(pd.DataFrame({}), enforce_shape=False)
    })
  11. Select the environment built in step 7 and the scoring script scoring_file_v_2_0_0.py, and perform the deployment.
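
Until the fix ships, step 10 can also be scripted. Below is a hypothetical helper (not part of azureml-sdk) that applies the one-line textual replacement to a downloaded scoring file. It assumes the generated script already imports numpy as np, which the corrected snippet relies on:

```python
from pathlib import Path

def patch_scoring_file(path):
    # Hypothetical helper: performs the textual replacement from step 10,
    # swapping the invalid float sample for a numpy array sample.
    # Assumes the scoring script already imports numpy as np.
    script = Path(path)
    fixed = script.read_text().replace(
        "NumpyParameterType(0.0)",
        "NumpyParameterType(np.array([0]))")
    script.write_text(fixed)
    return fixed
```

Run it against the extracted file, e.g. `patch_scoring_file("scoring_file_v_2_0_0.py")`, before uploading the script in the deployment wizard.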
nivasvarm commented 1 year ago


All artifacts are working well; the code repository is enabled and the registry 'amlsdkv2022311/providers/Microsoft.ContainerRegistry/registries/f66391eb71994d96b4a50e3476ad147c' has been explicitly assigned the 'AcrPull' role definition.