MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

Unable to deploy endpoint in Azure ML Studio with Custom Docker Image #116918

Closed abhishekperambai closed 7 months ago

abhishekperambai commented 8 months ago

I am trying to deploy an endpoint in Azure ML Studio. I have created a custom environment from my own Docker build context, which looks like this:

Docker Build Context

FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04
RUN pip install azureml-mlflow

RUN apt-get update && apt-get install -y libgl1-mesa-glx
RUN sudo apt-get install python3-opencv
WORKDIR /app

RUN pip install azureml-defaults \
    && pip install azureml-sdk \
    && pip install azure.identity \
    && pip install azure.keyvault.secrets \
    && pip install azure-keyvault \
    && pip install pandas \
    && pip install unstructured[all-docs] \
    && pip install numpy \
    && pip install langchain \
    && pip install python-pptx \
    && pip install tabulate \
    && pip install PyMuPDF \

This environment is created successfully. However, when I then try to create an endpoint, the deployment fails with the following error:

Error

Failed to register Environment . Reason:EnvironmentVariables must be null if Docker.BuildContext is specified..

When I researched the error, the advice I found was not to specify any environment variables in the YAML file because they can create a conflict. However, I am not specifying any environment variables in my YAML file. Please find the content of the YAML file below.

Contents of YML file

pipeline:
  model_name: dummy
  experiment_name: doc-parser
  pipeline_params:
    placeholder: None
  published_pipeline:
    description: doc-parser-dev-endpoint
    version: 1

storage:
  fileshare_name:
  datastore_name:
  dataset_name:

workspace_settings:

:
compute:
  compute_name:
  compute_type: compute_cluster
  inference_cluster:
storage_credentials:
  storage_account_name:
  storage_account_key:
  storage_connection_string:
dependency:
  package_manager: conda
  libs: [ 'azureml-defaults', 'azureml-sdk', 'azure.identity', 'azure.keyvault.secrets',
          'azure-keyvault', 'azure-identity', 'pandas', 'opencv-python-headless',
          'unstructured[pptx, pdf]', 'numpy', 'langchain', 'python-pptx', 'tabulate', 'PyMuPDF']

Please note that I'm not using the dependency section in yml file as I've already installed the libraries in the docker image. Any help in this regard would be highly appreciated.

---

#### Document Details

⚠ *Do not edit this section. It is required for learn.microsoft.com ➟ GitHub issue linking.*

* ID: 9d0a71e0-df60-1b54-f8af-09839d32d7b0
* Version Independent ID: e2cc1fde-519b-b314-9622-01787aed5467
* Content: [Deploy ML models to Kubernetes Service with v1 - Azure Machine Learning](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-azure-kubernetes-service?view=azureml-api-1&tabs=python)
* Content Source: [articles/machine-learning/v1/how-to-deploy-azure-kubernetes-service.md](https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/machine-learning/v1/how-to-deploy-azure-kubernetes-service.md)
* Service: **machine-learning**
* Sub-service: **inferencing**
* GitHub Login: @Bozhong68
* Microsoft Alias: **bozhlin**

SaibabaBalapur-MSFT commented 8 months ago

@abhishekperambai It would be great if you could add a link to the documentation you are following for these steps? This would help us redirect the issue to the appropriate team. Thanks!!

abhishekperambai commented 8 months ago

> @abhishekperambai It would be great if you could add a link to the documentation you are following for these steps? This would help us redirect the issue to the appropriate team. Thanks!!

I'm following this documentation to deploy the endpoint to an AKS cluster.

https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-azure-kubernetes-service?view=azureml-api-1&tabs=python

RamanathanChinnappan-MSFT commented 8 months ago

@abhishekperambai The error message you received indicates that there is an issue with the environment variables in your deployment configuration. Specifically, it seems that you have specified environment variables in your deployment configuration, but you have also specified a Docker build context in your environment definition.

According to the error message, if you specify a Docker build context, you cannot also specify environment variables. This is because the environment variables are set in the Dockerfile during the build process, and specifying them separately in the deployment configuration can cause conflicts.

To resolve this issue, you should remove the environment variables from your deployment configuration. You can do this by removing the workspace_settings section from your YAML file, or by setting the environment_variables property to null in your deployment configuration.

Here is an example of how to set the environment_variables property to null in your deployment configuration:

from azureml.core.webservice import AciWebservice

# Create the deployment configuration, then clear its environment variables
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
deployment_config.environment_variables = None

Once you have removed the environment variables from your deployment configuration, you should be able to deploy your endpoint successfully.

RamanathanChinnappan-MSFT commented 7 months ago

@abhishekperambai We are going to close this thread. If there are any further questions regarding the documentation, please tag me in your reply and we will be happy to continue the conversation.

abhishekperambai commented 7 months ago

> @abhishekperambai We are going to close this thread. If there are any further questions regarding the documentation, please tag me in your reply and we will be happy to continue the conversation.

Well, the suggested solution did not solve the issue. We were using the Azure ML SDK v1 to deploy the endpoint, and we observed that, for some reason, the deploy method of the Model class tries to register the environment even when we refer to an environment that already exists in ML Studio. We therefore changed our pipeline code so that the environment object is created from a custom Dockerfile and passed to the inference config object, which in turn is passed to the deploy method of the Model class. With that change, the deploy method registers the environment without errors and the endpoint deployment succeeds.
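
For readers who land on this thread with the same error, the workaround described above might look roughly like the following with the Azure ML SDK v1 (`azureml-core`). This is only a sketch under assumptions, not the code from the original pipeline: the function name, the default Dockerfile path, the `score.py` entry script, the environment name, and the CPU/memory values are all hypothetical placeholders. It also assumes a workspace `config.json` is available and that the AKS cluster is already attached to the workspace.

```python
def deploy_with_dockerfile_env(
    model_name,
    service_name,
    aks_cluster_name,
    dockerfile_path="Dockerfile",   # hypothetical path to the custom Dockerfile
    entry_script="score.py",        # hypothetical scoring script
    env_name="doc-parser-env",      # hypothetical environment name
):
    """Create the environment in code and pass it through to Model.deploy."""
    # The SDK imports are local so this module can be loaded without azureml-core.
    from azureml.core import Environment, Workspace
    from azureml.core.compute import AksCompute
    from azureml.core.model import InferenceConfig, Model
    from azureml.core.webservice import AksWebservice

    # Assumes a config.json describing the workspace is discoverable.
    ws = Workspace.from_config()

    # Build the environment object from the Dockerfile rather than fetching
    # an environment that was registered earlier in the studio.
    env = Environment.from_dockerfile(name=env_name, dockerfile=dockerfile_path)

    # The environment goes into the inference config ...
    inference_config = InferenceConfig(entry_script=entry_script, environment=env)

    # ... and the inference config goes into Model.deploy.
    deployment_config = AksWebservice.deploy_configuration(cpu_cores=1, memory_gb=2)
    service = Model.deploy(
        workspace=ws,
        name=service_name,
        models=[Model(ws, name=model_name)],
        inference_config=inference_config,
        deployment_config=deployment_config,
        deployment_target=AksCompute(ws, aks_cluster_name),
        overwrite=True,
    )
    service.wait_for_deployment(show_output=True)
    return service
```

A call such as `deploy_with_dockerfile_env("dummy", "doc-parser-dev-endpoint", "<your-aks-cluster>")` would then let `Model.deploy` register the environment itself, which is the behavior the comment above reports as working.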