MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

WebserviceException & KubernetesDeploymentFailed -- Error in entry script, ImportError: cannot import name 'Markup' from 'jinja2' #91587

Closed saugatapaul1010 closed 2 years ago

saugatapaul1010 commented 2 years ago

Hi,

I am trying to run a CD pipeline in Azure DevOps; it is triggered automatically after the CI pipeline completes successfully. I have used a Microsoft Self-Hosted Agent in both cases, the build pipeline and the release pipeline. During the deployment stage to the AKS cluster, I get a WebserviceException error and KubernetesDeploymentFailed with status code 400. I used the Azure CLI commands below to create an AKS cluster and deploy the model to it. The same pipeline ran without problems when I executed it about two months ago, and there have been no changes to the pipeline since. I am not able to figure out why this would happen. Any pointers will be appreciated. Is this a version-related issue?

Regarding this specific error, "Error in entry script, ImportError: cannot import name 'Markup' from 'jinja2'": I have included Jinja2 as part of my dependencies for both the CI and the CD stage.
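
For context, one possible cause (an assumption, not something confirmed in this thread): Jinja2 3.1.0 dropped the `Markup` and `escape` names from the `jinja2` package, and they are provided by `markupsafe` instead. An unpinned `jinja2` dependency can therefore resolve to a release where `from jinja2 import Markup` fails, even though the pipeline itself did not change. The sketch below uses only the standard library; the helper name `jinja2_exports_markup` is hypothetical and exists only for illustration.

```python
import re

# Assumption: Jinja2 3.1.0 is the first release without `jinja2.Markup`
# (the class is provided by the `markupsafe` package instead).
FIRST_RELEASE_WITHOUT_MARKUP = (3, 1)


def jinja2_exports_markup(version: str) -> bool:
    """Return True if `from jinja2 import Markup` is expected to work for this version."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    return major_minor < FIRST_RELEASE_WITHOUT_MARKUP


if __name__ == "__main__":
    # Print the expectation for a few example version strings.
    for candidate in ("2.11.3", "3.0.3", "3.1.0", "3.1.1"):
        print(candidate, jinja2_exports_markup(candidate))
```

Inside the scoring environment, passing `jinja2.__version__` to this helper would show which side of that boundary the image resolved to. If it turns out to be 3.1 or later, constraining the dependency in the environment file (for example `jinja2<3.1`) is one thing to try.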

Create AKS Cluster:

az ml computetarget create aks -g $(ml.resourceGroup) -w $(ml.workspace) -n $(aks.clusterName) -s $(aks.vmSize) -a $(aks.agentCount) --cluster-purpose $(ml.clusterPurpose)

Deploy ML Model To AKS cluster:

az ml model deploy -g $(ml.resourceGroup) -w $(ml.workspace) -n $(aks_service_name) -f model.json --dc aksDeploymentConfig.yml --ic inferenceConfig.yml --ct $(aks.clusterName) --description "TicketPriority Classifier deployed in AKS" --overwrite

Please find below the error stack trace for the "Deploy ML Model To AKS cluster" task step:

2022-04-15T11:39:07.1368528Z ##[debug]Evaluating condition for step: 'Deploy ML Model To AKS'
2022-04-15T11:39:07.1370392Z ##[debug]Evaluating: succeededOrFailed()
2022-04-15T11:39:07.1370916Z ##[debug]Evaluating succeededOrFailed:
2022-04-15T11:39:07.1374685Z ##[debug]=> True
2022-04-15T11:39:07.1375395Z ##[debug]Result: True
2022-04-15T11:39:07.1376031Z ##[section]Starting: Deploy ML Model To AKS
2022-04-15T11:39:07.1384681Z ==============================================================================
2022-04-15T11:39:07.1384980Z Task         : Azure CLI
2022-04-15T11:39:07.1385475Z Description  : Run Azure CLI commands against an Azure subscription in a PowerShell Core/Shell script when running on Linux agent or PowerShell/PowerShell Core/Batch script when running on Windows agent.
2022-04-15T11:39:07.1385990Z Version      : 2.198.0
2022-04-15T11:39:07.1386200Z Author       : Microsoft Corporation
2022-04-15T11:39:07.1386511Z Help         : https://docs.microsoft.com/azure/devops/pipelines/tasks/deploy/azure-cli
2022-04-15T11:39:07.1386953Z ==============================================================================
2022-04-15T11:39:07.1514404Z ##[debug]Using node path: /home/vsts/agents/2.202.1/externals/node10/bin/node
2022-04-15T11:39:07.2522049Z ##[debug]agent.TempDirectory=/home/vsts/work/_temp
2022-04-15T11:39:07.2543217Z ##[debug]loading inputs and endpoints
2022-04-15T11:39:07.2551869Z ##[debug]loading INPUT_CONNECTEDSERVICENAMEARM
2022-04-15T11:39:07.2564437Z ##[debug]loading INPUT_SCRIPTTYPE
2022-04-15T11:39:07.2566258Z ##[debug]loading INPUT_SCRIPTLOCATION
2022-04-15T11:39:07.2567100Z ##[debug]loading INPUT_SCRIPTPATH
2022-04-15T11:39:07.2567963Z ##[debug]loading INPUT_INLINESCRIPT
2022-04-15T11:39:07.2568822Z ##[debug]loading INPUT_POWERSHELLERRORACTIONPREFERENCE
2022-04-15T11:39:07.2569695Z ##[debug]loading INPUT_ADDSPNTOENVIRONMENT
2022-04-15T11:39:07.2570543Z ##[debug]loading INPUT_USEGLOBALCONFIG
2022-04-15T11:39:07.2571359Z ##[debug]loading INPUT_CWD
2022-04-15T11:39:07.2572183Z ##[debug]loading INPUT_FAILONSTANDARDERROR
2022-04-15T11:39:07.2573068Z ##[debug]loading INPUT_POWERSHELLIGNORELASTEXITCODE
2022-04-15T11:39:07.2574687Z ##[debug]loading ENDPOINT_AUTH_f3d4ae49-f833-4f89-b077-478b60d64bcc
2022-04-15T11:39:07.2575884Z ##[debug]loading ENDPOINT_AUTH_SCHEME_f3d4ae49-f833-4f89-b077-478b60d64bcc
2022-04-15T11:39:07.2577034Z ##[debug]loading ENDPOINT_AUTH_PARAMETER_f3d4ae49-f833-4f89-b077-478b60d64bcc_TENANTID
2022-04-15T11:39:07.2578170Z ##[debug]loading ENDPOINT_AUTH_PARAMETER_f3d4ae49-f833-4f89-b077-478b60d64bcc_SERVICEPRINCIPALID
2022-04-15T11:39:07.2579333Z ##[debug]loading ENDPOINT_AUTH_PARAMETER_f3d4ae49-f833-4f89-b077-478b60d64bcc_AUTHENTICATIONTYPE
2022-04-15T11:39:07.2581857Z ##[debug]loading ENDPOINT_AUTH_PARAMETER_f3d4ae49-f833-4f89-b077-478b60d64bcc_SCOPE
2022-04-15T11:39:07.2584035Z ##[debug]loading ENDPOINT_AUTH_PARAMETER_f3d4ae49-f833-4f89-b077-478b60d64bcc_SERVICEPRINCIPALKEY
2022-04-15T11:39:07.2585991Z ##[debug]loading ENDPOINT_AUTH_SYSTEMVSSCONNECTION
2022-04-15T11:39:07.2587888Z ##[debug]loading ENDPOINT_AUTH_SCHEME_SYSTEMVSSCONNECTION
2022-04-15T11:39:07.2590865Z ##[debug]loading ENDPOINT_AUTH_PARAMETER_SYSTEMVSSCONNECTION_ACCESSTOKEN
2022-04-15T11:39:07.2591524Z ##[debug]loaded 21
2022-04-15T11:39:07.2592083Z ##[debug]Agent.ProxyUrl=undefined
2022-04-15T11:39:07.2592669Z ##[debug]Agent.CAInfo=undefined
2022-04-15T11:39:07.2593281Z ##[debug]Agent.ClientCert=undefined
2022-04-15T11:39:07.2593899Z ##[debug]Agent.SkipCertValidation=undefined
2022-04-15T11:39:07.2601322Z ##[debug]check path : /home/vsts/work/_tasks/AzureCLI_46e4be58-730b-4389-8a2f-ea10b3e5e815/2.198.0/task.json
2022-04-15T11:39:07.2604689Z ##[debug]adding resource file: /home/vsts/work/_tasks/AzureCLI_46e4be58-730b-4389-8a2f-ea10b3e5e815/2.198.0/task.json
2022-04-15T11:39:07.2605518Z ##[debug]system.culture=en-US
2022-04-15T11:39:07.2615600Z ##[debug]which 'az'
2022-04-15T11:39:07.2619558Z ##[debug]found: '/opt/hostedtoolcache/Python/3.7.12/x64/bin/az'
2022-04-15T11:39:07.2631786Z ##[debug]scriptType=bash
2022-04-15T11:39:07.2633366Z ##[debug]scriptLocation=inlineScript
2022-04-15T11:39:07.2634238Z ##[debug]scriptArguments=undefined
2022-04-15T11:39:07.2637780Z ##[debug]Agent.TempDirectory=/home/vsts/work/_temp
2022-04-15T11:39:07.2639001Z ##[debug]inlineScript=az ml model deploy -g SyntbotsAI-RnD-MLOps -w MLOps_WS01 -n priority-predictor-aks -f model.json --dc aksDeploymentConfig.yml --ic inferenceConfig.yml --ct aks --description "TicketPriority Classifier deployed in AKS" --overwrite
2022-04-15T11:39:07.2648783Z ##[debug]which 'bash'
2022-04-15T11:39:07.2656296Z ##[debug]found: '/usr/bin/bash'
2022-04-15T11:39:07.2657013Z ##[debug]which '/usr/bin/bash'
2022-04-15T11:39:07.2659450Z ##[debug]found: '/usr/bin/bash'
2022-04-15T11:39:07.2661073Z ##[debug]/usr/bin/bash arg: /home/vsts/work/_temp/azureclitaskscript1650022747263.sh
2022-04-15T11:39:07.2662828Z ##[debug]cwd=/home/vsts/work/r1/a/_Ticket Priority-CI/TicketPriorityClassifier/a
2022-04-15T11:39:07.2663544Z ##[debug]scriptLocation=inlineScript
2022-04-15T11:39:07.2664158Z ##[debug]failOnStandardError=false
2022-04-15T11:39:07.2664960Z ##[debug]testing directory '/home/vsts/work/r1/a/_Ticket Priority-CI/TicketPriorityClassifier/a'
2022-04-15T11:39:07.2698899Z ##[debug]which 'az'
2022-04-15T11:39:07.2701925Z ##[debug]found: '/opt/hostedtoolcache/Python/3.7.12/x64/bin/az'
2022-04-15T11:39:07.2702763Z ##[debug]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az arg: --version
2022-04-15T11:39:07.2703580Z ##[debug]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az arg: --version
2022-04-15T11:39:07.2705162Z ##[debug]exec tool: /opt/hostedtoolcache/Python/3.7.12/x64/bin/az
2022-04-15T11:39:07.2705884Z ##[debug]exec tool: /opt/hostedtoolcache/Python/3.7.12/x64/bin/az
2022-04-15T11:39:07.2706520Z ##[debug]arguments:
2022-04-15T11:39:07.2707058Z ##[debug]arguments:
2022-04-15T11:39:07.2707691Z ##[debug]   --version
2022-04-15T11:39:07.2708306Z ##[debug]   --version
2022-04-15T11:39:07.2708952Z [command]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az --version
2022-04-15T11:39:08.6852204Z azure-cli                         2.22.0 *
2022-04-15T11:39:08.6852949Z 
2022-04-15T11:39:08.6853356Z core                              2.22.0 *
2022-04-15T11:39:08.6853788Z telemetry                          1.0.6
2022-04-15T11:39:08.6854024Z 
2022-04-15T11:39:08.6854366Z Extensions:
2022-04-15T11:39:08.6854912Z azure-cli-ml                      1.37.0
2022-04-15T11:39:08.6855488Z azure-devops                      0.25.0
2022-04-15T11:39:08.6855741Z 
2022-04-15T11:39:08.6856356Z Python location '/opt/hostedtoolcache/Python/3.7.12/x64/bin/python'
2022-04-15T11:39:08.6857038Z Extensions directory '/opt/az/azcliextensions'
2022-04-15T11:39:08.6857653Z 
2022-04-15T11:39:08.6858123Z Python (Linux) 3.7.12 (default, Sep  6 2021, 07:19:30) 
2022-04-15T11:39:08.6858569Z [GCC 9.3.0]
2022-04-15T11:39:08.6858799Z 
2022-04-15T11:39:08.6859183Z Legal docs and information: aka.ms/AzureCliLegal
2022-04-15T11:39:08.6859478Z 
2022-04-15T11:39:08.6859666Z 
2022-04-15T11:39:08.6860522Z ##[debug]useGlobalConfig=false
2022-04-15T11:39:08.6861284Z ##[debug]Agent.TempDirectory=/home/vsts/work/_temp
2022-04-15T11:39:08.6862098Z ##[debug]Agent.TempDirectory=/home/vsts/work/_temp
2022-04-15T11:39:08.6866428Z WARNING: You have 2 updates available. Consider updating your CLI installation with 'az upgrade'
2022-04-15T11:39:08.6866735Z 
2022-04-15T11:39:08.6867036Z Please let us know how we are doing: https://aka.ms/azureclihats
2022-04-15T11:39:08.6867706Z and let us know if you're interested in trying out our newest features: https://aka.ms/CLIUXstudy
2022-04-15T11:39:08.6872931Z Setting AZURE_CONFIG_DIR env variable to: /home/vsts/work/_temp/.azclitask
2022-04-15T11:39:08.6879280Z ##[debug]connectedServiceNameARM=f3d4ae49-f833-4f89-b077-478b60d64bcc
2022-04-15T11:39:08.6880166Z ##[debug]f3d4ae49-f833-4f89-b077-478b60d64bcc data environment = AzureCloud
2022-04-15T11:39:08.6900008Z Setting active cloud to: AzureCloud
2022-04-15T11:39:08.6900993Z ##[debug]which 'az'
2022-04-15T11:39:08.6901727Z ##[debug]found: '/opt/hostedtoolcache/Python/3.7.12/x64/bin/az'
2022-04-15T11:39:08.6902797Z ##[debug]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az arg: cloud set -n AzureCloud
2022-04-15T11:39:08.6903668Z ##[debug]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az arg: cloud set -n AzureCloud
2022-04-15T11:39:08.6904409Z ##[debug]exec tool: /opt/hostedtoolcache/Python/3.7.12/x64/bin/az
2022-04-15T11:39:08.6905107Z ##[debug]exec tool: /opt/hostedtoolcache/Python/3.7.12/x64/bin/az
2022-04-15T11:39:08.6905720Z ##[debug]arguments:
2022-04-15T11:39:08.6906269Z ##[debug]arguments:
2022-04-15T11:39:08.6906797Z ##[debug]   cloud
2022-04-15T11:39:08.6907315Z ##[debug]   cloud
2022-04-15T11:39:08.6907847Z ##[debug]   set
2022-04-15T11:39:08.6908372Z ##[debug]   set
2022-04-15T11:39:08.6908957Z ##[debug]   -n
2022-04-15T11:39:08.6909540Z ##[debug]   -n
2022-04-15T11:39:08.6910071Z ##[debug]   AzureCloud
2022-04-15T11:39:08.6910623Z ##[debug]   AzureCloud
2022-04-15T11:39:08.6911268Z [command]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az cloud set -n AzureCloud
2022-04-15T11:39:08.8827497Z ##[debug]connectedServiceNameARM=f3d4ae49-f833-4f89-b077-478b60d64bcc
2022-04-15T11:39:09.8424043Z ##[debug]f3d4ae49-f833-4f89-b077-478b60d64bcc auth scheme = ServicePrincipal
2022-04-15T11:39:09.8425950Z ##[debug]f3d4ae49-f833-4f89-b077-478b60d64bcc data SubscriptionID = b53cd405-74a5-4714-9549-88af4dc84f66
2022-04-15T11:39:09.8427119Z ##[debug]f3d4ae49-f833-4f89-b077-478b60d64bcc auth param authenticationType = spnKey
2022-04-15T11:39:09.8428579Z ##[debug]f3d4ae49-f833-4f89-b077-478b60d64bcc auth param serviceprincipalid = ***
2022-04-15T11:39:09.8429777Z ##[debug]f3d4ae49-f833-4f89-b077-478b60d64bcc auth param tenantid = 12873845-9729-49dd-b821-76b52a360a58
2022-04-15T11:39:09.8430746Z ##[debug]key based endpoint
2022-04-15T11:39:09.8433029Z ##[debug]f3d4ae49-f833-4f89-b077-478b60d64bcc auth param serviceprincipalkey = ***
2022-04-15T11:39:09.8435114Z ##[debug]Processed: ##vso[task.setsecret]***
2022-04-15T11:39:09.8435999Z ##[debug]which 'az'
2022-04-15T11:39:09.8436817Z ##[debug]found: '/opt/hostedtoolcache/Python/3.7.12/x64/bin/az'
2022-04-15T11:39:09.8438816Z ##[debug]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az arg: login --service-principal -u "***" --password=*** --tenant "12873845-9729-49dd-b821-76b52a360a58" --allow-no-subscriptions
2022-04-15T11:39:09.8440753Z ##[debug]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az arg: login --service-principal -u "***" --password=*** --tenant "12873845-9729-49dd-b821-76b52a360a58" --allow-no-subscriptions
2022-04-15T11:39:09.8441829Z ##[debug]exec tool: /opt/hostedtoolcache/Python/3.7.12/x64/bin/az
2022-04-15T11:39:09.8442830Z ##[debug]exec tool: /opt/hostedtoolcache/Python/3.7.12/x64/bin/az
2022-04-15T11:39:09.8443814Z ##[debug]arguments:
2022-04-15T11:39:09.8444443Z ##[debug]arguments:
2022-04-15T11:39:09.8445067Z ##[debug]   login
2022-04-15T11:39:09.8445672Z ##[debug]   login
2022-04-15T11:39:09.8446454Z ##[debug]   --service-principal
2022-04-15T11:39:09.8447211Z ##[debug]   --service-principal
2022-04-15T11:39:09.8447946Z ##[debug]   -u
2022-04-15T11:39:09.8448618Z ##[debug]   -u
2022-04-15T11:39:09.8449810Z ##[debug]   ***
2022-04-15T11:39:09.8450715Z ##[debug]   ***
2022-04-15T11:39:09.8451761Z ##[debug]   --password=***
2022-04-15T11:39:09.8452733Z ##[debug]   --password=***
2022-04-15T11:39:09.8453493Z ##[debug]   --tenant
2022-04-15T11:39:09.8454201Z ##[debug]   --tenant
2022-04-15T11:39:09.8454963Z ##[debug]   12873845-9729-49dd-b821-76b52a360a58
2022-04-15T11:39:09.8455812Z ##[debug]   12873845-9729-49dd-b821-76b52a360a58
2022-04-15T11:39:09.8456614Z ##[debug]   --allow-no-subscriptions
2022-04-15T11:39:09.8457388Z ##[debug]   --allow-no-subscriptions
2022-04-15T11:39:09.8459019Z [command]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az login --service-principal -u *** --password=*** --tenant 12873845-9729-49dd-b821-76b52a360a58 --allow-no-subscriptions
2022-04-15T11:39:10.0541137Z [
2022-04-15T11:39:10.0541736Z   {
2022-04-15T11:39:10.0542152Z     "cloudName": "AzureCloud",
2022-04-15T11:39:10.0543272Z     "homeTenantId": "12873845-9729-49dd-b821-76b52a360a58",
2022-04-15T11:39:10.0544310Z     "id": "b53cd405-74a5-4714-9549-88af4dc84f66",
2022-04-15T11:39:10.0544778Z     "isDefault": true,
2022-04-15T11:39:10.0545170Z     "managedByTenants": [],
2022-04-15T11:39:10.0545564Z     "name": "Azure subscription 1",
2022-04-15T11:39:10.0545970Z     "state": "Enabled",
2022-04-15T11:39:10.0546568Z     "tenantId": "12873845-9729-49dd-b821-76b52a360a58",
2022-04-15T11:39:10.0547022Z     "user": {
2022-04-15T11:39:10.0548040Z       "name": "***",
2022-04-15T11:39:10.0548462Z       "type": "servicePrincipal"
2022-04-15T11:39:10.0548817Z     }
2022-04-15T11:39:10.0549155Z   }
2022-04-15T11:39:10.0549456Z ]
2022-04-15T11:39:10.0550430Z ##[debug]which 'az'
2022-04-15T11:39:10.0551315Z ##[debug]found: '/opt/hostedtoolcache/Python/3.7.12/x64/bin/az'
2022-04-15T11:39:10.0552365Z ##[debug]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az arg: account set --subscription "b53cd405-74a5-4714-9549-88af4dc84f66"
2022-04-15T11:39:10.0553538Z ##[debug]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az arg: account set --subscription "b53cd405-74a5-4714-9549-88af4dc84f66"
2022-04-15T11:39:10.0554496Z ##[debug]exec tool: /opt/hostedtoolcache/Python/3.7.12/x64/bin/az
2022-04-15T11:39:10.0555341Z ##[debug]exec tool: /opt/hostedtoolcache/Python/3.7.12/x64/bin/az
2022-04-15T11:39:10.0556080Z ##[debug]arguments:
2022-04-15T11:39:10.0556744Z ##[debug]arguments:
2022-04-15T11:39:10.0557411Z ##[debug]   account
2022-04-15T11:39:10.0558064Z ##[debug]   account
2022-04-15T11:39:10.0558707Z ##[debug]   set
2022-04-15T11:39:10.0559350Z ##[debug]   set
2022-04-15T11:39:10.0560114Z ##[debug]   --subscription
2022-04-15T11:39:10.0560901Z ##[debug]   --subscription
2022-04-15T11:39:10.0561727Z ##[debug]   b53cd405-74a5-4714-9549-88af4dc84f66
2022-04-15T11:39:10.0562923Z ##[debug]   b53cd405-74a5-4714-9549-88af4dc84f66
2022-04-15T11:39:10.0563915Z [command]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az account set --subscription b53cd405-74a5-4714-9549-88af4dc84f66
2022-04-15T11:39:10.2627408Z ##[debug]addSpnToEnvironment=false
2022-04-15T11:39:10.2630781Z ##[debug]exec tool: /usr/bin/bash
2022-04-15T11:39:10.2631640Z ##[debug]arguments:
2022-04-15T11:39:10.2632496Z ##[debug]   /home/vsts/work/_temp/azureclitaskscript1650022747263.sh
2022-04-15T11:39:10.2633263Z [command]/usr/bin/bash /home/vsts/work/_temp/azureclitaskscript1650022747263.sh
2022-04-15T11:40:54.8718857Z ERROR: {'Azure-cli-ml Version': '1.37.0', 'Error': WebserviceException:
2022-04-15T11:40:54.8720378Z    Message: Service deployment polling reached non-successful terminal state, current service state: Failed
2022-04-15T11:40:54.8721406Z Operation ID: 0423148f-a9bb-4727-8c85-a8521c150db6
2022-04-15T11:40:54.8722902Z More information can be found using '.get_logs()'
2022-04-15T11:40:54.8723482Z Error:
2022-04-15T11:40:54.8724339Z {
2022-04-15T11:40:54.8724718Z   "code": "KubernetesDeploymentFailed",
2022-04-15T11:40:54.8725018Z   "statusCode": 400,
2022-04-15T11:40:54.8725391Z   "message": "Kubernetes Deployment failed",
2022-04-15T11:40:54.8725693Z   "details": [
2022-04-15T11:40:54.8725927Z     {
2022-04-15T11:40:54.8726178Z       "code": "CrashLoopBackOff",
2022-04-15T11:40:54.8727586Z       "message": "Error in entry script, ImportError: cannot import name 'Markup' from 'jinja2' (/azureml-envs/azureml_f519b1e1dda3dd87ba0a24fcd626a531/lib/python3.7/site-packages/jinja2/__init__.py), please run print(service.get_logs()) to get details."
2022-04-15T11:40:54.8728258Z     },
2022-04-15T11:40:54.8728470Z     {
2022-04-15T11:40:54.8728724Z       "code": "DeploymentFailed",
2022-04-15T11:40:54.8729111Z       "message": "Your container endpoint is not available. Please follow the steps to debug:
2022-04-15T11:40:54.8729773Z    1. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs. Please refer to https://aka.ms/debugimage#dockerlog for more information.
2022-04-15T11:40:54.8730942Z    2. You can also interactively debug your scoring file locally. Please refer to https://docs.microsoft.com/azure/machine-learning/how-to-debug-visual-studio-code#debug-and-troubleshoot-deployments for more information.
2022-04-15T11:40:54.8732481Z    3. For AKS deployment with custom certificate, you need to update your DNS record to point to the IP address of scoring endpoint. Please refer to https://docs.microsoft.com/azure/machine-learning/how-to-secure-web-service#update-your-dns for more information.
2022-04-15T11:40:54.8733291Z    4. View the diagnostic events to check status of container, it may help you to debug the issue.
2022-04-15T11:40:54.8734440Z {"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Warning","Reason":"FailedScheduling","Message":"0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims.","LastTimestamp":null}
2022-04-15T11:40:54.8735950Z {"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Normal","Reason":"Scheduled","Message":"Successfully assigned azureml-mlops-ws01/priority-predictor-aks-66b46b95ff-ldth8 to aks-agentpool-28648132-vmss000000","LastTimestamp":null}
2022-04-15T11:40:54.8737493Z {"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Normal","Reason":"Pulled","Message":"Container image "mcr.microsoft.com/azureml/dependency-unpacker:20210714" already present on machine","LastTimestamp":"2022-04-15T11:40:29Z"}
2022-04-15T11:40:54.8738805Z {"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Normal","Reason":"Created","Message":"Created container amlappinit","LastTimestamp":"2022-04-15T11:40:29Z"}
2022-04-15T11:40:54.8739958Z {"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Normal","Reason":"Started","Message":"Started container amlappinit","LastTimestamp":"2022-04-15T11:40:29Z"}
2022-04-15T11:40:54.8741227Z {"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Warning","Reason":"Unhealthy","Message":"Readiness probe failed: HTTP probe failed with statuscode: 502","LastTimestamp":"2022-04-15T11:40:34Z"}
2022-04-15T11:40:54.8742769Z {"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Normal","Reason":"Pulled","Message":"Container image "fa9d2517040e4c2ebe65676d5c708710.azurecr.io/azureml/azureml_7edc277d6c8d8b206349eb9df96c2342" already present on machine","LastTimestamp":"2022-04-15T11:40:35Z"}
2022-04-15T11:40:54.8744162Z {"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Normal","Reason":"Created","Message":"Created container priority-predictor-aks","LastTimestamp":"2022-04-15T11:40:35Z"}
2022-04-15T11:40:54.8745491Z {"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Normal","Reason":"Started","Message":"Started container priority-predictor-aks","LastTimestamp":"2022-04-15T11:40:35Z"}
2022-04-15T11:40:54.8746694Z {"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Warning","Reason":"BackOff","Message":"Back-off restarting failed container","LastTimestamp":"2022-04-15T11:40:39Z"}
2022-04-15T11:40:54.8747274Z "
2022-04-15T11:40:54.8747481Z     }
2022-04-15T11:40:54.8747671Z   ]
2022-04-15T11:40:54.8747878Z }
2022-04-15T11:40:54.8748113Z    InnerException None
2022-04-15T11:40:54.8748358Z    ErrorResponse 
2022-04-15T11:40:54.8748628Z {
2022-04-15T11:40:54.8748830Z     "error": {
2022-04-15T11:40:54.8760211Z         "message": "Service deployment polling reached non-successful terminal state, current service state: Failed\nOperation ID: 0423148f-a9bb-4727-8c85-a8521c150db6\nMore information can be found using '.get_logs()'\nError:\n{\n  \"code\": \"KubernetesDeploymentFailed\",\n  \"statusCode\": 400,\n  \"message\": \"Kubernetes Deployment failed\",\n  \"details\": [\n    {\n      \"code\": \"CrashLoopBackOff\",\n      \"message\": \"Error in entry script, ImportError: cannot import name 'Markup' from 'jinja2' (/azureml-envs/azureml_f519b1e1dda3dd87ba0a24fcd626a531/lib/python3.7/site-packages/jinja2/__init__.py), please run print(service.get_logs()) to get details.\"\n    },\n    {\n      \"code\": \"DeploymentFailed\",\n      \"message\": \"Your container endpoint is not available. Please follow the steps to debug:\n\t1. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs. Please refer to https://aka.ms/debugimage#dockerlog for more information.\n\t2. You can also interactively debug your scoring file locally. Please refer to https://docs.microsoft.com/azure/machine-learning/how-to-debug-visual-studio-code#debug-and-troubleshoot-deployments for more information.\n\t3. For AKS deployment with custom certificate, you need to update your DNS record to point to the IP address of scoring endpoint. Please refer to https://docs.microsoft.com/azure/machine-learning/how-to-secure-web-service#update-your-dns for more information.\n\t4. 
View the diagnostic events to check status of container, it may help you to debug the issue.\n{\"InvolvedObject\":\"priority-predictor-aks-66b46b95ff-ldth8\",\"InvolvedKind\":\"Pod\",\"Type\":\"Warning\",\"Reason\":\"FailedScheduling\",\"Message\":\"0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims.\",\"LastTimestamp\":null}\n{\"InvolvedObject\":\"priority-predictor-aks-66b46b95ff-ldth8\",\"InvolvedKind\":\"Pod\",\"Type\":\"Normal\",\"Reason\":\"Scheduled\",\"Message\":\"Successfully assigned azureml-mlops-ws01/priority-predictor-aks-66b46b95ff-ldth8 to aks-agentpool-28648132-vmss000000\",\"LastTimestamp\":null}\n{\"InvolvedObject\":\"priority-predictor-aks-66b46b95ff-ldth8\",\"InvolvedKind\":\"Pod\",\"Type\":\"Normal\",\"Reason\":\"Pulled\",\"Message\":\"Container image \"mcr.microsoft.com/azureml/dependency-unpacker:20210714\" already present on machine\",\"LastTimestamp\":\"2022-04-15T11:40:29Z\"}\n{\"InvolvedObject\":\"priority-predictor-aks-66b46b95ff-ldth8\",\"InvolvedKind\":\"Pod\",\"Type\":\"Normal\",\"Reason\":\"Created\",\"Message\":\"Created container amlappinit\",\"LastTimestamp\":\"2022-04-15T11:40:29Z\"}\n{\"InvolvedObject\":\"priority-predictor-aks-66b46b95ff-ldth8\",\"InvolvedKind\":\"Pod\",\"Type\":\"Normal\",\"Reason\":\"Started\",\"Message\":\"Started container amlappinit\",\"LastTimestamp\":\"2022-04-15T11:40:29Z\"}\n{\"InvolvedObject\":\"priority-predictor-aks-66b46b95ff-ldth8\",\"InvolvedKind\":\"Pod\",\"Type\":\"Warning\",\"Reason\":\"Unhealthy\",\"Message\":\"Readiness probe failed: HTTP probe failed with statuscode: 502\",\"LastTimestamp\":\"2022-04-15T11:40:34Z\"}\n{\"InvolvedObject\":\"priority-predictor-aks-66b46b95ff-ldth8\",\"InvolvedKind\":\"Pod\",\"Type\":\"Normal\",\"Reason\":\"Pulled\",\"Message\":\"Container image \"fa9d2517040e4c2ebe65676d5c708710.azurecr.io/azureml/azureml_7edc277d6c8d8b206349eb9df96c2342\" already present on 
machine\",\"LastTimestamp\":\"2022-04-15T11:40:35Z\"}\n{\"InvolvedObject\":\"priority-predictor-aks-66b46b95ff-ldth8\",\"InvolvedKind\":\"Pod\",\"Type\":\"Normal\",\"Reason\":\"Created\",\"Message\":\"Created container priority-predictor-aks\",\"LastTimestamp\":\"2022-04-15T11:40:35Z\"}\n{\"InvolvedObject\":\"priority-predictor-aks-66b46b95ff-ldth8\",\"InvolvedKind\":\"Pod\",\"Type\":\"Normal\",\"Reason\":\"Started\",\"Message\":\"Started container priority-predictor-aks\",\"LastTimestamp\":\"2022-04-15T11:40:35Z\"}\n{\"InvolvedObject\":\"priority-predictor-aks-66b46b95ff-ldth8\",\"InvolvedKind\":\"Pod\",\"Type\":\"Warning\",\"Reason\":\"BackOff\",\"Message\":\"Back-off restarting failed container\",\"LastTimestamp\":\"2022-04-15T11:40:39Z\"}\n\"\n    }\n  ]\n}"
2022-04-15T11:40:54.8768343Z     }
2022-04-15T11:40:54.8768568Z }}
2022-04-15T11:40:55.1482735Z ##[debug]Exit code 1 received from tool '/usr/bin/bash'
2022-04-15T11:40:55.1485829Z ##[debug]STDIO streams have closed for tool '/usr/bin/bash'
2022-04-15T11:40:55.1498579Z ##[debug]task result: Failed
2022-04-15T11:40:55.1500313Z ##[error]Script failed with exit code: 1
2022-04-15T11:40:55.1501933Z ##[debug]Processed: ##vso[task.issue type=error;]Script failed with exit code: 1
2022-04-15T11:40:55.1503732Z ##[debug]Processed: ##vso[task.complete result=Failed;]Script failed with exit code: 1
2022-04-15T11:40:55.1508979Z ##[debug]which 'az'
2022-04-15T11:40:55.1510045Z ##[debug]found: '/opt/hostedtoolcache/Python/3.7.12/x64/bin/az'
2022-04-15T11:40:55.1510998Z ##[debug]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az arg:  account clear
2022-04-15T11:40:55.1511951Z ##[debug]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az arg:  account clear
2022-04-15T11:40:55.1512869Z ##[debug]exec tool: /opt/hostedtoolcache/Python/3.7.12/x64/bin/az
2022-04-15T11:40:55.1513777Z ##[debug]exec tool: /opt/hostedtoolcache/Python/3.7.12/x64/bin/az
2022-04-15T11:40:55.1514593Z ##[debug]arguments:
2022-04-15T11:40:55.1516173Z ##[debug]arguments:
2022-04-15T11:40:55.1516893Z ##[debug]   account
2022-04-15T11:40:55.1517576Z ##[debug]   account
2022-04-15T11:40:55.1518236Z ##[debug]   clear
2022-04-15T11:40:55.1518888Z ##[debug]   clear
2022-04-15T11:40:55.1519447Z [command]/opt/hostedtoolcache/Python/3.7.12/x64/bin/az account clear
2022-04-15T11:40:55.3631130Z ##[section]Finishing: Deploy ML Model To AKS
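
The diagnostic events embedded in the error above are printed one JSON object per line. A minimal sketch for pulling out only the warnings, assuming the event lines have been copied out of the log with the timestamp prefix removed; the function name `warning_events` is hypothetical. Lines that are not valid JSON are skipped (the "Pulled" events above contain unescaped inner quotes, so they do not parse).

```python
import json
from typing import List


def warning_events(lines: List[str]) -> List[dict]:
    """Return Reason/Message pairs for every event line whose Type is "Warning"."""
    warnings = []
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            # Skip lines that are not well-formed JSON.
            continue
        if isinstance(event, dict) and event.get("Type") == "Warning":
            warnings.append(
                {"Reason": event.get("Reason"), "Message": event.get("Message")}
            )
    return warnings


# Event lines copied from the log above (timestamps stripped).
SAMPLE = [
    '{"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Normal","Reason":"Pulled","Message":"Container image "mcr.microsoft.com/azureml/dependency-unpacker:20210714" already present on machine","LastTimestamp":"2022-04-15T11:40:29Z"}',
    '{"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Warning","Reason":"Unhealthy","Message":"Readiness probe failed: HTTP probe failed with statuscode: 502","LastTimestamp":"2022-04-15T11:40:34Z"}',
    '{"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Normal","Reason":"Started","Message":"Started container priority-predictor-aks","LastTimestamp":"2022-04-15T11:40:35Z"}',
    '{"InvolvedObject":"priority-predictor-aks-66b46b95ff-ldth8","InvolvedKind":"Pod","Type":"Warning","Reason":"BackOff","Message":"Back-off restarting failed container","LastTimestamp":"2022-04-15T11:40:39Z"}',
]

if __name__ == "__main__":
    for event in warning_events(SAMPLE):
        print(event["Reason"], "-", event["Message"])
```

Filtering this way makes it easier to see that the scoring container starts and then fails its readiness probe, which is consistent with the entry script crashing at import time.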

Below is the Conda environment file used to install dependencies for the CI stage:

name: iris_demo
channels:
  - defaults
  - conda-forge
dependencies:
  # The python interpreter version.
  # Currently Azure ML Workbench only supports 3.5.2 and later.
  - python=3.7
  - pip>=19.1.1
  - numpy #>=1.16.5
  - pandas #==0.25.1
  - pytest>=3.6.4
  - pip:
      - azureml-sdk==1.17.0
      #- azureml-sdk==1.17.0
      - black>=18.6b4
      - cached-property==1.5.1
      - jsonlines>=1.2.0
      - nteract-scrapbook>=0.2.1
      - pydocumentdb>=2.3.3
      #- tqdm==4.32.2
      - tqdm
      - pandas
      - pyemd==0.5.1
      - ipywebrtc==0.4.3
      - pre-commit>=1.14.4
      - scipy
      - jinja2
      #- scikit-learn>=0.19.0,<=0.20.3
      - scikit-learn==0.22.1
      - requests==2.22.0
      - requests-oauthlib==1.2.0
      - regex==2020.2.20
      - seaborn
      - joblib
      - mlxtend==0.18.0
      - spacy==2.3.2
      - matplotlib
      - https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz
      - imblearn
      - numpy

I have also installed these packages as part of the build and release steps.

pip install --upgrade azure-cli==2.22.0
pip install --upgrade azureml-sdk[cli]
pip install pytest
pip install pytest-cov
pip install jinja2
#pip install -r requirements.txt

This is the YAML file for the build stage (CI):

# Variable 'ml.computeIdelSecs' was defined in the Variables tab
# Variable 'ml.experimentName' was defined in the Variables tab
# Variable Group 'Ticket Priority Build Variables' was defined in the Variables tab
jobs:
- job: Job_1
  displayName: Agent job 1
  pool:
    vmImage: ubuntu-20.04
  steps:
  - checkout: self
  - task: UsePythonVersion@0
    displayName: Use Python 3.7
    inputs:
      versionSpec: 3.7
  - task: Bash@3
    displayName: Install Requirements
    inputs:
      filePath: environment_setup/install-requirements.sh
      workingDirectory: environment_setup
  - task: AzureCLI@2
    displayName: Install Azure CLI ML Extension
    inputs:
      connectedServiceNameARM: f3d4ae49-f833-4f89-b077-478b60d64bcc
      scriptType: bash
      scriptLocation: inlineScript
      inlineScript: az extension add -n azure-cli-ml
  - task: AzureCLI@2
    displayName: Create/Use ML Workspace
    inputs:
      connectedServiceNameARM: f3d4ae49-f833-4f89-b077-478b60d64bcc
      scriptType: bash
      scriptLocation: inlineScript
      inlineScript: az ml workspace create -g $(ml.resourceGroup) -w $(ml.workspace) -l $(ml.region) --exist-ok --yes
  - task: AzureCLI@2
    displayName: Create/Use Compute Target
    inputs:
      connectedServiceNameARM: f3d4ae49-f833-4f89-b077-478b60d64bcc
      scriptType: bash
      scriptLocation: inlineScript
      inlineScript: az ml computetarget create amlcompute -g $(ml.resourceGroup) -w $(ml.workspace) -n $(ml.computeName) -s $(ml.computeVMSize) --min-nodes $(ml.computeMinNodes) --max-nodes $(ml.computeMaxNodes) --idle-seconds-before-scaledown $(ml.computeIdelSecs)
  - task: AzureCLI@2
    displayName: Upload Data to Default Data Store
    inputs:
      connectedServiceNameARM: f3d4ae49-f833-4f89-b077-478b60d64bcc
      scriptType: bash
      scriptLocation: inlineScript
      inlineScript: az ml datastore upload -w $(ml.workspace) -g $(ml.resourceGroup) -n $(az ml datastore show-default -w $(ml.workspace) -g $(ml.resourceGroup) --query name -o tsv) -p data -u Ticket_Priority_Data
  - task: Bash@3
    displayName: Create Metadata & Model folders
    inputs:
      targetType: inline
      script: >
        mkdir metadata && mkdir models
  - task: AzureCLI@2
    displayName: Training Model
    inputs:
      connectedServiceNameARM: f3d4ae49-f833-4f89-b077-478b60d64bcc
      scriptType: bash
      scriptLocation: inlineScript
      inlineScript: az ml run submit-script -g $(ml.resourceGroup) -w $(ml.workspace) -e $(ml.experimentName) --ct $(ml.computeName) -c training --source-directory . --path environment_setup -t ./metadata/run.json training.py --container_name Ticket_Priority_Data --input_csv final_data_priority.csv --model_path ./models/ticketpriority_model.pkl --artifact_loc ./outputs/models/ --dataset_name ticketpriority_ds --dataset_desc "Ticket Priority Data"
  - task: AzureCLI@2
    displayName: Register Model in to Model Registry
    inputs:
      connectedServiceNameARM: f3d4ae49-f833-4f89-b077-478b60d64bcc
      scriptType: bash
      scriptLocation: inlineScript
      inlineScript: az ml model register -g $(ml.resourceGroup) -w $(ml.workspace) -n TicketPriority --asset-path outputs/models/ -d "TicketPriority LR Classifier" --tag "model"="Stacking Classifier"  --model-framework Custom -f ./metadata/run.json -t metadata/model.json
  - task: CopyFiles@2
    displayName: Copy File to Pipeline Artifact
    inputs:
      SourceFolder: $(Build.SourcesDirectory)
      Contents: >-
        **/metadata/*

        **/environment_setup/*

        **/deployment/*

        **/inference/*

        **/tests/smoke/*

        **/outputs/prediction.csv
      TargetFolder: $(Build.ArtifactStagingDirectory)
      CleanTargetFolder: true
      OverWrite: true
      flattenFolders: true
      preserveTimestamp: true
  - task: PublishPipelineArtifact@1
    displayName: Publish Pipeline Artifact
    inputs:
      artifactName: TicketPriorityClassifier
...
Karishma-Tiwari-MSFT commented 2 years ago

Thanks for posting your query. This azure-docs repository handles feedback related to a particular Azure documentation page (such as correcting doc bugs, doc enhancements, and product issues caused by wrong doc instructions). If this issue is intended to update, correct, or enhance something in an Azure document, please share the document link (URL) and the feedback related to it.

Otherwise, if you are looking for help with troubleshooting or how-to questions, I would suggest reaching out to the Microsoft Q&A and Stack Overflow communities with the relevant tags, or opening a support ticket.

Karishma-Tiwari-MSFT commented 2 years ago

We will now close this issue. Please share the requested details about the document URL and tag me in a comment. I will reopen it and we will gladly continue the discussion.

thomasfrederikhoeck commented 2 years ago

@saugatapaul1010 did you manage to solve it? We are seeing the same error.