MicrosoftDocs / pipelines-azureml

Example Azure Pipeline to train and deploy a machine learning model using the Azure Machine Learning service
Creative Commons Attribution 4.0 International

Model not found in cache or in root at ./diabetes-model #14

Open methodidacte opened 4 years ago

methodidacte commented 4 years ago

Hello,

After following the steps of the Azure Pipeline, I got this error:

"message": "Service deployment polling reached non-successful terminal state, current service state: Unhealthy\nOperation ID: e9252f0d-81f8-44e5-bd6d-983076eca1f5\nMore information can be found using '.get_logs()'\nError:\n{\n \"code\": \"DeploymentTimedOut\",\n \"statusCode\": 504,\n \"message\": \"The deployment operation polling has TimedOut. The service creation is taking longer than our normal time. We are still trying to achieve the desired state for the web service. Please check the webservice state for the current webservice health. You can run print(service.state) from the python SDK to retrieve the current state of the webservice.\"\n}

Looking at the logs with get_logs(), I found this line in the output: Model not found in cache or in root at ./diabetes-model

The az CLI command is the following: az ml model deploy -n diabetes-qa-aci -f model.json --ic config/inference-config.yml --dc config/deployment-config-aci.yml --overwrite -v

model.json is created by the previous step and contains: { "cpu": "", "createdTime": "2020-06-09T04:57:54.550301+00:00", "description": "", "experimentName": "diabetes-exp", "framework": "Custom", "frameworkVersion": null, "gpu": "", "id": "diabetes_reg_model:2", "memoryInGB": "", "name": "diabetes_reg_model", "properties": "", "runId": "diabetes-exp_1591678184_b25da442", "sampleInputDatasetId": "", "sampleOutputDatasetId": "", "tags": "", "version": 2 }

Any ideas?

methodidacte commented 4 years ago

The model name is hard-coded in the score.py script: model_path = Model.get_model_path('diabetes-model')

Is there a way to parameterize it?

omartin2010 commented 4 years ago

In a nutshell, I would say it's not really possible to do so. There has to be a model name, and it must correspond to the name you registered in Azure Machine Learning in the DevOps pipeline. That exact name has to be reused in score.py to rehydrate the model when the init function runs.
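
Note that the model.json posted above reports the registered name as diabetes_reg_model, while score.py asks for 'diabetes-model', so the two do not match. Here is a minimal sketch of a check you could run before deploying. It only uses the standard library; the helper names (registered_model_name, name_matches) are hypothetical and are not part of the Azure ML SDK or of this repository, and MODEL_JSON is a trimmed copy of the file shown in the question.

```python
import json


def registered_model_name(model_json_text: str) -> str:
    """Return the 'name' field from the JSON written by the registration step."""
    return json.loads(model_json_text)["name"]


def name_matches(model_json_text: str, scoring_name: str) -> bool:
    """True if the name used in score.py equals the registered model name."""
    return registered_model_name(model_json_text) == scoring_name


# Trimmed copy of the model.json from the question (other fields omitted).
MODEL_JSON = '{"id": "diabetes_reg_model:2", "name": "diabetes_reg_model", "version": 2}'

if __name__ == "__main__":
    # 'diabetes-model' is the name hard-coded in score.py.
    scoring_name = "diabetes-model"
    if not name_matches(MODEL_JSON, scoring_name):
        print(
            "score.py asks for", repr(scoring_name),
            "but the registered model is", repr(registered_model_name(MODEL_JSON)),
        )
```

If the check reports a mismatch, one option is to pass the registered name to Model.get_model_path in score.py; the other is to register the model under the name that score.py already expects.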