bentoml / BentoML

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
https://bentoml.com
Apache License 2.0

Unable to deploy to sagemaker #2321

Closed mauricioalarcon closed 2 years ago

mauricioalarcon commented 2 years ago

Describe the bug

Trying to deploy to SageMaker via bentoml sagemaker deploy iris -b IrisClassifier:20211116135509_5F8DAE --api-name predict returns:

Error: sagemaker deploy failed: INTERNAL:Failed to build docker image 270286309069.dkr.ecr.us-east-1.amazonaws.com/irisclassifier-sagemaker:20220225160251_ed85f0: The command '/bin/sh -c apt-get update --fix-missing && apt-get install -y nginx && apt-get clean' returned a non-zero code: 100
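For triage, the error string above can be split into the image reference, the failing Dockerfile command, and the exit code. This is a minimal sketch and not part of BentoML: the function name `parse_build_error` and the regular expression are illustrative assumptions inferred only from the message shown in this report. apt-get documents exit code 100 as its general error status, so the code alone does not tell which of the three chained apt-get steps failed.

```python
import re
from typing import NamedTuple, Optional


class BuildError(NamedTuple):
    image: str      # full image reference: registry/repository:tag
    command: str    # the Dockerfile RUN command that failed
    exit_code: int  # exit status reported by the docker build


# Pattern inferred from the single error message in this report
# (an assumption, not a documented BentoML or Docker format).
_PATTERN = re.compile(
    r"Failed to build docker image (?P<image>\S+): "
    r"The command '(?P<command>.+)' returned a non-zero code: (?P<code>\d+)"
)


def parse_build_error(message: str) -> Optional[BuildError]:
    """Return the image, command, and exit code, or None if nothing matches."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    # Collapse the runs of spaces that appear in the verbose log variant.
    command = " ".join(match["command"].split())
    return BuildError(match["image"], command, int(match["code"]))


message = (
    "Error: sagemaker deploy failed: INTERNAL:Failed to build docker image "
    "270286309069.dkr.ecr.us-east-1.amazonaws.com/irisclassifier-sagemaker:"
    "20220225160251_ed85f0: The command '/bin/sh -c apt-get update "
    "--fix-missing && apt-get install -y nginx && apt-get clean' "
    "returned a non-zero code: 100"
)
error = parse_build_error(message)
print(error)
```

Running the extracted command by hand inside the same base image is one way to see the underlying apt-get output, which the deploy command does not surface.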

To Reproduce

Steps to reproduce the issue:

  1. Build IrisClassifier as described in the documentation
  2. Configure the AWS CLI as per these instructions
  3. Have Docker up and running
  4. Run bentoml sagemaker deploy iris -b IrisClassifier:20211116135509_5F8DAE --api-name predict
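The steps above rely on three command-line tools being installed: bentoml, aws, and docker. A minimal preflight sketch, assuming nothing beyond the Python standard library, checks that each is on PATH before attempting the deploy. The helper name `missing_tools` is hypothetical, and the check does not confirm that the Docker daemon is running or that AWS credentials are valid.

```python
import shutil
from typing import Callable, Iterable, List, Optional


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


# The CLIs that the reproduction steps invoke directly or indirectly.
REQUIRED = ("bentoml", "aws", "docker")

missing = missing_tools(REQUIRED)
if missing:
    print("Missing from PATH:", ", ".join(missing))
else:
    print("All required CLIs found on PATH.")
```

The `which` parameter is injectable only so the function can be exercised without the tools installed.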

Expected behavior

A new deployment in SageMaker

Current behavior

bentoml sagemaker deploy iris -b IrisClassifier:20211116135509_5F8DAE --api-name predict --verbose
[2022-02-28 18:23:18,038] DEBUG - Configured logging with simple configuration, level=DEBUG, directory=/Users/userxxxx/bentoml/logs, console_enabled=True, file_enabled=True
[2022-02-28 18:23:18,038] DEBUG - Setting debug mode: ON for current session
AWS Sagemaker deployment functionalities are being migrated to a separate tool and related CLI commands will be deprecated in BentoML itself, please use https://github.com/bentoml/aws-sagemaker-deploy going forward.
[2022-02-28 18:23:18,493] DEBUG - Creating local YataiService instance
[2022-02-28 18:23:18,718] DEBUG - Upgrading tables to the latest revision
Deploying Sagemaker deployment
[2022-02-28 18:23:18,752] DEBUG - Session acquired
[2022-02-28 18:23:18,752] DEBUG -       READ on iris_
[2022-02-28 18:23:18,756] DEBUG - Session released after 0.0038559436798095703s
[2022-02-28 18:23:18,759] DEBUG - Session acquired
[2022-02-28 18:23:18,760] DEBUG -       WRITE on iris_
[2022-02-28 18:23:18,760] DEBUG -       READ on IrisClassifier_20220225160251_ED85F0
[2022-02-28 18:23:18,862] DEBUG - Session acquired
[2022-02-28 18:23:18,862] DEBUG -       READ on IrisClassifier_20220225160251_ED85F0
[2022-02-28 18:23:18,870] DEBUG - Session released after 0.007516145706176758s
[2022-02-28 18:23:19,045] DEBUG - Created temporary directory: /private/var/folders/64/692xs7fj6zb2f87tbtlkj72w0000gn/T/bentoml-temp-bsiam34a
[2022-02-28 18:23:19,928] DEBUG - Getting docker login info from AWS
[2022-02-28 18:23:20,109] DEBUG - Getting docker login info from AWS
[2022-02-28 18:23:20,111] DEBUG - Building docker image: 270286309069.dkr.ecr.us-east-1.amazonaws.com/irisclassifier-sagemaker:20220225160251_ed85f0
[2022-02-28 18:23:32,243] ERROR - Failed to build docker image 270286309069.dkr.ecr.us-east-1.amazonaws.com/irisclassifier-sagemaker:20220225160251_ed85f0: The command '/bin/sh -c apt-get update --fix-missing &&     apt-get install -y nginx &&     apt-get clean' returned a non-zero code: 100
[2022-02-28 18:23:32,243] DEBUG - BentoML in debug mode, keeping temp directory "/private/var/folders/64/692xs7fj6zb2f87tbtlkj72w0000gn/T/bentoml-temp-bsiam34a"
[2022-02-28 18:23:32,249] DEBUG - ApplyDeployment (iris, namespace dev) failed: Failed to build docker image 270286309069.dkr.ecr.us-east-1.amazonaws.com/irisclassifier-sagemaker:20220225160251_ed85f0: The command '/bin/sh -c apt-get update --fix-missing &&     apt-get install -y nginx &&     apt-get clean' returned a non-zero code: 100
Error: sagemaker deploy failed: INTERNAL:Failed to build docker image 270286309069.dkr.ecr.us-east-1.amazonaws.com/irisclassifier-sagemaker:20220225160251_ed85f0: The command '/bin/sh -c apt-get update --fix-missing &&     apt-get install -y nginx &&     apt-get clean' returned a non-zero code: 100
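The verbose output follows a `[timestamp] LEVEL - message` shape, with progress-spinner characters sometimes fused onto the end of an entry. A minimal sketch for pulling the ERROR entries out of such a capture is shown here; the function name `iter_entries` and the pattern are assumptions inferred from the output in this report, not a documented BentoML log format. The sample lines are copied from the log.

```python
import re
from typing import Iterator, Tuple

# "[YYYY-MM-DD HH:MM:SS,mmm] LEVEL - " prefix, inferred from the log above.
_STAMP = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] ([A-Z]+) - ")


def iter_entries(text: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (timestamp, level, message) for each stamped entry in `text`."""
    matches = list(_STAMP.finditer(text))
    for index, match in enumerate(matches):
        # An entry runs until the next stamp, even when both share one line.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        first_line = text[match.end():end].split("\n", 1)[0]
        # Drop trailing padding and spinner characters (- / | \). A message
        # that genuinely ends in one of these would be trimmed as well.
        yield match[1], match[2], first_line.rstrip(" \t-/|\\")


sample = (
    "[2022-02-28 18:23:18,870] DEBUG - Session released after 0.007516145706176758s      |\n"
    "[2022-02-28 18:23:19,928] DEBUG - Getting docker login info from AWS      /"
    "[2022-02-28 18:23:20,109] DEBUG - Getting docker login info from AWS\n"
    "[2022-02-28 18:23:32,243] ERROR - Failed to build docker image "
    "270286309069.dkr.ecr.us-east-1.amazonaws.com/irisclassifier-sagemaker:"
    "20220225160251_ed85f0: The command '/bin/sh -c apt-get update --fix-missing &&"
    "     apt-get install -y nginx &&     apt-get clean' returned a non-zero code: 100\n"
)

errors = [entry for entry in iter_entries(sample) if entry[1] == "ERROR"]
for timestamp, level, message in errors:
    print(timestamp, message)
```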

Environment:

jjmachan commented 2 years ago

Hey @mauricioalarcon, thanks for reporting this issue!

Deploying to the cloud directly with bentoml is being deprecated. The recommended approach for deploying BentoML services built with 0.13 is to use the scripts here: https://github.com/bentoml/aws-sagemaker-deploy/tree/pre-v1.0. Can you try with those?

From the log messages, this looks like an issue with the base image, which should be fixed in the scripts.

mauricioalarcon commented 2 years ago

Thank you @jjmachan - I tried the latest pre-release but found a bug. I opened an issue in that repository and also created a PR to address it.

jjmachan commented 2 years ago

@mauricioalarcon can't thank you enough 😃! I'm closing this issue now, but feel free to open one in the bentoml/aws-sagemaker-deploy repo if you run into any other issues.