bentoml / bentoctl

Fast model deployment on any cloud 🚀
https://bentoml.com

PATH in docker image is not set correctly #159

Closed TheisFerre closed 2 years ago

TheisFerre commented 2 years ago

Hi

I am trying to use BentoML, bentoctl and the aws-sagemaker-deploy operator to deploy a simple sklearn model. The sklearn model is tracked in an MLflow experiment, from which I am able to load it and build a bento.

Once I have built the bento, I want to use bentoctl to create a Docker image that is ready to use in AWS SageMaker, using the aws-sagemaker-deploy operator. However, I don't want to create a new ECR repository or use the Terraform template for the infrastructure related to the SageMaker endpoint. Because of that, I run bentoctl with the --dry-run flag: bentoctl build ... --dry-run. This works and my container is built successfully. [image]

After it is built, I push the container to my own ECR repository. To test that the container is compatible with SageMaker, I created the endpoint configuration, SageMaker model and endpoint myself. However, when I create the endpoint I get an error.
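The manual setup described above can be sketched with the three boto3 SageMaker client calls `create_model`, `create_endpoint_config` and `create_endpoint`. This is only a minimal sketch under stated assumptions: the function name, the `-model`/`-config`/`-endpoint` naming scheme, the variant name and the default instance type are hypothetical and not taken from this thread, and the client is passed in so the code itself does not depend on AWS credentials.

```python
from typing import Any, Dict


def deploy_existing_image(
    client: Any,
    name: str,
    image_uri: str,
    role_arn: str,
    instance_type: str = "ml.t2.medium",  # assumed default, not from the thread
    instance_count: int = 1,
) -> Dict[str, str]:
    """Create a SageMaker model, endpoint configuration, and endpoint.

    `client` is expected to be a boto3 SageMaker client, for example
    boto3.client("sagemaker"). The resource names derived from `name`
    are a hypothetical convention chosen for this sketch.
    """
    model_name = f"{name}-model"
    config_name = f"{name}-config"
    endpoint_name = f"{name}-endpoint"

    # 1. The model points SageMaker at the image already pushed to ECR.
    client.create_model(
        ModelName=model_name,
        PrimaryContainer={"Image": image_uri},
        ExecutionRoleArn=role_arn,
    )

    # 2. The endpoint configuration says which instances host the model.
    client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InitialInstanceCount": instance_count,
                "InstanceType": instance_type,
            }
        ],
    )

    # 3. Creating the endpoint is what starts the container, so this is
    #    the step after which a startup problem in the image shows up.
    client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_name,
    )
    return {"model": model_name, "config": config_name, "endpoint": endpoint_name}
```

With boto3 installed and credentials configured, it could be called as `deploy_existing_image(boto3.client("sagemaker"), "my-bento", "<image-uri>", "<role-arn>")`, where the placeholders stand for values that are not given in this thread.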

[image]

The error is related to the PATH, so I pulled the image and started it locally. There I saw that the /home/bento folder was indeed missing from the PATH. [image]
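A check like this can also be scripted against the image configuration instead of a running container. The sketch below assumes its input is the JSON list of "KEY=VALUE" strings that `docker inspect --format '{{json .Config.Env}}' <image>` prints; the function names are hypothetical. It only reflects the environment recorded in the image, so it may differ from what a shell inside a started container reports if something changes PATH at startup.

```python
import json
from typing import List, Optional


def image_path_entries(env_json: str) -> Optional[List[str]]:
    """Return the PATH entries recorded in an image, or None if PATH is unset.

    `env_json` is a JSON list of "KEY=VALUE" strings, as found in the
    Config.Env field of `docker inspect` output.
    """
    env = json.loads(env_json) or []  # also tolerates a JSON null
    for item in env:
        key, _, value = item.partition("=")
        if key == "PATH":
            return value.split(":")
    return None


def path_contains(env_json: str, directory: str) -> bool:
    """Report whether `directory` is one of the image's PATH entries."""
    entries = image_path_entries(env_json)
    if entries is None:
        return False
    wanted = directory.rstrip("/")
    # Compare without trailing slashes so /home/bento/ matches /home/bento.
    return any(entry.rstrip("/") == wanted for entry in entries)
```

For example, `path_contains(output_of_docker_inspect, "/home/bento")` would be expected to return False for an image showing the problem described here.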

Can you help me figure out why this is the case? In the aws-sagemaker-deploy operator, the last layer of the Docker template is ENV PATH=$BENTO_PATH:$PATH, but it does not seem to work.

TheisFerre commented 2 years ago

So I managed to figure out what the problem is. As mentioned in the issue, it seems that the Dockerfile.Template of the aws-sagemaker-deploy operator does not add the BENTO_PATH to the PATH, i.e. this issue is probably more related to the operator.

To make it work, I added the following: RUN echo "export PATH=$BENTO_PATH:$PATH" >> /root/.bashrc

I am not sure why the existing layer in the Dockerfile is not adding the folder to the path though.
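One detail of the workaround is worth spelling out: because the text is in double quotes, the shell that executes the RUN instruction expands $BENTO_PATH and $PATH at build time, so /root/.bashrc receives the already-expanded value rather than the variable names. The sketch below imitates that expansion in simplified form. The helper names are hypothetical, and the value /home/bento for BENTO_PATH is an assumption based on the folder mentioned earlier in this thread. Whether /root/.bashrc is read at all depends on how the process in the container is started, which this sketch does not cover.

```python
import re
from typing import Mapping

# Matches $NAME and ${NAME}.
_VAR = re.compile(r"\$(?:\{(\w+)\}|(\w+))")


def expand_like_sh(text: str, env: Mapping[str, str]) -> str:
    """Expand $NAME and ${NAME} roughly as sh does inside double quotes.

    Unset variables expand to the empty string. This is a simplification:
    it ignores escapes, defaults such as ${NAME:-x}, and command substitution.
    """
    return _VAR.sub(lambda m: env.get(m.group(1) or m.group(2), ""), text)


def bashrc_line(build_env: Mapping[str, str]) -> str:
    """Return the line the workaround would append to /root/.bashrc."""
    # The text between the double quotes in the RUN instruction.
    return expand_like_sh("export PATH=$BENTO_PATH:$PATH", build_env)
```

If BENTO_PATH were unset in the build environment, the same instruction would write a line starting with `export PATH=:`, so the value written depends on what the variables hold at build time.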

jjmachan commented 2 years ago

@TheisFerre thank you for raising this issue, but unfortunately I still can't reproduce it. If you have joined our Slack group, can you share your Slack ID so I can reach out to you over there?

If not, do you mind sharing your Docker version and bentoctl operator version?

yubozhao commented 2 years ago

@TheisFerre Hello, I just want to check in and see how this is progressing. Did you get it working?

TheisFerre commented 2 years ago

Thanks for checking in @yubozhao!

I have been in a dialogue with @jjmachan on Slack. We found that the problem seems to occur when building the image with GitHub Actions, as I had no issues when building it locally. I did find a workaround, so the issue is not blocking me in any way right now.

I can close the issue here and create a new one in the aws-sagemaker-deploy repo, where I can provide a more detailed description.

yubozhao commented 2 years ago

Great. Thank you for the update. Feel free to open an issue in the sagemaker repo or create a PR. It will help the community a lot.

Thank you!