basetenlabs / truss

The simplest way to serve AI/ML models in production
https://truss.baseten.co
MIT License

Integrate with SageMaker #66

Open Sam152 opened 2 years ago

Sam152 commented 2 years ago

Really like the look of this project. I saw the AWS integration guide, but I was wondering what it'd be like to integrate with SageMaker (https://aws.amazon.com/sagemaker/).

I suspect there might be quite a bit of additional work in creating training and serving containers in the format that SageMaker can consume: https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html

In addition to that, I believe the officially supported containers are framework specific and have specific drivers that connect the frameworks to the available GPUs, so I don't know how easy it'd be to even maintain a single set of containers that worked for all frameworks.

pankajroark commented 2 years ago

Afaict, it shouldn't be very hard to support SageMaker; it would just require going one level deeper in the Truss design. Truss creates a standardized model representation that can be operated on with a Context. A Context is just a function that does something with the model: there are Contexts for running models locally, exporting to a docker image, starting a docker container, etc. I think deployment to SageMaker can be implemented as a Context; it would just require creating the image slightly differently.

SageMaker has a slightly different, but not too different, mechanism for providing the model artifacts. Model artifacts (those provided under the Truss's data directory) can be uploaded to S3 as a tar file. They can then be loaded in the running container from /opt/ml/model; this can be done with a minor change in the model_wrapper.py that Truss normally uses in the docker image's setup code.
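To make the artifact flow concrete, here is a minimal sketch, not actual Truss code: the function names are hypothetical, and the only SageMaker-specific assumption is the documented /opt/ml/model mount point where SageMaker untars the artifacts from ModelDataUrl.

```python
import tarfile
from pathlib import Path

# SageMaker untars the artifacts referenced by ModelDataUrl here
# inside the serving container.
SAGEMAKER_MODEL_DIR = Path("/opt/ml/model")


def package_data_dir(data_dir: str, out_path: str) -> str:
    """Tar a Truss data directory into the model.tar.gz that SageMaker
    expects to find at the S3 location given as ModelDataUrl."""
    with tarfile.open(out_path, "w:gz") as tar:
        for item in Path(data_dir).iterdir():
            # arcname keeps paths relative, as SageMaker expects.
            tar.add(item, arcname=item.name)
    return out_path


def resolve_data_dir(default_dir: str) -> Path:
    """The 'minor change in model_wrapper.py': prefer the SageMaker
    mount point when it exists, else the usual Truss data directory."""
    if SAGEMAKER_MODEL_DIR.is_dir():
        return SAGEMAKER_MODEL_DIR
    return Path(default_dir)
```

The actual upload to S3 would sit alongside package_data_dir in the hypothetical SageMaker Context.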

This could be a great contribution to Truss. I'd love for someone to pick this up. Otherwise, I can take a stab at it if/when there's enough demand for it.

bolasim commented 1 year ago

Hey @Sam152! We have come up with a new spec for pushing directly to Sagemaker and are hoping to roll this issue into that work (we've already made the server and docker image compatible with Sagemaker; the next step is coordinating the push by pushing the image to ECR and then creating a Sagemaker service).
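The deploy half of that flow could look roughly like the sketch below. This is not the new spec, just an illustration: deploy_to_sagemaker is a hypothetical helper, and the client is assumed to be a boto3 SageMaker client, whose create_model, create_endpoint_config, and create_endpoint calls are the real AWS APIs.

```python
def deploy_to_sagemaker(
    sm_client,            # e.g. boto3.client("sagemaker")
    model_name: str,
    ecr_image_uri: str,   # the Truss image, already pushed to ECR
    model_data_url: str,  # s3:// URL of the artifacts tarball
    role_arn: str,        # execution role SageMaker assumes
    instance_type: str = "ml.m5.large",
) -> str:
    """Register the image + artifacts as a SageMaker model, then stand
    up an endpoint serving it. Returns the endpoint name."""
    sm_client.create_model(
        ModelName=model_name,
        PrimaryContainer={
            "Image": ecr_image_uri,
            "ModelDataUrl": model_data_url,
        },
        ExecutionRoleArn=role_arn,
    )
    sm_client.create_endpoint_config(
        EndpointConfigName=f"{model_name}-config",
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
        }],
    )
    sm_client.create_endpoint(
        EndpointName=f"{model_name}-endpoint",
        EndpointConfigName=f"{model_name}-config",
    )
    return f"{model_name}-endpoint"
```

Pushing the image to ECR (docker tag + docker push against an ECR repository) would happen before this call.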

If you're interested in contributing, please reach out. We'll try to keep this post updated.

enricorotundo commented 1 year ago

Hi @bolasim, I've been trying to assess whether it's possible to configure Truss to use model weights hosted on an S3 bucket (manually pushed) and then use these instructions to deploy to Sagemaker (aws sagemaker create-model [...] --ModelDataUrl <bucket-url>). Is this similar to what you're working on in the new spec above (apart from the manual push to S3)? Do you have a dedicated branch I can look into?