aws / sagemaker-inference-toolkit

Serve machine learning models within a 🐳 Docker container using 🧠 Amazon SageMaker.
Apache License 2.0

Deploying multiple model artifacts, each having their own inference handler #55

Open · ghost opened this issue 4 years ago

ghost commented 4 years ago

What did you find confusing? Please describe. I am trying to deploy multiple tarball model artifacts to a SageMaker multi-model endpoint, but would like to use a different inference handler for each model, since each model needs different pre-processing and post-processing.

Describe how documentation can be improved I see the documentation is fairly clear on how to specify a custom inference handler, but not on whether a different custom handler can be specified for each model.
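For reference, the documented way to plug in a single custom handler looks roughly like the sketch below, using the toolkit's `sagemaker_inference` extension points; the model-loading and predict bodies are placeholders, since they depend on the framework.

```python
# A minimal sketch of a custom handler, assuming the extension points
# documented in sagemaker_inference; load/predict bodies are placeholders.
from sagemaker_inference import decoder, encoder
from sagemaker_inference.default_handler_service import DefaultHandlerService
from sagemaker_inference.default_inference_handler import DefaultInferenceHandler
from sagemaker_inference.transformer import Transformer


class MyInferenceHandler(DefaultInferenceHandler):
    def default_model_fn(self, model_dir):
        # Load the framework-specific model artifact from model_dir
        # (placeholder, since loading depends on your framework).
        raise NotImplementedError("load your model from model_dir")

    def default_input_fn(self, input_data, content_type):
        # Deserialize the request payload into an array-like object.
        return decoder.decode(input_data, content_type)

    def default_predict_fn(self, data, model):
        # Run inference (placeholder).
        return model(data)

    def default_output_fn(self, prediction, accept):
        # Serialize the prediction into the requested content type.
        return encoder.encode(prediction, accept)


class HandlerService(DefaultHandlerService):
    """Entry point that MMS invokes; wires in the custom handler."""

    def __init__(self):
        transformer = Transformer(default_inference_handler=MyInferenceHandler())
        super(HandlerService, self).__init__(transformer=transformer)
```

Note that this wires exactly one handler into the container, which is why it does not by itself answer the per-model question above.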

Additional context I discovered that a custom handler can be provided to the MMS model archiver here, but it's not clear if this allows different handlers for each model.
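For context, the MMS model archiver's per-model handler flag looks roughly like the following; this is a sketch assuming the `model-archiver` CLI from multi-model-server, and the model and handler names are illustrative:

```sh
# Package one model artifact with its own handler module (illustrative names).
model-archiver --model-name model_a \
               --model-path /path/to/model_a \
               --handler my_handler_a:handle \
               --export-path /opt/ml/models
```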

I love the inference toolkit, and would sincerely appreciate a response regarding whether it is possible to define differing inference handlers per model, and how to do so.

laurenyu commented 4 years ago

Thanks for the kind words! Unfortunately, this isn't currently supported, but I'll leave this issue open as a feature request.

manojlds commented 4 years ago

It took a long time to figure out from reading the code that this isn't supported. I was experimenting with single models first and then wanted to move to multi-model, so it's unfortunate that it isn't supported.

alext234 commented 4 years ago

+1 for this

ckang244 commented 3 years ago

+1, it's super confusing how this is all supposed to fit together. One would assume that SageMaker would support the same functionality as the Multi Model Server in the Inference Toolkit. It seems like the only option is to use separate endpoints after all.
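One possible workaround, sketched below, is a single shared handler that routes pre- and post-processing based on the directory that `default_model_fn` receives when a model is loaded. This is not a documented feature: it assumes the directory basename matches the model name, and the processor registry and loader are hypothetical.

```python
# Sketch: one shared handler that dispatches per-model pre/post-processing
# keyed on the model directory basename. All names below are hypothetical.
import os

from sagemaker_inference.default_inference_handler import DefaultInferenceHandler


def preprocess_a(data):   # hypothetical pre-processing for model_a
    return data


def postprocess_a(pred):  # hypothetical post-processing for model_a
    return pred


def preprocess_b(data):   # hypothetical pre-processing for model_b
    return data


def postprocess_b(pred):  # hypothetical post-processing for model_b
    return pred


# Routing table from model name (directory basename) to its processors.
MODEL_PROCESSORS = {
    "model_a": (preprocess_a, postprocess_a),
    "model_b": (preprocess_b, postprocess_b),
}


class RoutingInferenceHandler(DefaultInferenceHandler):
    def default_model_fn(self, model_dir):
        # On a multi-model endpoint each model is extracted into its own
        # directory; use its basename to pick the matching processors
        # (assumption: the basename corresponds to the model name).
        name = os.path.basename(os.path.normpath(model_dir))
        pre, post = MODEL_PROCESSORS[name]
        model = self._load(model_dir)
        # Bundle the model with its processors so predict can route.
        return {"model": model, "pre": pre, "post": post}

    def default_predict_fn(self, data, bundle):
        return bundle["post"](bundle["model"](bundle["pre"](data)))

    def _load(self, model_dir):
        # Placeholder: framework-specific model loading goes here.
        raise NotImplementedError("load your model from model_dir")
```

The trade-off is that all per-model code ships in one container image, so every model's dependencies must coexist there.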

n0thing233 commented 2 years ago

Any update on this? I'm trying to achieve a similar thing. It would be great to have this toolkit support multiple models, with each model having its own inference code.