- [ ] I've built my own container based on the DLC (and I've attached the code used to build my own image)
Concise Description:
Currently the SageMaker NVIDIA Triton Inference Containers only support TensorFlow 1.
When running Triton server outside of SageMaker, the server can be started with `--backend-config=tensorflow,version=` to load the correct version of TensorFlow.
With SageMaker it is possible to pass certain arguments, which as far as we can find are defined here: https://raw.githubusercontent.com/triton-inference-server/server/main/docker/sagemaker/serve. However, there is no SageMaker argument for defining the TensorFlow version, and as far as we can tell there is no other way to pass in the required argument.
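For reference, this is how the flag is passed when launching Triton directly outside of SageMaker (the model repository path is just an example):

```shell
# Starting Triton directly: the TensorFlow backend version can be selected
# at startup. The model repository path here is an example.
tritonserver \
  --model-repository=/models \
  --backend-config=tensorflow,version=2
```

Inside the SageMaker container the `serve` script constructs the `tritonserver` command line itself, so there is currently no place to append this flag.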
DLC image/dockerfile:
NVIDIA Triton Inference Server 22.05:
007439368137.dkr.ecr.us-east-2.amazonaws.com/sagemaker-tritonserver:22.05-py3
(and in all other SageMaker Triton images)
Is your feature request related to a problem? Please describe.
We are currently loading a TensorFlow 2 model, and because of the incorrect TensorFlow backend version the performance is worse than expected, and worse than when running Triton outside SageMaker with the correct settings.
Describe the solution you'd like
The ability to enable the TensorFlow 2 backend in a SageMaker Triton deployment.
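One possible shape for this feature, purely as a sketch: the container's `serve` script could translate an environment variable into the backend-config flag. Note that `SAGEMAKER_TRITON_TENSORFLOW_VERSION` is a name we made up for illustration, not an existing setting:

```shell
#!/bin/sh
# Sketch only: SAGEMAKER_TRITON_TENSORFLOW_VERSION is a hypothetical
# environment variable; the current serve script does not support it.
build_backend_args() {
  if [ -n "$SAGEMAKER_TRITON_TENSORFLOW_VERSION" ]; then
    # Emit the flag that Triton already understands outside SageMaker.
    printf '%s' "--backend-config=tensorflow,version=${SAGEMAKER_TRITON_TENSORFLOW_VERSION}"
  fi
}
```

The endpoint's model configuration could then set this environment variable, and the container would append the resulting argument when launching `tritonserver`.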
Describe alternatives you've considered
We did consider making a patch ourselves, but given that there is no documentation on how the image is currently created, this is not a trivial task.
So an additional request would be to also open source how the images are created.
Additional context
We have experience with SageMaker, and we have tested the same model on Triton on EC2 (with the correct arguments).