emilwallner opened this issue 1 year ago
You can also extend the PyTorch DLC image, installing TensorRT and its Python package in the Dockerfile.
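For illustration, such an extension might look roughly like the sketch below. This is a hedged example rather than a tested Dockerfile: the base image URI/tag is a placeholder for whichever DLC you actually use, and the exact pip command depends on which torch-tensorrt release matches the PyTorch/CUDA versions in that image.

```dockerfile
# Placeholder base image: substitute the DLC URI/tag you actually deploy.
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:1.13.1-gpu-py39-cu117-ubuntu20.04-ec2

# Install the Torch-TensorRT Python package; pick the release that matches the
# image's PyTorch/CUDA versions (older releases may need an extra wheel index).
RUN pip install --no-cache-dir torch-tensorrt
```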
Hi @emilwallner, could you please share how you extended the DLC image with torch-tensorrt? Do you also use torch.compile with backend="torch_tensorrt", as shown here: https://pytorch.org/TensorRT/user_guide/torch_compile.html?
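For reference, the torch.compile path from the linked guide looks roughly like the sketch below; the model, input shape, and precision choice are illustrative placeholders rather than anything from this thread.

```python
import torch
import torch.nn as nn
import torch_tensorrt  # noqa: F401  # importing registers the "torch_tensorrt" backend

# Placeholder model and input; substitute your own.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()).eval().cuda()
x = torch.randn(1, 3, 224, 224, device="cuda")

trt_model = torch.compile(
    model,
    backend="torch_tensorrt",
    options={"enabled_precisions": {torch.half}},
)
with torch.no_grad():
    out = trt_model(x)  # the TensorRT engine is built lazily on the first call
```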
@geraldstanje I used a Docker container with TensorRT pre-installed and compiled the model through the command-line interface; compilation takes around 30 minutes. Installing everything manually and linking all the required libraries was too tedious. However, that was a few years ago, and the method you suggest may work better now.
I then load the pre-compiled version at inference time. Make sure to compile on the same GPU you'll use for inference. For deployment I end up using this image: https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-22-12.html
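A rough sketch of that compile-once / load-at-inference flow using the torch_tensorrt Python API (the original poster used the CLI instead; the model, input shape, and file name here are assumptions):

```python
import torch
import torch.nn as nn
import torch_tensorrt

# Placeholder model; compile on the same GPU model you will serve on.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()).eval().cuda()

trt_ts = torch_tensorrt.compile(
    model,
    ir="ts",  # TorchScript path, so the result can be serialized with torch.jit.save
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.half},
)
torch.jit.save(trt_ts, "model_trt.ts")

# At serving time, torch_tensorrt must be importable so the embedded TensorRT ops resolve.
loaded = torch.jit.load("model_trt.ts").cuda()
```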
I found the SageMaker environment too complicated to customize, and its deployment options were too restrictive for my use case. I ended up with an EC2 Auto Scaling group, an ECS service backed by that group, and an AWS load balancer.
Concise Description: I'm looking to deploy a TensorRT-compiled PyTorch model, but the current PyTorch image does not include TensorRT.
DLC image/dockerfile: https://github.com/aws/deep-learning-containers/tree/master/pytorch/inference/docker/1.13/py3/cu117
Describe the solution you'd like: Add support for TensorRT-compiled PyTorch models: https://pytorch.org/TensorRT/getting_started/installation.html#installation
Describe alternatives you've considered: I can use the Triton Inference Server, but I'd prefer to use TorchServe.
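As a rough illustration of the TorchServe side (not from this thread; the handler name, file name, and model layout are assumptions), a custom handler could load a pre-compiled Torch-TensorRT TorchScript module along these lines:

```python
import torch
import torch_tensorrt  # noqa: F401  # needed so the embedded TensorRT ops resolve at load time
from ts.torch_handler.base_handler import BaseHandler


class TRTHandler(BaseHandler):
    """Hypothetical TorchServe handler for a pre-compiled Torch-TensorRT model."""

    def initialize(self, context):
        props = context.system_properties
        self.device = torch.device(f"cuda:{props.get('gpu_id', 0)}")
        # model_trt.ts is a placeholder name for the serialized, TRT-compiled module.
        self.model = torch.jit.load(f"{props.get('model_dir')}/model_trt.ts",
                                    map_location=self.device)
        self.model.eval()
        self.initialized = True
```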