ELS-RD / transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
https://els-rd.github.io/transformer-deploy/
Apache License 2.0

transformer-deploy on triton server 22.08 #174

Closed lakshaykc closed 1 year ago

lakshaykc commented 1 year ago

I'm trying to set up transformer-deploy on tritonserver:22.08. tritonserver:22.07 had this bug in TensorRT.

I changed the following things to install transformer-deploy on the 22.08 image:

  1. Changed base image in Dockerfile to FROM nvcr.io/nvidia/tritonserver:22.08-py3
  2. Updated the onnx, onnxruntime and tensorrt versions in the requirements to the latest versions that are compatible with TensorRT 8.6
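The two changes above might look roughly like this (a sketch only; the exact package pins are assumptions on my part, not confirmed working versions):

```dockerfile
# Step 1 (sketch): swap the base image in the transformer-deploy Dockerfile
FROM nvcr.io/nvidia/tritonserver:22.08-py3

# Step 2 (sketch): install requirements bumped for TensorRT 8.6 compatibility.
# The individual pins (onnx / onnxruntime-gpu / tensorrt==8.6.1) are assumed,
# not taken from the issue.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```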

I'm able to convert a Hugging Face model using the convert script; however, I see an error like this when deploying it on Triton Inference Server. The workaround mentioned in the script (importing tensorrt first) doesn't work.

Any pointers on where to begin to get transformer-deploy working with tritonserver:22.08 and tensorrt:8.6.1?
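One quick sanity check when matching container tags to TensorRT releases is to ask the image itself which TensorRT it bundles (a sketch; this assumes `python3` and the `tensorrt` Python package are available inside the image, which is the case for the `-py3` Triton images):

```shell
# Print the TensorRT version bundled with a given Triton image tag
docker run --rm nvcr.io/nvidia/tritonserver:22.08-py3 \
  python3 -c "import tensorrt; print(tensorrt.__version__)"
```

Comparing that output against the versions pinned in requirements.txt shows whether the engine you build will match the runtime Triton loads it with.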

lakshaykc commented 1 year ago

Got it working with 22.12