Thank you, but the TRT weights are tied to the CUDA version and the NVIDIA GPU, so they cannot be used directly like this.
Yes, it works on an RTX 3090.
Have you deployed it as an inference API on RunPod? What speed did you get on an A10 GPU?
Installation Guide
For those running into issues with installation, here’s a streamlined guide! This has been tested on the following container:
`runpod/pytorch:2.0.1-py3.10-cuda11.8.0-devel-ubuntu22.04`
Steps
Clone the Repository
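A minimal sketch of this step; the repository URL is not quoted in this thread, so `<repository-url>` and `<repository-name>` below are placeholders to replace with the project's actual values:

```bash
# Clone the project and move into its directory (substitute the real URL)
git clone <repository-url>
cd <repository-name>
```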
Install Requirements
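Assuming the project ships a standard `requirements.txt` (the file name is an assumption; use whatever the repo documents):

```bash
# Install the Python dependencies listed by the project
pip install -r requirements.txt
```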
Install FFmpeg
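On the Ubuntu 22.04 base image above, FFmpeg can be installed from the system package manager:

```bash
# Install FFmpeg from the Ubuntu package repositories
apt-get update && apt-get install -y ffmpeg
```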
Install ONNX Runtime
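For GPU inference on a CUDA 11.8 container, the GPU build of ONNX Runtime is the usual choice; pin a specific version if the project requires one (the unpinned install below is an assumption):

```bash
# GPU-enabled ONNX Runtime; choose a version compatible with CUDA 11.8 if needed
pip install onnxruntime-gpu
```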
Skip the TensorRT Conversion
Download the pre-built TensorRT weights using the Hugging Face CLI:
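The Hugging Face repository id for the weights is not quoted in this thread, so `<org/weights-repo>` is a placeholder, and `./checkpoints` is only an example target directory; the general CLI pattern is:

```bash
# Install the Hugging Face CLI and fetch the pre-built TensorRT weights
pip install -U "huggingface_hub[cli]"
# --local-dir should point wherever the app expects the weights (./checkpoints is just an example)
huggingface-cli download <org/weights-repo> --local-dir ./checkpoints
```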
Run the Application
Now, you’re ready to run the app in TensorRT mode:
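The exact entry point and flag for TensorRT mode depend on the repository and are not shown in this thread; the command below is a hypothetical example only:

```bash
# Hypothetical invocation; substitute the project's real script and its TensorRT flag
python app.py --mode trt
```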