Adds the ability to convert a loaded model's Unet module into TensorRT. Requires a webui version at least after commit 339b5315 (currently, that means the dev branch after 2023-05-27). Only tested to work on Windows.
Loras are baked into the converted model. Hypernetwork support is not tested. Controlnet is not supported. Textual inversion works normally.
NVIDIA is also working on releasing their version of TensorRT for webui, which might be more performant, but they can't release it yet.
There seems to be support for quickly replacing the weights of a TensorRT engine without rebuilding it, but this extension does not offer that option yet.
Apart from installing the extension normally, you also need to download a zip with TensorRT from NVIDIA.
You need to choose the same version of CUDA that python's torch library is using. For torch 2.0.1 it is CUDA 11.8.
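As a rough sketch of that version check: in a live webui environment you would read torch.version.cuda directly, but the pairings below are assumptions based on the default PyPI torch wheels, not something this extension ships.

```python
# Sketch: pick the TensorRT download variant matching torch's bundled CUDA.
# Only the torch 2.0.1 -> CUDA 11.8 pairing comes from the text above;
# the other entries are assumptions about the default PyPI wheels.
DEFAULT_TORCH_CUDA = {
    "2.0.1": "11.8",
    "2.0.0": "11.8",
    "1.13.1": "11.7",
}

def tensorrt_cuda_variant(torch_version: str) -> str:
    """Return the CUDA version whose TensorRT zip should be downloaded."""
    try:
        return DEFAULT_TORCH_CUDA[torch_version]
    except KeyError:
        raise ValueError(
            f"unknown torch version {torch_version}; check torch.version.cuda"
        )

print(tensorrt_cuda_variant("2.0.1"))  # -> 11.8
```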
Extract the zip into the extension's directory, so that TensorRT-8.6.1.6 (or a similarly named dir) exists in the same place as the scripts directory and the trt_path.py file. Restart webui afterwards.
You don't need to install CUDA separately.
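To confirm the zip was extracted in the right place, a quick stdlib-only check could look like this (a sketch, assuming only the directory names mentioned above; pass your extension's install path as ext_dir):

```python
# Sketch: verify the TensorRT zip landed next to the extension's
# scripts/ directory and trt_path.py file, as described above.
from pathlib import Path

def tensorrt_layout_ok(ext_dir: str) -> bool:
    ext = Path(ext_dir)
    # A TensorRT-8.6.1.6 (or similarly named) directory must exist here.
    has_trt = any(
        p.is_dir() and p.name.startswith("TensorRT-") for p in ext.iterdir()
    )
    return (
        has_trt
        and (ext / "scripts").is_dir()
        and (ext / "trt_path.py").is_file()
    )
```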
1. Go to the TensorRT tab that appears if the extension loads properly.
2. In the Convert to ONNX tab, press Convert Unet to ONNX. This creates an .onnx file with the model in the models/Unet-onnx directory.
3. In the Convert ONNX to TensorRT tab, configure the necessary parameters (including writing the full path to the onnx model) and press Convert ONNX to TensorRT. This creates a .trt file with the model in the models/Unet-trt directory.
4. In settings, on the Stable Diffusion page, use the SD Unet option to select the newly generated TensorRT model.

Stable Diffusion 2.0 conversion should fail for both ONNX and TensorRT because of incompatible shapes, but you may be able to remedy this by changing instances of 768 to 1024 in the code.