pytorch / TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
https://pytorch.org/TensorRT
BSD 3-Clause "New" or "Revised" License

❓ [Question] Is it possible to export the UNet's TensorRT engine as a file in Stable Diffusion? #2541

Open 0-chan-kor opened 11 months ago

0-chan-kor commented 11 months ago

❓ Question

Hello. I am currently trying to run inference with the Stable Diffusion XL inpainting model using your package. Model link: https://huggingface.co/diffusers/stable-diffusion-xl-1.0-inpainting-0.1

I referred to your example code and modified it as follows.

import torch

from diffusers import AutoPipelineForInpainting
from diffusers.utils import load_image
import torch_tensorrt

model_id = "diffusers/stable-diffusion-xl-1.0-inpainting-0.1"
device = "cuda"

# Instantiate Stable Diffusion Pipeline with FP16 weights
pipe = AutoPipelineForInpainting.from_pretrained(
    model_id, variant="fp16", torch_dtype=torch.float16
)

pipe = pipe.to(device)
backend = "torch_tensorrt"

# Optimize the UNet portion with Torch-TensorRT
pipe.unet = torch.compile(
    pipe.unet,
    backend=backend,
    options={
        "truncate_long_and_double": True,
        "precision": torch.float16,
    },
    dynamic=False,
)

# Inference

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

image = load_image(img_url).resize((1024, 1024))
mask_image = load_image(mask_url).resize((1024, 1024))

prompt = "a tiger sitting on a park bench"

image = pipe(
    prompt=prompt,
    image=image,
    mask_image=mask_image,
    guidance_scale=8.0,
    num_inference_steps=20,
    strength=0.99,
).images[0]

image.save("inpaint-result.png")

On my GPU machine the conversion to TensorRT takes over 15 minutes. Since I can't rerun this conversion every time, I am trying to find a way to save the compiled engine to a file (e.g. a ".trt" file) and reuse it.

Looking through your documentation, I could not find such a feature. Do you support it? If so, please let me know.

What you have already tried

Described above

Environment

Docker container: nvcr.io/nvidia/pytorch:23.11-py3
GPU: P40

Additional context

gs-olive commented 11 months ago

Hi - thank you for the question. Currently, there is no way to export/serialize an artifact from torch.compile. Our ir="dynamo" path does have serialization capabilities, however, and could be helpful here; it can be invoked via torch_tensorrt.compile(..., ir="dynamo", ...). In torch.compile, the generated TRT engines are stored for the duration of the Python session and should not need recompilation for additional inference calls within the same session, but between Python sessions we do not yet have a caching/saving mechanism.
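
For illustration, a minimal sketch of that dynamo-path compile/save/reload workflow is below. The stand-in module, input shape, and file name are hypothetical (the real UNet takes latents, a timestep, and text embeddings), and torch_tensorrt.save is only available in recent releases, so the exact API may differ by version:

import torch
import torch_tensorrt

# Hypothetical stand-in for pipe.unet; substitute the real module and its
# real example inputs in practice.
model = torch.nn.Sequential(
    torch.nn.Conv2d(4, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
).eval().half().cuda()

inputs = [torch.randn(1, 4, 64, 64, dtype=torch.float16, device="cuda")]

# Compile through the dynamo IR, which returns a serializable module
# (unlike the torch.compile path).
trt_gm = torch_tensorrt.compile(
    model,
    ir="dynamo",
    inputs=inputs,
    enabled_precisions={torch.float16},
)

# Serialize as a torch.export ExportedProgram (version-dependent API).
torch_tensorrt.save(trt_gm, "trt_model.ep", inputs=inputs)

# Later, in a new Python session: reload and run without recompiling.
reloaded = torch.export.load("trt_model.ep").module()
out = reloaded(*inputs)

The key difference is that the dynamo path produces a standalone graph module that can be written to disk, whereas torch.compile keeps its engines in process memory only.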