NVIDIA / TensorRT-Model-Optimizer

TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, and distillation. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
https://nvidia.github.io/TensorRT-Model-Optimizer

The model output is not the same after enabling cache_diffusion #92

Open zeng121 opened 1 month ago

zeng121 commented 1 month ago

When I use cache_diffusion with Stable Diffusion, the output is different from what the pipeline produced before caching was enabled.
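Some drift is inherent to the technique: cache diffusion skips recomputing parts of the denoising network on selected steps and reuses a slightly stale activation from an earlier step, so the result cannot be bit-identical to the uncached run. The toy sketch below (plain Python, not modelopt's implementation; `expensive_block`, the every-other-step caching rule, and the update rule are all illustrative assumptions) shows how reusing a cached activation changes the final output:

```python
import math

def expensive_block(x, step):
    # Stand-in for a costly UNet block: a step-dependent nonlinear transform.
    # (Hypothetical; chosen only to make the drift visible.)
    return [math.tanh(v + 0.1 * step) for v in x]

def denoise(steps=10, use_cache=False):
    x = [0.0] * 4
    cached = None
    for step in range(steps):
        if use_cache and step % 2 == 1 and cached is not None:
            out = cached  # cache hit: reuse the previous step's (stale) output
        else:
            out = expensive_block(x, step)
            cached = out
        # Toy update rule standing in for a scheduler step.
        x = [xi + 0.5 * oi for xi, oi in zip(x, out)]
    return x

baseline = denoise()
with_cache = denoise(use_cache=True)
drift = max(abs(b - c) for b, c in zip(baseline, with_cache))
print(f"max drift vs. uncached run: {drift:.4f}")  # nonzero: outputs differ
```

The practical question for this issue is therefore not whether the output differs at all, but whether the configured caching schedule makes the difference visually significant, which is why the config and example images matter.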

kevalmorabia97 commented 1 month ago

@jingyu-ml

jingyu-ml commented 3 weeks ago

Which cache config are you using? Could you share some example images and the config? @zeng121