NVIDIA / TensorRT-Model-Optimizer

TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
https://nvidia.github.io/TensorRT-Model-Optimizer

The algorithm output is not the same after enabling cache_diffusion #92

Open zeng121 opened 2 weeks ago

zeng121 commented 2 weeks ago

When I use cache_diffusion with SD, the output is different from what I got before enabling it.

kevalmorabia97 commented 2 weeks ago

@jingyu-ml

jingyu-ml commented 1 week ago

What cache config are you using? Could you show me some example images and the config? @zeng121