Inference latency issue - Githubissues

Zheng-Chong / CatVTON

CatVTON is a simple and efficient virtual try-on diffusion model with 1) a lightweight network (899.06M parameters in total), 2) parameter-efficient training (49.57M trainable parameters), and 3) simplified inference (< 8G VRAM at 1024x768 resolution).


951 stars, 114 forks

Inference latency issue #46

Closed zeng121 closed 2 months ago

zeng121 commented 2 months ago

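Hello, I ran inference on the same image with torch2.1.0+cuda11.8 and with torch2.2.0+cuda11.8. The former takes more than 10 seconds longer than the latter. So far I have traced the extra time mainly to the denoising part of model/pipeline.py: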

latents = self.noise_scheduler.step(noise_pred, t, latents, **extra_step_kwargs).prev_sample

I read through the source of step but could not find a specific cause. Could someone help explain this?
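One thing worth ruling out when timing a single line like this: CUDA kernels are launched asynchronously, so wall-clock time measured around one call can include GPU work queued by earlier calls (for example, the UNet forward pass that produced noise_pred). The sketch below is a minimal timing helper and is not part of CatVTON. The names `timed`, `results`, and `sync` are hypothetical, and passing `torch.cuda.synchronize` as `sync` is an assumption that fits a single-GPU run.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results, sync=None):
    """Accumulate the wall-clock time of a code block into results[label].

    sync is an optional zero-argument callable run before starting and
    before stopping the timer. On a CUDA device it would typically be
    torch.cuda.synchronize, so that pending GPU work from earlier calls
    is not charged to this block (an assumption, not CatVTON code).
    """
    if sync is not None:
        sync()  # drain work queued before the block
    start = time.perf_counter()
    try:
        yield
    finally:
        if sync is not None:
            sync()  # wait for work launched inside the block
        results[label] = results.get(label, 0.0) + time.perf_counter() - start


# Hypothetical use inside the denoising loop (names taken from the issue):
#
#   import torch
#   results = {}
#   with timed("scheduler_step", results, sync=torch.cuda.synchronize):
#       latents = self.noise_scheduler.step(
#           noise_pred, t, latents, **extra_step_kwargs
#       ).prev_sample
#   print(results)

if __name__ == "__main__":
    # CPU-only demonstration: no sync callable is needed here.
    demo = {}
    with timed("sleep", demo):
        time.sleep(0.01)
    print(demo)
```

If the per-step time attributed to the scheduler drops sharply once synchronization is added, the difference between the two torch versions more likely lies in the model forward pass than in step itself.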