chengzeyi / stable-fast

Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
MIT License
1.05k stars 59 forks

Inference Error with Dynamic Resolution #126

Open alecyan1993 opened 4 months ago

alecyan1993 commented 4 months ago

Hi,

I found that if the model is compiled and traced at a resolution divisible by 32, then running inference at a resolution that is divisible by 8 but not by 32 causes a segmentation fault and this error: RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 42 but got size 41 for tensor number 1 in the list.

For example, if the model is traced at 512x512, inference at 1200x1200 fails.

Is this a known bug with dynamic resolution in stable-fast? Thanks!

chengzeyi commented 4 months ago

This might be hard to debug. The original Python code may take different branches for these two kinds of sizes, and a single traced graph only records one of them. So I guess the solution could be wrapping the model and dispatching different groups of sizes to different compiled graphs.
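
A minimal sketch of that wrapping idea, in plain Python so it does not depend on any particular compiler API. Everything here is hypothetical and for illustration only: `size_group`, `SizeGroupDispatcher`, `compile_fn`, and `key_fn` are made-up names, and `compile_fn` stands in for whatever call produces a compiled or traced copy of the model. It assumes each call to `compile_fn` yields an independent graph that is traced on its first invocation, which may or may not hold for a given setup.

```python
from typing import Any, Callable, Dict, Hashable, Tuple


def size_group(height: int, width: int, multiple: int = 32) -> Tuple[bool, bool]:
    """Group a resolution by whether each side is divisible by `multiple`."""
    return (height % multiple == 0, width % multiple == 0)


class SizeGroupDispatcher:
    """Route each call to one compiled graph per size group.

    A graph is created lazily the first time a group is seen, so every
    graph is only ever exercised with sizes from its own group.
    """

    def __init__(
        self,
        model: Any,
        compile_fn: Callable[[Any], Callable[..., Any]],  # hypothetical compile step
        key_fn: Callable[..., Hashable],  # maps call arguments to a group key
    ) -> None:
        self._model = model
        self._compile_fn = compile_fn
        self._key_fn = key_fn
        self._graphs: Dict[Hashable, Callable[..., Any]] = {}

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        key = self._key_fn(*args, **kwargs)
        graph = self._graphs.get(key)
        if graph is None:
            # First call for this group: build a separate graph for it.
            graph = self._compile_fn(self._model)
            self._graphs[key] = graph
        return graph(*args, **kwargs)

    @property
    def num_graphs(self) -> int:
        """Number of distinct graphs created so far."""
        return len(self._graphs)
```

For a UNet, `key_fn` would presumably read the spatial size from the input tensor, e.g. `lambda sample, *a, **kw: size_group(*sample.shape[-2:], multiple=4)`, since a pixel resolution divisible by 32 corresponds to a latent size divisible by 4 when the VAE downsamples by 8. With that grouping, 512x512 and 1200x1200 land in different groups and never share a traced graph. Whether two groups are enough depends on which branches the original Python code actually has.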