chengzeyi / stable-fast

Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
MIT License

Tensor Dimension Mismatch #122

Open alecyan1993 opened 4 months ago

alecyan1993 commented 4 months ago

Hi, we have a compiled SDXL model that continuously runs inference at different dimensions, and from time to time some jobs fail with the following error saying that the tensor dimensions don't match.

However, once we extract the job props from a failed job and run it locally with the compiled model, it runs without errors.

RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
/opt/sd/lib/python3.10/site-packages/sfast/jit/overrides.py(21): __torch_function__
/opt/sd/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py(2452): forward
/opt/sd/lib/python3.10/site-packages/torch/nn/modules/module.py(1508): _slow_forward
/opt/sd/lib/python3.10/site-packages/torch/nn/modules/module.py(1527): _call_impl
/opt/sd/lib/python3.10/site-packages/torch/nn/modules/module.py(1518): _wrapped_call_impl
/opt/sd/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py(1188): forward
/opt/sd/lib/python3.10/site-packages/sfast/jit/trace_helper.py(89): forward
/opt/sd/lib/python3.10/site-packages/torch/nn/modules/module.py(1508): _slow_forward
/opt/sd/lib/python3.10/site-packages/torch/nn/modules/module.py(1527): _call_impl
/opt/sd/lib/python3.10/site-packages/torch/nn/modules/module.py(1518): _wrapped_call_impl
/opt/sd/lib/python3.10/site-packages/sfast/jit/trace_helper.py(154): forward
/opt/sd/lib/python3.10/site-packages/torch/nn/modules/module.py(1508): _slow_forward
/opt/sd/lib/python3.10/site-packages/torch/nn/modules/module.py(1527): _call_impl
/opt/sd/lib/python3.10/site-packages/torch/nn/modules/module.py(1518): _wrapped_call_impl
/opt/sd/lib/python3.10/site-packages/torch/jit/_trace.py(1065): trace_module
/opt/sd/lib/python3.10/site-packages/torch/jit/_trace.py(798): trace
/opt/sd/lib/python3.10/site-packages/sfast/jit/utils.py(32): better_trace
/opt/sd/lib/python3.10/site-packages/sfast/jit/trace_helper.py(25): trace_with_kwargs
/opt/sd/lib/python3.10/site-packages/sfast/jit/trace_helper.py(51): wrapper
/opt/sd/lib/python3.10/site-packages/torch/nn/modules/module.py(1527): _call_impl
/opt/sd/lib/python3.10/site-packages/torch/nn/modules/module.py(1518): _wrapped_call_impl
/app/src/sdxl_pipe_img2img.py(1110): __call__
/opt/sd/lib/python3.10/site-packages/torch/utils/_contextlib.py(115): decorate_context\txt2img_task.py(345): run_model
/app/
: <module>
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 84 but got size 83 for tensor number 1 in the list.
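
For readers unfamiliar with this message: it states that tensors being concatenated along dimension 1 must agree in every other dimension, and that the second tensor in the list (index 1) has size 83 where the first has 84. The sketch below mirrors that rule in plain Python so it can be used to log shapes before a failing call. The helper name `find_cat_mismatch` and the example shapes are hypothetical; the traceback does not show the real shapes or which axis disagreed, and this is not PyTorch's own implementation.

```python
def find_cat_mismatch(shapes, dim=1):
    """Return (tensor_index, axis, expected, got) for the first shape that
    cannot be concatenated with shapes[0] along `dim`, or None if all fit.

    Hypothetical helper: it only mirrors the rule stated in the error message.
    """
    reference = shapes[0]
    for index, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(reference):
            raise ValueError("all shapes must have the same number of dimensions")
        for axis, (expected, got) in enumerate(zip(reference, shape)):
            # Sizes may differ only along the concatenation dimension.
            if axis != dim and expected != got:
                return index, axis, expected, got
    return None


# Hypothetical shapes chosen to match the sizes in the error message (84 vs 83).
first = (2, 1280, 84, 84)
second = (2, 640, 83, 83)
print(find_cat_mismatch([first, second]))  # (1, 2, 84, 83)
```

Logging the shapes of the inputs for each job this way may help show whether the failing jobs share a particular width or height.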
chengzeyi commented 4 months ago

@alecyan1993 My guess is that certain versions of PyTorch have bugs in torch.jit.trace, so please check carefully which version of PyTorch you are using.
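
A minimal sketch of such a version check, which could be run both on the failing worker and on the local machine to compare environments. It uses only the standard library's `importlib.metadata`. The function name `installed_versions` is hypothetical, and the distribution names in the default list (in particular `stable-fast`) are assumptions that may need adjusting to how the packages were installed.

```python
from importlib import metadata


def installed_versions(packages=("torch", "diffusers", "stable-fast")):
    """Map each distribution name to its installed version, or None if absent.

    The default package names are assumptions, not taken from the issue.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If the two environments report different versions, that difference is a reasonable first thing to rule out.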