pixeli99 / SVD_Xtend

Stable Video Diffusion Training Code and Extensions.

Tensor size mismatch error #33

Closed SwayStar123 closed 4 months ago

SwayStar123 commented 4 months ago
Traceback (most recent call last):
  File "D:\MyStuff\Programming\Python\AI\SVD_Xtend\train_svd.py", line 1286, in <module>
    main()
  File "D:\MyStuff\Programming\Python\AI\SVD_Xtend\train_svd.py", line 1114, in main
    model_pred = unet(
                 ^^^^^
  File "D:\MyStuff\Programming\Python\AI\SVD_Xtend\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\MyStuff\Programming\Python\AI\SVD_Xtend\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\MyStuff\Programming\Python\AI\SVD_Xtend\.venv\Lib\site-packages\accelerate\utils\operations.py", line 817, in forward
    return model_forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\MyStuff\Programming\Python\AI\SVD_Xtend\.venv\Lib\site-packages\accelerate\utils\operations.py", line 805, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\MyStuff\Programming\Python\AI\SVD_Xtend\.venv\Lib\site-packages\torch\amp\autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\MyStuff\Programming\Python\AI\SVD_Xtend\.venv\Lib\site-packages\diffusers\models\unets\unet_spatio_temporal_condition.py", line 463, in forward
    sample = upsample_block(
             ^^^^^^^^^^^^^^^
  File "D:\MyStuff\Programming\Python\AI\SVD_Xtend\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\MyStuff\Programming\Python\AI\SVD_Xtend\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\MyStuff\Programming\Python\AI\SVD_Xtend\.venv\Lib\site-packages\diffusers\models\unets\unet_3d_blocks.py", line 2351, in forward
    hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 2 but got size 1 for tensor number 1 in the list.

I get this error whenever I try to run the script. These are the arguments I use:

accelerate launch train_svd.py --pretrained_model_name_or_path=stabilityai/stable-video-diffusion-img2vid-xt-1-1 --per_gpu_batch_size=2 --gradient_accumulation_steps=1 --max_train_steps=5000 --width=10 --height=10 --checkpointing_steps=1000 --checkpoints_total_limit=1 --learning_rate=1e-5 --lr_warmup_steps=0 --seed=123 --mixed_precision="fp16" --validation_steps=200

(I'm using low resolutions to avoid running out of VRAM while testing. The same error happens at higher resolutions too; I was not able to run at the default resolution on my PC.)

Any idea what is causing the error?

SwayStar123 commented 4 months ago

I'm pretty sure this is caused by the way a UNet handles odd resolutions, as described here: https://stackoverflow.com/questions/66028743/how-to-handle-odd-resolutions-in-unet-architecture-pytorch
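
For anyone hitting the same error, here is a rough sketch of why the skip connections stop lining up. It is a simplified model, not the actual diffusers code: it assumes three stride-2 downsampling convolutions (kernel 3, padding 1) in the UNet and a plain 2x upsample on the way back up. The function names and the multiple of 64 (an assumed 8x VAE reduction times an assumed 8x UNet reduction) are my own illustration, so check them against the model config before relying on them.

```python
def conv_out(n, kernel=3, stride=2, padding=1):
    """Output length of a strided convolution along one spatial axis."""
    return (n + 2 * padding - kernel) // stride + 1


def skip_sizes_match(latent_size, num_downsamples=3):
    """Return True if every upsampled feature map matches its skip tensor.

    Simplified model: one skip per resolution level, assumed 3 downsamples.
    """
    sizes = [latent_size]
    n = latent_size
    for _ in range(num_downsamples):
        n = conv_out(n)          # encoder: halve, rounding up for odd sizes
        sizes.append(n)
    for skip in reversed(sizes[:-1]):
        n = n * 2                # decoder: 2x upsample always gives an even size
        if n != skip:            # this is the case where torch.cat would raise
            return False
    return True


def snap_to_multiple(value, multiple=64):
    """Round a pixel dimension down to a multiple (assumed 64), never below it."""
    return max(multiple, (value // multiple) * multiple)


if __name__ == "__main__":
    for latent in (1, 5, 8, 72):
        print(latent, skip_sizes_match(latent))
    print(snap_to_multiple(10), snap_to_multiple(1000))
```

In this model a tiny latent of size 1 stays at 1 through every downsample, then becomes 2 after the first upsample, which is the same "expected size 2 but got size 1" pattern as in the traceback. Under these assumptions, choosing --width and --height as multiples of 64 keeps every level even and avoids the mismatch.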