pixeli99 / SVD_Xtend

Stable Video Diffusion Training Code and Extensions.

fix: plural batch size unet input shape bug #7

Closed Kiteretsu77 closed 8 months ago

Kiteretsu77 commented 8 months ago

When I tried to train the model with a batch size larger than 1, the UNet raised shape errors on its inputs. While debugging, I found that the cause was the shapes of encoder_hidden_states and sigmas. After correcting the shapes of these two variables, the error no longer occurs.
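For readers unfamiliar with this class of bug, here is a minimal sketch of one common way a per-sample tensor breaks only when the batch size exceeds 1. It is not the code from this PR. It uses NumPy arrays as a stand-in for PyTorch tensors (the broadcasting rules are the same), and the helper name `expand_per_sample`, the dimension sizes, and the embedding width of 1024 are all assumptions chosen for illustration. The idea: a `sigmas` array of shape `(batch,)` broadcasts against 5-D video latents by accident when `batch == 1`, but needs explicit trailing singleton axes otherwise. Similarly, a per-sample image embedding of shape `(batch, dim)` needs a sequence axis to become `(batch, 1, dim)` before being passed as `encoder_hidden_states`.

```python
import numpy as np


def expand_per_sample(values, target_ndim):
    """Reshape a (batch,) array to (batch, 1, ..., 1) with target_ndim axes.

    Hypothetical helper: it lets one scalar per sample broadcast over
    every remaining axis of a higher-rank array.
    """
    values = np.asarray(values)
    return values.reshape(values.shape[0], *([1] * (target_ndim - 1)))


# Illustrative sizes only (assumed, not taken from the repository).
batch, frames, channels, height, width = 2, 4, 4, 8, 8
latents = np.zeros((batch, frames, channels, height, width), dtype=np.float32)
noise = np.ones_like(latents)

# One noise level per sample in the batch.
sigmas = np.array([0.5, 2.0], dtype=np.float32)

# Without the reshape, (2,) is aligned against the trailing width axis (8)
# and broadcasting fails. With batch == 1 the shape (1,) would broadcast,
# which is why the problem can stay hidden at batch size 1.
try:
    _ = noise * sigmas
    naive_broadcast_failed = False
except ValueError:
    naive_broadcast_failed = True

sigmas_b = expand_per_sample(sigmas, latents.ndim)  # shape (2, 1, 1, 1, 1)
noisy_latents = latents + noise * sigmas_b

# Per-sample conditioning: add a sequence axis so each sample keeps its own
# embedding, giving (batch, 1, dim).
image_embeddings = np.zeros((batch, 1024), dtype=np.float32)
encoder_hidden_states = image_embeddings[:, None, :]
```

In PyTorch the equivalent reshapes would typically be written with indexing such as `sigmas[:, None, None, None, None]` or `unsqueeze(1)`. Whether the merged fix uses exactly these forms is not stated in this thread.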

pixeli99 commented 8 months ago

Thank you for the fix, it has been merged already :)

Kiteretsu77 commented 8 months ago

Thank you!