Boese0601 / MagicDance

[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion
https://boese0601.github.io/magicdance/

RuntimeError: shape '[13, 77, 8, 40]' is invalid for input of size 788480 #12

Closed Jeff-Fudan closed 4 months ago

Jeff-Fudan commented 4 months ago

When I train the appearance_control_pretraining stage on the TikTok data you provided with a batch size of 32, I get a RuntimeError: shape '[13, 77, 8, 40]' is invalid for input of size 788480. The input size actually corresponds to the shape [32, 77, 8, 40], since 32 × 77 × 8 × 40 = 788480. However, when I set the batch size to 1, everything works fine. The paper mentions a batch size of 64 for the first stage. Is this an issue with the code or with the data? Thank you once again for your excellent work! It does a great job of preserving the human identity.
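
The arithmetic in the error message can be checked directly. Below is a minimal sketch in plain Python; the helper names `numel` and `infer_batch` are hypothetical and are not part of the MagicDance codebase. It shows that the requested shape holds fewer elements than the input, and that the input size implies a leading (batch) dimension of 32 rather than 13.

```python
from math import prod


def numel(shape):
    """Number of elements a tensor of the given shape would hold."""
    return prod(shape)


def infer_batch(total, trailing):
    """Leading dimension implied by `total` elements and the trailing dims.

    Raises ValueError if `total` is not a multiple of the per-sample size.
    """
    per_sample = prod(trailing)
    if total % per_sample != 0:
        raise ValueError(
            f"{total} elements cannot be split into samples of size {per_sample}"
        )
    return total // per_sample


requested_shape = (13, 77, 8, 40)  # shape from the error message
input_size = 788480                # input size from the error message

print(numel(requested_shape))                        # 320320
print(infer_batch(input_size, requested_shape[1:]))  # 32
```

So the tensor being reshaped carries the full batch of 32, while the target shape was built with a batch dimension of 13. The sketch only verifies the numbers; it does not say where the 13 comes from.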

Boese0601 commented 4 months ago

Hi, I didn't come across this problem when I trained the model with train_batch_size=8 on each of the 8 GPUs (64 in total). Can you please retry with train_batch_size=8 and see if the problem persists?

This problem may be caused by the dataloader and the data, because for training I used an internal data structure and data-loading packages from ByteDance rather than the PyTorch dataloader. I will also look into this problem in the current codebase and get back to you as soon as there is any update. Thanks for your feedback!
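
For reference, the relationship between the per-GPU and total batch sizes mentioned above (8 on each of 8 GPUs, 64 in total) can be sketched as follows. This is only an illustration, assuming plain data-parallel training without gradient accumulation; the function names are hypothetical and do not come from train_tiktok.py.

```python
def global_batch_size(per_gpu, num_gpus):
    """Total samples per optimizer step under plain data parallelism."""
    return per_gpu * num_gpus


def per_gpu_batch_size(global_size, num_gpus):
    """Per-GPU batch size needed to reach a given total batch size."""
    if global_size % num_gpus != 0:
        raise ValueError(
            f"total batch size {global_size} is not divisible by {num_gpus} GPUs"
        )
    return global_size // num_gpus


print(global_batch_size(8, 8))    # 64
print(per_gpu_batch_size(64, 8))  # 8
```

Under that assumption, a single-GPU run with a batch size of 32 uses a different per-device batch size from the 8-GPU setup described above.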

Boese0601 commented 4 months ago

Hi, I have solved the problem and updated the dataloader tiktok_video_arnold_copy.py and the training script train_tiktok.py. You should be able to use any value for batch_size now. Thanks.

Jeff-Fudan commented 4 months ago

Thank you so much!