cyanbx closed this issue 3 months ago
I also ran into a similar problem during training.
Diff-Foley/training/stage2_ldm/adm/modules/diffusionmodules/openai_unetmodel.py", line 744, in forward
h = th.cat([h, hs.pop()], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 4 but got size 3 for tensor number 1 in the list.
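A size mismatch like this at a U-Net skip connection usually means the input's spatial size is not divisible by 2 raised to the number of downsampling steps. The sketch below is a simplified size-only model, not the repository's code: it assumes each downsample is a stride-2 convolution with kernel 3 and padding 1 (output size ceil(n/2)) and each upsample doubles the size. The function names and the example sizes are hypothetical.

```python
def down(n):
    # Assumed downsample: conv with kernel 3, stride 2, padding 1 -> ceil(n / 2)
    return (n - 1) // 2 + 1


def up(n):
    # Assumed upsample: nearest-neighbour interpolation with scale factor 2
    return 2 * n


def first_skip_mismatch(n, num_down):
    """Return (decoder_size, skip_size) at the first mismatch, or None."""
    skips = [n]
    for _ in range(num_down):
        n = down(n)
        skips.append(n)
    h = skips.pop()  # bottleneck size
    while skips:
        h = up(h)
        skip = skips.pop()
        if h != skip:
            return (h, skip)
    return None


# Hypothetical sizes: 16 survives three downsamples, 12 does not.
print(first_skip_mismatch(16, 3))
print(first_skip_mismatch(12, 3))
```

Under these assumptions a size of 12 shrinks to 6, 3, then 2; upsampling 2 gives 4, which cannot be concatenated with the stored skip of size 3. That is the same "expected 4, got 3" pattern as in the traceback, although the actual sizes in the model may differ.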
Hey, thanks for mentioning this. For Stage2 training and inference, we use a hop_len of 256. For Stage1 training and inference, we use 250. This is for temporal alignment.
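To see how the hop length affects the time axis, here is a minimal sketch of the frame arithmetic. It assumes a 16 kHz sample rate, an 8-second clip, and a centered STFT that yields `1 + num_samples // hop_len` frames; none of these values are confirmed in this thread, and the helper names are hypothetical.

```python
def frames_per_second(sample_rate, hop_len):
    # Each mel frame advances hop_len samples.
    return sample_rate / hop_len


def num_frames(num_samples, hop_len):
    # Assumed convention: centered STFT, 1 + num_samples // hop_len frames.
    return 1 + num_samples // hop_len


SAMPLE_RATE = 16000  # assumption, not stated in the thread
CLIP_SECONDS = 8     # assumption, not stated in the thread

for hop_len in (250, 256):
    rate = frames_per_second(SAMPLE_RATE, hop_len)
    total = num_frames(SAMPLE_RATE * CLIP_SECONDS, hop_len)
    print(f"hop_len={hop_len}: {rate} frames/s, {total} frames per clip")
```

With these assumed values, a hop of 250 gives a whole number of mel frames per second (64), while 256 gives 62.5, so the two settings produce spectrograms of different lengths for the same clip. Data preprocessed with one hop length therefore may not have the shape that a dataset configured with the other expects.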
Hi, thanks for sharing your great work. I'm a little confused about the mel hop length, which is 250 in data_preprocess but 256 in the dataset for training. However, when I change the hop_len param of audio_video_spec_fullset_Dataset to 256, I get the following error in diffusion forward:
Any help with it? Thanks a lot.