Thank you for your awesome work.
But when I ran test.py, I hit an error:
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 1024 but got size 768 for tensor number 1 in the list.
It seems the output shapes of the CLIP encoder and the VAE encoder don't match.
How can I fix it?
```python
def forward(self, clip, vae):
    # clip: (1, 257, 1024)
    vae = self.pool(vae)                          # (1, 4, 80, 64) -> (1, 4, 40, 32)
    vae = rearrange(vae, 'b c h w -> b c (h w)')  # (1, 4, 40, 32) -> (1, 4, 1280)
    vae = self.vae2clip(vae)                      # (1, 4, 768)
    # Concatenating them fails here:
    concat = torch.cat((clip, vae), 1)
```
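For context, here is a minimal reproduction of the mismatch and one possible fix, under the assumption that `vae2clip` is a linear projection (the `nn.Linear` below is hypothetical, not the repo's actual layer): `torch.cat` along dim 1 requires all other dimensions to agree, so the VAE tokens would need width 1024 to match the CLIP tokens, not 768.

```python
import torch
import torch.nn as nn

# Shapes taken from the snippet above.
clip_tokens = torch.randn(1, 257, 1024)  # CLIP encoder output
vae_tokens = torch.randn(1, 4, 768)      # VAE tokens after vae2clip

# Reproduces the reported error: last dims differ (1024 vs 768).
try:
    torch.cat((clip_tokens, vae_tokens), dim=1)
except RuntimeError as e:
    print("cat failed:", e)

# One possible fix (an assumption, not necessarily the authors' intent):
# project the flattened VAE features to the CLIP width, 1024 instead of 768.
vae2clip = nn.Linear(1280, 1024)         # hypothetical layer; in_features = 40 * 32
vae_fixed = vae2clip(torch.randn(1, 4, 1280))
concat = torch.cat((clip_tokens, vae_fixed), dim=1)
print(concat.shape)  # torch.Size([1, 261, 1024])
```

With matching widths, the concatenation yields 257 + 4 = 261 tokens of dimension 1024.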