dome272 / Diffusion-Models-pytorch

PyTorch implementation of Diffusion Models (https://arxiv.org/pdf/2006.11239.pdf)
Apache License 2.0

RuntimeError running the model #3

Open Morplson opened 1 year ago

Morplson commented 1 year ago

First of all, love your video.

Trying to run the unconditional model gives me this funny error:

File "f:\Coding Projects\ai\diffusinon-test\modules.py", line 99, in forward return x + emb RuntimeError: The size of tensor a (192) must match the size of tensor b (12) at non-singleton dimension 0

Do you know how to fix it?

dome272 commented 1 year ago

Hey, did you change anything in the code? If so, could you let me know what exactly was changed?

Elmanou89 commented 1 year ago

Hello, I have the same error after changing the input size from 64 to 128. For some reason the batch size appears to change inside the down and up functions, and I can't figure out why...

RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/local/home/alibaym/anaconda3/envs/mlenv/lib/python3.10/site-packages/torch/nn/parallel/parallel_apply.py", line 64, in _worker
    output = module(*input, **kwargs)
  File "/local/home/alibaym/anaconda3/envs/mlenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1357, in _call_impl
    return forward_call(*input, **kwargs)
  File "/local/home/alibaym/codes/diffusion_models/modules.py", line 170, in forward
    x3 = self.down2(x2, t)
  File "/local/home/alibaym/anaconda3/envs/mlenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1357, in _call_impl
    return forward_call(*input, **kwargs)
  File "/local/home/alibaym/codes/diffusion_models/modules.py", line 99, in forward
    return x + emb
RuntimeError: The size of tensor a (12) must match the size of tensor b (3) at non-singleton dimension 0
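To make the mismatch concrete: 12 vs. 3 is exactly a batch of 3 being folded 4x by a hard-coded attention size. A minimal sketch (the 128-pixel input and the 64-pixel-trained UNet are assumptions chosen to reproduce those numbers):

```python
import torch

# Assumed values: batch 3, 128 channels, and a 64x64 feature map reaching a
# SelfAttention that was built for size 32 (i.e. a 128x128 input fed into a
# UNet whose attention sizes were hard-coded for 64x64 images).
x = torch.randn(3, 128, 64, 64)           # (B, C, H, W) with B = 3
size = 32                                 # hard-coded SelfAttention size
flat = x.view(-1, 128, size * size)       # surplus pixels fold into dim 0
print(flat.shape)                         # torch.Size([12, 128, 1024]) -> B became 12

# Down.forward then builds the time embedding for the *real* batch size:
emb = torch.randn(3, 128, 1, 1).repeat(1, 1, 32, 32)   # (3, 128, 32, 32)
# Adding the attention output (now batch 12) to emb (batch 3) raises:
# "The size of tensor a (12) must match the size of tensor b (3)"
```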

Elmanou89 commented 1 year ago

FYI I solved the issue. The main problem is that the image size is hard-coded in the UNet definition for the SelfAttention modules. The line `x = x.view(-1, self.channels, self.size * self.size).swapaxes(1, 2)` changes the batch size whenever the feature map's spatial size doesn't match `self.size`. These self-attention sizes should be dependent on the image size.
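One way to avoid hard-coding anything at all (a minimal sketch, assuming the module otherwise looks like the SelfAttention in modules.py) is to read the spatial size from the input inside forward, so the batch dimension can never be folded:

```python
import torch.nn as nn


class SelfAttention(nn.Module):
    """Size-agnostic variant: the spatial size is taken from the input
    tensor instead of being fixed at construction time."""

    def __init__(self, channels):
        super().__init__()
        self.channels = channels
        self.mha = nn.MultiheadAttention(channels, 4, batch_first=True)
        self.ln = nn.LayerNorm([channels])
        self.ff_self = nn.Sequential(
            nn.LayerNorm([channels]),
            nn.Linear(channels, channels),
            nn.GELU(),
            nn.Linear(channels, channels),
        )

    def forward(self, x):
        b, c, h, w = x.shape                        # read the size from the tensor
        x = x.view(b, c, h * w).swapaxes(1, 2)      # (B, H*W, C), batch dim untouched
        x_ln = self.ln(x)
        attention_value, _ = self.mha(x_ln, x_ln, x_ln)
        attention_value = attention_value + x
        attention_value = self.ff_self(attention_value)
        return attention_value.swapaxes(2, 1).view(b, c, h, w)
```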

lulmer commented 1 year ago

> FYI I solved the issue. The main problem is that the image size is hard-coded in the UNet definition for the SelfAttention modules. The line `x = x.view(-1, self.channels, self.size * self.size).swapaxes(1, 2)` changes the batch size whenever the feature map's spatial size doesn't match `self.size`. These self-attention sizes should be dependent on the image size.

Encountering the same problem on a custom dataset... Can you give more details about what you had to change to get it to work?

Dunedin87 commented 1 year ago

In the UNet initialization, you'll need to set each SelfAttention size from the image size of your dataset instead of hard-coded numbers like 32, 16, etc. The way I've changed it is the following.

```python
class UNet(nn.Module):
    def __init__(self, c_in=3, c_out=3, time_dim=256, image_size=256, device="cuda"):
        super().__init__()

        self.image_size = image_size
        self.sa1 = SelfAttention(128, self.image_size // 2)
        self.down2 = Down(128, 256)
        self.sa2 = SelfAttention(256, self.image_size // 4)
        self.down3 = Down(256, 256)
        self.sa3 = SelfAttention(256, self.image_size // 8)
        # ........

        self.up1 = Up(512, 128)
        self.sa4 = SelfAttention(128, self.image_size // 4)
        self.up2 = Up(256, 64)
        self.sa5 = SelfAttention(64, self.image_size // 2)
        self.up3 = Up(128, 64)
        self.sa6 = SelfAttention(64, self.image_size)
        self.outc = nn.Conv2d(64, c_out, kernel_size=1)
```
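Then pass the resolution you actually train on when building the model, e.g. (a sketch with hypothetical values for a 128x128 dataset; the argument name must match your edited constructor):

```python
model = UNet(c_in=3, c_out=3, time_dim=256, image_size=128, device="cuda")
```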