XPixelGroup / BasicSR

Open Source Image and Video Restoration Toolbox for Super-resolution, Denoising, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. It also supports StyleGAN2 and DFDNet.
https://basicsr.readthedocs.io/en/latest/
Apache License 2.0

Train ESRGAN size tensors problems #409

Open claragarciamoll opened 3 years ago

claragarciamoll commented 3 years ago

Hi @xinntao, I'm trying to train ESRGAN with my own images. The GT images have a size of 256x256x3 and the LQ images have a size of 128x128x3. I have already cropped both datasets, which gives me images of 64x64x3 and 32x32x3 respectively.

When I run the code, the L1 loss function fails because the tensor sizes do not match. The error is: RuntimeError: The size of tensor a (16) must match the size of tensor b (8) at non-singleton dimension 3. The tensor sizes are: target tensor: torch.Size([8, 3, 16, 16]) and pred tensor: torch.Size([8, 3, 8, 8]).

Do you have any idea why this is happening? As far as I understand, the GT images should have a higher resolution (x2 in my case) than the LQ images.
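
For reference, the two reported shapes agree in batch size and channels and differ only in height and width. A minimal sketch in plain Python that makes the gap explicit is shown here; the helper name spatial_ratio is hypothetical and is not part of BasicSR or PyTorch:

```python
def spatial_ratio(target_shape, pred_shape):
    """Return (height_ratio, width_ratio) of target over pred for NCHW shapes.

    Hypothetical helper for illustration only, not a BasicSR function.
    """
    if tuple(target_shape[:2]) != tuple(pred_shape[:2]):
        raise ValueError("batch or channel dimensions differ")
    _, _, target_h, target_w = target_shape
    _, _, pred_h, pred_w = pred_shape
    return target_h / pred_h, target_w / pred_w


# Shapes copied from the error report above.
target_shape = (8, 3, 16, 16)
pred_shape = (8, 3, 8, 8)

print(spatial_ratio(target_shape, pred_shape))  # (2.0, 2.0)
```

The target is twice as large as the prediction in both spatial dimensions, so an element-wise L1 loss between the two cannot be computed.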

successhaha commented 2 years ago

Hello, to solve this problem you need to change the structure of the RRDB network, which is defined in rrdbnet_arch in the basicsr arch code, by removing one upsample layer so that the tensor dimensions match. ESRGAN defaults to two upsampling layers.
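
The arithmetic behind this suggestion can be sketched as follows. This is a simplified model rather than the actual RRDBNet code: it assumes that each upsampling layer doubles the height and width and that nothing else in the network changes the spatial size, which may not hold for every BasicSR version. The function name output_size is hypothetical, and the crop sizes are the ones given in the question.

```python
def output_size(lq_size, num_upsample_layers, factor_per_layer=2):
    """Spatial size after a chain of upsampling layers.

    Assumption: every layer multiplies height and width by factor_per_layer
    and no other layer resizes the feature map.
    """
    return lq_size * factor_per_layer ** num_upsample_layers


lq_crop, gt_crop = 32, 64  # crop sizes from the question (x2 dataset)

for num_layers in (2, 1):
    out = output_size(lq_crop, num_layers)
    print(num_layers, out, out == gt_crop)
# 2 128 False
# 1 64 True
```

Under these assumptions, two x2 layers give an overall x4 network, while a single x2 layer matches an x2 dataset.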