Closed: @Pudding-0503 closed this issue 1 year ago
Hi @Pudding-0503,
Thank you for reporting this issue!
I think this error message:

> RuntimeError: Given groups=1, weight of size [48, 1, 3, 3], expected input[64, 3, 256, 256] to have 1 channels, but got 3 channels instead.

indicates that the network was constructed correctly and expects grayscale images. However, the actual dataset is not grayscale but has 3 color channels. To help us find the source of this issue, could I ask you to please share the `data` configuration part of your `bert_A2B-256.py` file (or the file itself, if possible)?
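For context, a convolution layer with `groups = 1` requires the number of input channels to equal the second dimension of its weight tensor, which is exactly the constraint the error reports. A minimal sketch of that rule (an illustrative helper, not part of uvcgan):

```python
def conv_input_compatible(weight_shape, input_shape, groups=1):
    # A conv weight has shape [out_ch, in_ch // groups, kH, kW] and an
    # input batch has shape [N, C, H, W]; they match iff
    # C == weight_shape[1] * groups.
    return input_shape[1] == weight_shape[1] * groups

# Weight [48, 1, 3, 3] from the error message: a 1-channel (grayscale) first layer.
print(conv_input_compatible((48, 1, 3, 3), (64, 1, 256, 256)))  # True
print(conv_input_compatible((48, 1, 3, 3), (64, 3, 256, 256)))  # False: 3-channel input
```

So the weight shape `[48, 1, 3, 3]` confirms the network was built for 1-channel input, while the batch shape `[64, 3, 256, 256]` shows the data loader is producing 3-channel images.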
Sincere thanks for your reply. I made only minimal changes (the images are grayscale files, so I temporarily commented out the color-related data-augmentation code; I don't know whether that has any effect). The `bert_sketch2lineart-256.py` file is as follows:
```python
import argparse
import os

from uvcgan import ROOT_OUTDIR, train
from uvcgan.utils.parsers import add_preset_name_parser, add_batch_size_parser

def parse_cmdargs():
    parser = argparse.ArgumentParser(description = 'Train Sketch2Lineart BERT')

    add_preset_name_parser(parser, 'gen', GEN_PRESETS, 'vit-unet-12')
    add_batch_size_parser(parser, default = 64)

    return parser.parse_args()

GEN_PRESETS = {
    'resnet9' : {
        'model'      : 'resnet_9blocks',
        'model_args' : None,
    },
    'unet' : {
        'model'      : 'unet_256',
        'model_args' : None,
    },
    'vit-unet-6' : {
        'model'      : 'vit-unet',
        'model_args' : {
            'features'           : 384,
            'n_heads'            : 6,
            'n_blocks'           : 6,
            'ffn_features'       : 1536,
            'embed_features'     : 384,
            'activ'              : 'gelu',
            'norm'               : 'layer',
            'unet_features_list' : [48, 96, 192, 384],
            'unet_activ'         : 'leakyrelu',
            'unet_norm'          : 'instance',
            'unet_downsample'    : 'conv',
            'unet_upsample'      : 'upsample-conv',
            'rezero'             : True,
            'activ_output'       : 'sigmoid',
        },
    },
    'vit-unet-12' : {
        'model'      : 'vit-unet',
        'model_args' : {
            'features'           : 384,
            'n_heads'            : 6,
            'n_blocks'           : 12,
            'ffn_features'       : 1536,
            'embed_features'     : 384,
            'activ'              : 'gelu',
            'norm'               : 'layer',
            'unet_features_list' : [48, 96, 192, 384],
            'unet_activ'         : 'leakyrelu',
            'unet_norm'          : 'instance',
            'unet_downsample'    : 'conv',
            'unet_upsample'      : 'upsample-conv',
            'rezero'             : True,
            'activ_output'       : 'sigmoid',
        },
    },
}

cmdargs = parse_cmdargs()

args_dict = {
    'batch_size' : cmdargs.batch_size,
    'data' : {
        'dataset'      : 'cyclegan',
        'dataset_args' : {
            'path'        : 'sketch2lineart',
            'align_train' : False,
        },
        'transform_train' : [
            { 'name' : 'resize',          'size'    : 286, },
            { 'name' : 'random-rotation', 'degrees' : 10,  },
            { 'name' : 'random-crop',     'size'    : 256, },
            'random-flip-horizontal',
            # {
            #     'name'       : 'color-jitter',
            #     'brightness' : 0.2,
            #     'contrast'   : 0.2,
            #     'saturation' : 0.2,
            #     'hue'        : 0.2,
            # },
        ],
    },
    'image_shape'   : (1, 256, 256),
    'epochs'        : 2499,
    'discriminator' : None,
    'generator'     : {
        **GEN_PRESETS[cmdargs.gen],
        'optimizer' : {
            'name'         : 'AdamW',
            'lr'           : cmdargs.batch_size * 5e-3 / 512,
            'betas'        : (0.9, 0.99),
            'weight_decay' : 0.05,
        },
        'weight_init' : {
            'name'      : 'normal',
            'init_gain' : 0.02,
        },
    },
    'model'      : 'autoencoder',
    'model_args' : {
        'joint'   : True,
        'masking' : {
            'name'       : 'image-patch-random',
            'patch_size' : (32, 32),
            'fraction'   : 0.4,
        },
    },
    'scheduler' : {
        'name'    : 'CosineAnnealingWarmRestarts',
        'T_0'     : 500,
        'T_mult'  : 1,
        'eta_min' : cmdargs.batch_size * 5e-8 / 512,
    },
    'loss'             : 'l1',
    'gradient_penalty' : None,
    'steps_per_epoch'  : 32 * 1024 // cmdargs.batch_size,
    # args
    'label'      : f'bert-{cmdargs.gen}-256',
    'outdir'     : os.path.join(ROOT_OUTDIR, 'sketch2lineart'),
    'log_level'  : 'DEBUG',
    'checkpoint' : 100,
}

train(args_dict)
```
Also, I randomly checked some of the images in `trainA`, and they are indeed single-channel pictures.
Thanks again for your help!
Thank you @Pudding-0503!
Your file looks good, and I can reproduce the problem! Let me see what we can do to fix it.
Thank you so much @usert5432! Writing the code is a bit difficult for me, so if you are willing to fix this little bug, I would sincerely appreciate it! ><
Hi @Pudding-0503.
If this issue is still relevant -- we have added another dataset, `cyclegan-v2`, which handles grayscale images correctly. I think you can make the grayscale training work by modifying `bert_A2B-256.py` and replacing `'dataset' : 'cyclegan',` with `'dataset' : 'cyclegan-v2',`.
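Applied to the `data` section of the file shared above, the change would look like this (everything else unchanged; the transform list is elided here):

```python
'data' : {
    'dataset'      : 'cyclegan-v2',   # was: 'cyclegan'
    'dataset_args' : {
        'path'        : 'sketch2lineart',
        'align_train' : False,
    },
    'transform_train' : [
        # ... transforms as before ...
    ],
},
```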
Hello @usert5432,
I have tried the `cyclegan-v2` dataset you provided and trained UVCGAN, CycleGAN, ACL-GAN, and U-GAT-IT on my own dataset. The results show that UVCGAN is still the best on this new task. Sincere thanks for your research work!!
I am trying to do unpaired image-to-image translation on my own dataset, which consists of 256×256 eight-bit grayscale images. In the `bert_A2B-256.py` file I changed line 85, `'image_shape' : (3, 256, 256),`, to `'image_shape' : (1, 256, 256)`. But at runtime I get this error: `RuntimeError: Given groups=1, weight of size [48, 1, 3, 3], expected input[64, 3, 256, 256] to have 1 channels, but got 3 channels instead.` What could be the cause of this error? Thank you very much!