Closed: @Pudding-0503 closed this issue 1 year ago
Hi @Pudding-0503,
Thank you for reporting this issue!
I think this error message:

> RuntimeError: Given groups=1, weight of size [48, 1, 3, 3], expected input[64, 3, 256, 256] to have 1 channels, but got 3 channels instead.

indicates that the network was constructed correctly and expects grayscale images. However, the actual dataset is not grayscale but has 3 color channels. To help us find the source of this issue, could I ask you to please share the `data` configuration part of your `bert_A2B-256.py` file (or the file itself, if possible)?
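For context, a convolution layer with `groups = 1` requires the number of input channels to equal the second dimension of its weight tensor, which is exactly the constraint the error reports. A minimal sketch of that rule (an illustrative helper, not part of uvcgan):

```python
def conv_input_compatible(weight_shape, input_shape, groups=1):
    # A conv weight has shape [out_ch, in_ch // groups, kH, kW] and an
    # input batch has shape [N, C, H, W]; they match iff
    # C == weight_shape[1] * groups.
    return input_shape[1] == weight_shape[1] * groups

# Weight [48, 1, 3, 3] from the error message: a 1-channel (grayscale) first layer.
print(conv_input_compatible((48, 1, 3, 3), (64, 1, 256, 256)))  # True
print(conv_input_compatible((48, 1, 3, 3), (64, 3, 256, 256)))  # False: 3-channel input
```

So the weight shape `[48, 1, 3, 3]` confirms the network was built for 1-channel input, while the batch shape `[64, 3, 256, 256]` shows the data loader is producing 3-channel images.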
Sincere thanks for your reply. I made only minimal changes (the images are grayscale files, so I temporarily commented out the color-related data-augmentation code; I don't know whether that has any effect). The `bert_sketch2lineart-256.py` file is as follows:
```python
import argparse
import os

from uvcgan import ROOT_OUTDIR, train
from uvcgan.utils.parsers import add_preset_name_parser, add_batch_size_parser

def parse_cmdargs():
    parser = argparse.ArgumentParser(description = 'Train Sketch2Lineart BERT')

    add_preset_name_parser(parser, 'gen', GEN_PRESETS, 'vit-unet-12')
    add_batch_size_parser(parser, default = 64)

    return parser.parse_args()

GEN_PRESETS = {
    'resnet9' : {
        'model'      : 'resnet_9blocks',
        'model_args' : None,
    },
    'unet' : {
        'model'      : 'unet_256',
        'model_args' : None,
    },
    'vit-unet-6' : {
        'model'      : 'vit-unet',
        'model_args' : {
            'features'           : 384,
            'n_heads'            : 6,
            'n_blocks'           : 6,
            'ffn_features'       : 1536,
            'embed_features'     : 384,
            'activ'              : 'gelu',
            'norm'               : 'layer',
            'unet_features_list' : [48, 96, 192, 384],
            'unet_activ'         : 'leakyrelu',
            'unet_norm'          : 'instance',
            'unet_downsample'    : 'conv',
            'unet_upsample'      : 'upsample-conv',
            'rezero'             : True,
            'activ_output'       : 'sigmoid',
        },
    },
    'vit-unet-12' : {
        'model'      : 'vit-unet',
        'model_args' : {
            'features'           : 384,
            'n_heads'            : 6,
            'n_blocks'           : 12,
            'ffn_features'       : 1536,
            'embed_features'     : 384,
            'activ'              : 'gelu',
            'norm'               : 'layer',
            'unet_features_list' : [48, 96, 192, 384],
            'unet_activ'         : 'leakyrelu',
            'unet_norm'          : 'instance',
            'unet_downsample'    : 'conv',
            'unet_upsample'      : 'upsample-conv',
            'rezero'             : True,
            'activ_output'       : 'sigmoid',
        },
    },
}

cmdargs = parse_cmdargs()

args_dict = {
    'batch_size' : cmdargs.batch_size,
    'data' : {
        'dataset'      : 'cyclegan',
        'dataset_args' : {
            'path'        : 'sketch2lineart',
            'align_train' : False,
        },
        'transform_train' : [
            { 'name' : 'resize',          'size'    : 286, },
            { 'name' : 'random-rotation', 'degrees' : 10,  },
            { 'name' : 'random-crop',     'size'    : 256, },
            'random-flip-horizontal',
            # {
            #     'name'       : 'color-jitter',
            #     'brightness' : 0.2,
            #     'contrast'   : 0.2,
            #     'saturation' : 0.2,
            #     'hue'        : 0.2,
            # },
        ],
    },
    'image_shape'   : (1, 256, 256),
    'epochs'        : 2499,
    'discriminator' : None,
    'generator'     : {
        **GEN_PRESETS[cmdargs.gen],
        'optimizer' : {
            'name'         : 'AdamW',
            'lr'           : cmdargs.batch_size * 5e-3 / 512,
            'betas'        : (0.9, 0.99),
            'weight_decay' : 0.05,
        },
        'weight_init' : {
            'name'      : 'normal',
            'init_gain' : 0.02,
        },
    },
    'model'      : 'autoencoder',
    'model_args' : {
        'joint'   : True,
        'masking' : {
            'name'       : 'image-patch-random',
            'patch_size' : (32, 32),
            'fraction'   : 0.4,
        },
    },
    'scheduler' : {
        'name'    : 'CosineAnnealingWarmRestarts',
        'T_0'     : 500,
        'T_mult'  : 1,
        'eta_min' : cmdargs.batch_size * 5e-8 / 512,
    },
    'loss'             : 'l1',
    'gradient_penalty' : None,
    'steps_per_epoch'  : 32 * 1024 // cmdargs.batch_size,
    # args
    'label'      : f'bert-{cmdargs.gen}-256',
    'outdir'     : os.path.join(ROOT_OUTDIR, 'sketch2lineart'),
    'log_level'  : 'DEBUG',
    'checkpoint' : 100,
}

train(args_dict)
```
Also, I randomly checked some of the images in `trainA`, and they are indeed single-channel pictures.
Thanks again for your help!
Thank you @Pudding-0503!
Your file looks good, and I can reproduce the problem! Let me see what we can do to fix it.
Thank you so much @usert5432! Writing the code is a bit difficult for me, so if you are willing to fix this little bug, I would sincerely appreciate it! ><
Hi @Pudding-0503.
If this issue is still relevant -- we have added another dataset, `cyclegan-v2`, which handles grayscale images correctly. I think you can make the grayscale training work by modifying `bert_A2B-256.py` and replacing `'dataset' : 'cyclegan',` with `'dataset' : 'cyclegan-v2',`.
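Applied to the `data` section of the file shared above, the change would look like this (everything else unchanged; the transform list is elided here):

```python
'data' : {
    'dataset'      : 'cyclegan-v2',   # was: 'cyclegan'
    'dataset_args' : {
        'path'        : 'sketch2lineart',
        'align_train' : False,
    },
    'transform_train' : [
        # ... transforms as before ...
    ],
},
```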
Hello @usert5432,
I have tried the `cyclegan-v2` dataset you provided and trained UVCGAN, CycleGAN, ACL-GAN, and U-GAT-IT on my own dataset. The results show that UVCGAN is still the best on this new task. Sincere thanks for your research work!!
I am trying to do unpaired image-to-image translation on my own dataset, which consists of 256×256 eight-bit grayscale images. In the `bert_A2B-256.py` file I changed line 85, `'image_shape' : (3, 256, 256),`, to `'image_shape' : (1, 256, 256)`. But at runtime I get this error: `RuntimeError: Given groups=1, weight of size [48, 1, 3, 3], expected input[64, 3, 256, 256] to have 1 channels, but got 3 channels instead.` What could be the cause of this error? Thank you very much!