KANG99 / style-transfer

CNN image style transfer, rewritten by Kang in Keras and PyTorch based on code from earlier authors

VGG19 model input size does not match the image size? #1

Open dta0502 opened 6 years ago

dta0502 commented 6 years ago

Welcome!

IndexError                                Traceback (most recent call last)
in ()
      5
      6 loss_weights={'style':1.0,'content':0.025,'total':1.0}
----> 7 model=vgg19_model(input_tensor)
      8 # build the total loss
      9 total_loss=total_loss(model,loss_weights,transfer_tensor)

in vgg19_model(input_tensor)
      2     img_input = Input(tensor = input_tensor, shape = (3, 800, 600, 3))
      3     #Blocks 1
----> 4     x=Conv2D(64,(3,3),activation='relu',padding='same',name='block1_conv1')(img_input)
      5     x=Conv2D(64,(3,3),activation='relu',padding='same',name='block1_conv2')(x)
      6     x=MaxPooling2D((2,2),strides=(2,2),name='block1_pooling')(x)

c:\program files\python36\lib\site-packages\keras\engine\topology.py in __call__(self, inputs, **kwargs)
    636                 # Inferring the output shape is only relevant for Theano.
    637                 if all([s is not None for s in _to_list(input_shape)]):
--> 638                     output_shape = self.compute_output_shape(input_shape)
    639                 else:
    640                     if isinstance(input_shape, list):

c:\program files\python36\lib\site-packages\keras\layers\convolutional.py in compute_output_shape(self, input_shape)
    193                 new_dim = conv_utils.conv_output_length(
    194                     space[i],
--> 195                     self.kernel_size[i],
    196                     padding=self.padding,
    197                     stride=self.strides[i],

IndexError: tuple index out of range

KANG99 commented 6 years ago

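Sorry about that. The input shape should contain only the image's height, width, and the depth dimension for the RGB channels. With TensorFlow as the backend the shape is written as (300,400,3); with Theano as the backend it is written as (3,300,400).

    img_input=Input(tensor=input_tensor,shape=(300,400,3))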
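The shape rule in this reply can be sketched in plain Python. Keras's `Input(shape=...)` takes the per-sample shape without the batch dimension, so a four-element shape such as `(3, 800, 600, 3)` produces a five-dimensional tensor, which is consistent with `Conv2D` failing with the `IndexError` in the traceback. The helper names `per_sample_shape` and `strip_batch_dim` are hypothetical and belong neither to Keras nor to this repository. The example also assumes that the leading 3 in `(3, 800, 600, 3)` is the batch of stacked images.

```python
def per_sample_shape(height, width, channels=3, data_format="channels_last"):
    """Return the shape to pass as Input(shape=...), without the batch dimension."""
    if data_format == "channels_last":     # TensorFlow backend convention
        return (height, width, channels)
    if data_format == "channels_first":    # Theano backend convention
        return (channels, height, width)
    raise ValueError(f"unknown data_format: {data_format!r}")


def strip_batch_dim(batch_shape):
    """Drop the leading batch dimension from a full 4-D tensor shape."""
    if len(batch_shape) != 4:
        raise ValueError("expected a 4-D shape: (batch, dim1, dim2, dim3)")
    return tuple(batch_shape[1:])


tf_shape = per_sample_shape(300, 400)
th_shape = per_sample_shape(300, 400, data_format="channels_first")
# Assumption: the first 3 is the batch size, not a channel count.
fixed_shape = strip_batch_dim((3, 800, 600, 3))
print(tf_shape, th_shape, fixed_shape)
```

Passing the three-element result as `shape` is the form the maintainer's corrected `Input(...)` call uses.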

LeBronGod commented 2 years ago

Hello author, does this have any requirements on the size of the input image? (pytorch)

KANG99 commented 2 years ago

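    import torch
    from torchvision import models

    class StyleTransfer:
        def __init__(self, content_image, style_image, style_weight=5, content_weight=0.025):
            # Load VGG19 and its pretrained weights from a local file
            self.vgg19 = models.vgg19()
            self.vgg19.load_state_dict(torch.load('vgg19-dcbb9e9d.pth'))
            # Every image is resized to 400x300 (columns x rows)
            self.img_ncols = 400
            self.img_nrows = 300
            self.style_weight = style_weight
            self.content_weight = content_weight
            self.content_tensor, self.content_name = self.process_img(content_image)
            self.style_tensor, self.style_name = self.process_img(style_image)
            # The combination image starts as a copy of the content image
            self.conbination_tensor = self.content_tensor.clone()

As you can see from this code and from the image-processing code further down, there is no restriction on image size. However, because I ran this on my own laptop, I resize every image to 400x300 to save time, so it is best if your image is close to that aspect ratio.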
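To see why an image close to 4:3 works best, here is a minimal sketch that measures how much a resize to 400x300 distorts a given image. It assumes `process_img` stretches the image straight to 400x300 without cropping or padding, which the thread does not show. The function name `stretch_factor` is hypothetical.

```python
def stretch_factor(width, height, target_width=400, target_height=300):
    """Return how much a direct resize distorts the aspect ratio (1.0 = none)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    source_ratio = width / height
    target_ratio = target_width / target_height
    # Ratio of the two aspect ratios, ordered so the result is always >= 1.0.
    return max(source_ratio, target_ratio) / min(source_ratio, target_ratio)


print(stretch_factor(800, 600))   # a 4:3 source image
print(stretch_factor(600, 600))   # a square source image
```

A value of 1.0 means the image keeps its proportions; larger values mean the content will look visibly squeezed or stretched after resizing.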

LeBronGod commented 2 years ago

I ask because I got this error when I ran it: RuntimeError: Given groups=1, weight of size [64, 3, 3, 3], expected input[1, 4, 300, 400] to have 3 channels, but got 4 channels instead. Doesn't it want a 4-channel input (expected input[1, 4, 300, 400])?

KANG99 commented 2 years ago

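Whether an image has 4 channels or 3 channels does not depend on its size. It depends on its format: three channels means RGB, four channels means RGBA. The error says that the weight of size [64, 3, 3, 3] expects an input with 3 channels, but the input [1, 4, 300, 400] has 4 channels.
Convert the image to RGB, save it, and then use it as the input.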
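The error message can be read with a small plain-Python sketch. A PyTorch `Conv2d` weight is laid out as (out_channels, in_channels / groups, kernel height, kernel width), and the input as (batch, channels, height, width), so the shape `[1, 4, 300, 400]` in the message describes the input that was actually passed, not the one required. The helper names below are hypothetical and are not part of PyTorch.

```python
def expected_in_channels(weight_shape, groups=1):
    """Conv2d weights are laid out as (out_channels, in_channels // groups, kH, kW)."""
    return weight_shape[1] * groups


def check_conv_input(input_shape, weight_shape, groups=1):
    """Return None if the channel count matches, else a short description."""
    actual = input_shape[1]             # input layout is (batch, channels, H, W)
    expected = expected_in_channels(weight_shape, groups)
    if actual == expected:
        return None
    return f"expected {expected} channels, but got {actual}"


def drop_alpha(pixel):
    """Keep only the R, G, B components of an (R, G, B, A) pixel tuple."""
    return tuple(pixel[:3])


print(check_conv_input((1, 4, 300, 400), (64, 3, 3, 3)))   # RGBA input
print(check_conv_input((1, 3, 300, 400), (64, 3, 3, 3)))   # RGB input
print(drop_alpha((255, 128, 0, 255)))
```

In practice, if the images are loaded with Pillow (an assumption, since the loading code is not shown in this thread), `Image.open(path).convert("RGB")` is one common way to drop the alpha channel before saving or processing the image.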

LeBronGod commented 2 years ago

If I want to improve the model by adding an attention mechanism, do I need to modify VGG19? I wonder whether the author has any good ideas.

LeBronGod commented 2 years ago

Hello author, could you tell me which dataset you used for training? If possible, could you please share the training code?

LeBronGod commented 2 years ago

Could you please reply? I'm a student (undergraduate) and my final assignment is urgent. Please!

KANG99 commented 2 years ago

I've been busy with work lately. You can refer to the link below: https://pytorch.org/hub/pytorch_vision_vgg/ Also, Baidu is a good resource: https://www.baidu.com/