PaddlePaddle / models

Officially maintained and supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models, and so on.
Apache License 2.0

Backpropagation does not update the parameters, desperately seeking guidance #1770

Open xyq019971 opened 5 years ago

xyq019971 commented 5 years ago

```python
b21 = enet.block4(b20)
b22 = fluid.layers.transpose(b21, perm=[0, 2, 3, 1])
out1 = fluid.layers.reshape(b22, shape=[-1, 9])
out2 = fluid.layers.reshape(y, shape=[-1, 1])

out3 = paddle.fluid.layers.softmax(out1)

_, out4 = fluid.layers.topk(out1, k=1)
out5 = fluid.layers.cast(out4, dtype='float32')
loss = paddle.fluid.layers.square_error_cost(out5, out2)
avg = fluid.layers.reduce_mean(loss)
regularizer = fluid.regularizer.L2Decay(0.0001)
optimizer = paddle.fluid.optimizer.AdamOptimizer(learning_rate=0.01, beta1=0.9,
                                                 beta2=0.999, epsilon=1e-08,
                                                 regularization=regularizer)
_, params_grads = optimizer.minimize(avg)
```

The code is simple: a 9-channel image comes in and semantic segmentation is trained with backpropagation. The program always runs, but the parameters just do not update; if I train on a single image, avg stays at exactly the same value forever.

SunGaofeng commented 5 years ago

Could you paste the complete code? I need to check whether the part that creates the data ops has a problem. Also, after the network is built, exe.run(main_program, fetch_list, feed) has to be called; I need to check whether something in that code makes the feed or fetch_list wrong.
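For reference, a minimal sketch of the feed/fetch pattern described here, with placeholder names (`img`, `label`, and `avg` are assumptions standing in for the user's actual variables):

```python
import paddle.fluid as fluid

# Minimal feed/fetch sketch (placeholder names; assumes data layers named 'x'/'y'
# and a mean-loss variable `avg` built beforehand).
exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())            # initialize parameters once
avg_val, = exe.run(fluid.default_main_program(),
                   feed={'x': img, 'y': label},     # keys must match the data layer names
                   fetch_list=[avg])                # fetch the loss to see whether it changes
```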

xyq019971 commented 5 years ago

The code is split across several files. Can I send it to your email? Do you have an email address?

xyq019971 commented 5 years ago

> Could you paste the complete code? I need to check whether the part that creates the data ops has a problem. Also, after the network is built, exe.run(main_program, fetch_list, feed) has to be called; I need to check whether something in that code makes the feed or fetch_list wrong.

Do you have an email address? Can I send it to you? The code is written in separate files, but I can also paste it again here. Please wait a moment.

SunGaofeng commented 5 years ago

sungaofeng@baidu.com

xyq019971 commented 5 years ago

```python
import paddle.fluid as fluid
import paddle
import numpy as np
import cv2
import time
import glob


def getlist():
    path = './data2/'
    trainlist = glob.glob(path + '/*.jpg')
    num = len(trainlist)
    return trainlist, num


def enettrain(trainlist, num, batchsize, w, h):
    arr = np.arange(num)
    np.random.shuffle(arr)
    trainlistv2 = []
    labellist = []
    arr = arr[0:batchsize]
    for i in arr:
        trainlistv2.append(trainlist[i])
        mid = trainlist[i]
        label = mid.replace(".jpg", "_bin.png")
        label1 = label.replace("data2", "save2")
        labellist.append(label1)
    trainimgs = []
    labelimgs = []
    print(trainlistv2)
    print(labellist)
    for i in trainlistv2:
        trainimg = cv2.imread(i)
        trainimg = cv2.resize(trainimg, (w, h))
        # trainimg = trainimg[400:1600, 800:2000, :]
        trainimg = trainimg.transpose((2, 0, 1))
        trainimgs.append(trainimg)
    for i in labellist:
        labelimg = cv2.imread(i)
        labelimg = cv2.resize(labelimg, (w, h))
        # labelimg = labelimg[400:1600, 800:2000, :]
        labelimg = labelimg.transpose((2, 0, 1))
        labelimgs.append(labelimg[0, :, :])
    trainimgs = np.asarray(trainimgs, np.float32)
    labelimgs = np.asarray(labelimgs, np.float32)
    labelimgs = labelimgs[:, np.newaxis, :, :]
    return trainimgs, labelimgs


#####
# Initial block: strided conv concatenated with a max pool; output is 1/2 the input size
#####
def initial(x):
    conv2d = fluid.layers.conv2d(x, num_filters=13, filter_size=[3, 3],
                                 stride=[2, 2], padding=1, groups=1)
    temp = fluid.layers.pool2d(x, pool_size=[3, 3], pool_type="max",
                               pool_stride=[2, 2], pool_padding=1)
    y = paddle.fluid.layers.concat(input=[temp, conv2d], axis=1, name=None)
    return y


#####
# Downsampling: the output is half the size of the input
#####
def downsampling(x, f):
    y = fluid.layers.conv2d(x, num_filters=f, filter_size=[3, 3],
                            stride=[2, 2], padding=1, groups=1)
    return y


#####
# Plain bottleneck: arg 1 is the input, arg 2 the channel count; output size unchanged
#####
def block1(x, f):
    y = fluid.layers.conv2d(x, num_filters=int(f / 4), filter_size=[1, 1], stride=[1, 1], groups=1)
    y1 = paddle.fluid.layers.prelu(y, mode='channel')
    y2 = fluid.layers.conv2d(y1, num_filters=int(f / 4), filter_size=[3, 3], stride=[1, 1], padding=1, groups=1)
    y3 = paddle.fluid.layers.prelu(y2, mode='channel')
    y4 = fluid.layers.conv2d(y3, num_filters=int(f), filter_size=[1, 1], stride=[1, 1], groups=1)
    y5 = fluid.layers.batch_norm(y4, momentum=0.95, epsilon=1e-5)

    y6 = fluid.layers.conv2d(x, num_filters=int(f), filter_size=[1, 1], stride=[1, 1], groups=1)
    y7 = fluid.layers.batch_norm(y6, momentum=0.95, epsilon=1e-5)
    y8 = y5 + y7
    y9 = paddle.fluid.layers.prelu(y8, mode='channel')
    return y9


#####
# Dilated bottleneck: input, channel count, dilation, padding; output size unchanged
#####
def block2(x, f, d, p):
    y = fluid.layers.conv2d(x, num_filters=int(f / 4), filter_size=[1, 1], stride=[1, 1], groups=1)
    y1 = paddle.fluid.layers.prelu(y, mode='channel')
    y2 = fluid.layers.conv2d(y1, num_filters=int(f / 4), filter_size=[3, 3], dilation=d, stride=[1, 1], padding=p, groups=1)
    y3 = paddle.fluid.layers.prelu(y2, mode='channel')
    y4 = fluid.layers.conv2d(y3, num_filters=int(f), filter_size=[1, 1], stride=[1, 1], groups=1)
    y5 = fluid.layers.batch_norm(y4, momentum=0.95, epsilon=1e-5)

    y6 = fluid.layers.conv2d(x, num_filters=int(f), filter_size=[1, 1], stride=[1, 1], groups=1)
    y7 = fluid.layers.batch_norm(y6, momentum=0.95, epsilon=1e-5)
    y8 = y5 + y7
    y9 = paddle.fluid.layers.prelu(y8, mode='channel')
    return y9


#####
# Asymmetric bottleneck: input, channel count, kernel size, padding; output size unchanged
#####
def block3(x, f, size, pad):
    y = fluid.layers.conv2d(x, num_filters=int(f / 4), filter_size=[1, 1], stride=[1, 1], groups=1)
    y1 = paddle.fluid.layers.prelu(y, mode='channel')
    y2 = fluid.layers.conv2d(y1, num_filters=int(f / 4), filter_size=size, padding=pad, stride=[1, 1], groups=1)
    y3 = paddle.fluid.layers.prelu(y2, mode='channel')
    y4 = fluid.layers.conv2d(y3, num_filters=int(f), filter_size=[1, 1], stride=[1, 1], groups=1)
    y5 = fluid.layers.batch_norm(y4, momentum=0.95, epsilon=1e-5)

    y6 = fluid.layers.conv2d(x, num_filters=int(f), filter_size=[1, 1], stride=[1, 1], groups=1)
    y7 = fluid.layers.batch_norm(y6, momentum=0.95, epsilon=1e-5)
    y8 = y5 + y7
    y9 = paddle.fluid.layers.prelu(y8, mode='channel')
    return y9


#####
# 2x upsampling: input, channel count
#####
def upsampling(x, f):
    conv2d = fluid.layers.conv2d_transpose(x, f, filter_size=2, stride=2)
    return conv2d


# Data layers (Python to Paddle)
x = fluid.layers.data(name='x', shape=[3, 200, 200], dtype='float32')
y = fluid.layers.data(name='y', shape=[1], dtype='int64')

#####
# Network structure
#####
u = initial(x)                        # initial block, output downsampled to 1/2
z = downsampling(u, 64)               # 64 channels, downsample to 1/2
b1 = block1(z, 64)                    # 64 channels
b2 = block1(b1, 64)
b3 = block1(b2, 64)
b4 = block1(b3, 64)
b5 = downsampling(b4, 64)             # 64 channels, downsample to 1/2
b6 = block1(b5, 128)                  # 128 channels
b7 = block2(b6, 128, 2, 2)            # dilation 2, padding 2
b8 = block3(b7, 128, [5, 1], [2, 0])  # kernel size [5, 1], padding [2, 0]
b9 = block2(b8, 128, 4, 4)            # dilation 4, padding 4
b10 = block1(b9, 128)
b11 = block2(b10, 128, 8, 8)          # dilation 8, padding 8
b12 = block3(b11, 128, [1, 5], [0, 2])  # kernel size [1, 5], padding [0, 2]
b13 = block2(b12, 128, 16, 16)        # dilation 16, padding 16
b14 = upsampling(b13, 64)             # 2x upsample
b15 = block1(b14, 64)
b16 = block1(b15, 64)
b17 = upsampling(b16, 16)             # 2x upsample
b18 = block1(b17, 16)                 # 16 channels
b19 = upsampling(b18, 9)              # 2x upsample

#####
# Loss function and optimizer
#####
'''
out1 = fluid.layers.reshape(b19, shape=[-1, 1])
out2 = fluid.layers.reshape(y, shape=[-1, 1])
loss = paddle.fluid.layers.square_error_cost(out1, out2)
'''

b22 = fluid.layers.transpose(b19, perm=[0, 2, 3, 1])
out1 = fluid.layers.reshape(b22, shape=[-1, 9])
out2 = fluid.layers.reshape(y, shape=[-1, 1])
out3 = paddle.fluid.layers.softmax(out1)
_, out4 = fluid.layers.topk(out3, k=1)
out5 = fluid.layers.cast(out4, dtype='float32')

loss = paddle.fluid.layers.square_error_cost(out5, out2)
avg = fluid.layers.reduce_mean(loss)
regularizer = fluid.regularizer.L2Decay(0.0001)
optimizer = paddle.fluid.optimizer.AdamOptimizer(learning_rate=0.01, beta1=0.9,
                                                 beta2=0.999, epsilon=1e-08,
                                                 regularization=regularizer)
_, params_grads = optimizer.minimize(avg)

# Alternative optimizer, apparently an earlier experiment; defining a second
# optimizer and calling minimize again would add a second set of update ops:
# avg = fluid.layers.reduce_mean(loss)
# regularizer = fluid.regularizer.L2Decay(0.0001)
# optimizer = fluid.optimizer.Momentum(learning_rate=0.1, momentum=0.9,
#                                      regularization=regularizer)
# _, params_grads = optimizer.minimize(avg, no_grad_set=['y'])

place = fluid.CPUPlace()  # or fluid.CUDAPlace(0) for GPU
exe = fluid.executor.Executor(place)

# Parameter initialization
exe.run(fluid.default_startup_program())
main_program = fluid.default_main_program()

trainlist, num = getlist()

for i in range(10):
    start = time.time()
    img, label = enettrain(trainlist, num, 1, 200, 200)
    b, c, d = exe.run(program=main_program,
                      feed={'x': img, 'y': label},
                      fetch_list=[b19, out2, avg])  # Paddle to Python
    end = time.time()
    end = end - start
    print("Pass:%d, Cost:%0.5f, time:%f" % (i, d, end))
    if i % 2 == 0:
        fluid.io.save_inference_model(dirname='./weight/' + str(i) + '/',
                                      feeded_var_names=['x'],
                                      target_vars=[b19],
                                      executor=exe,
                                      main_program=main_program)
    b = np.asarray(b, np.int32)
    c = np.asarray(c, np.float32)
    print(b.shape)
    print(c.shape)
    print(d.shape)
```

SunGaofeng commented 5 years ago

`_, out4 = fluid.layers.topk(out3, k=1)` uses the topk op here, and the topk op in Paddle currently has no backward. So no gradients are computed for out4 or for any variable before it, which is why the network is not updated.
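A minimal sketch of one way around this, assuming `b19` and `y` from the code above and that `y` holds int64 class indices: keep only differentiable ops in the loss path (softmax plus cross_entropy instead of square error on the topk result), and use topk for prediction only. This is an illustration of the workaround, not an official fix:

```python
import paddle.fluid as fluid

# Keep only differentiable ops in the loss path; topk (which has no backward)
# is used for prediction only, so gradients can flow back into the network.
b22 = fluid.layers.transpose(b19, perm=[0, 2, 3, 1])
logits = fluid.layers.reshape(b22, shape=[-1, 9])   # one row per pixel, 9 classes
label = fluid.layers.reshape(y, shape=[-1, 1])      # int64 class index per pixel (assumed)
prob = fluid.layers.softmax(logits)
loss = fluid.layers.cross_entropy(prob, label)      # has a backward, unlike topk
avg = fluid.layers.reduce_mean(loss)

# Prediction only; nothing needs to be backpropagated through topk here:
_, pred = fluid.layers.topk(prob, k=1)
```

fluid.layers.softmax_with_cross_entropy(logits, label) would combine the softmax and the cross entropy into a single op, which is the numerically safer variant.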