alyato closed this issue 6 years ago
Well, no loss is guaranteed to improve results on an arbitrary dataset, especially since I don't know whether your dataset suits the 1024 dim and 8 centers you are using, or whether it works well with SGD, and so on; there are too many variables. My example only demonstrates that the data can form clusters.
If you can't tell why center loss is ineffective, the only suggestion I have for now is to use t-SNE to plot your 1024-dim features and check whether they cluster after applying center loss. If they don't, it is quite possible that your data is too unusual to be modeled with this loss.
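As a concrete sketch of that t-SNE check (assuming scikit-learn is available; the features and labels below are random placeholders, not the real 1024-dim activations from the model):

```python
import numpy as np
from sklearn.manifold import TSNE

# Placeholder for the 1024-dim features taken from the layer before the classifier;
# in practice they would come from model.predict on a feature-output sub-model.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 1024)).astype(np.float32)
labels = rng.integers(0, 8, size=200)  # 8 disease classes

# Project to 2-D for plotting; perplexity must be smaller than the sample count.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)
print(embedding.shape)  # (200, 2)

# Scatter each class in its own color (e.g. with matplotlib) and check whether
# samples of the same class form tight clusters after training with center loss.
```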
Thanks. About the meaning of 8 and 1024: my dataset is entirely face images, and each face corresponds to one disease. There are 8 diseases in total, so the classification has 8 classes, and the Dense layer in the network structure is set to 1024 dim.
I have a few guesses. First, center loss was originally designed for face verification, which means that with a VGG-like network the final features probably capture the appearance of the whole face, while a disease class may not change the facial information very much. Your center loss may be clustering different faces (the disease classes map to different people's faces, so even with the center-loss target the clustering may follow identity) rather than clustering the diseases themselves; disease features are probably closer to texture, so this loss may not be a good fit.
My training accuracy and validation accuracy are both good, but the accuracy on the test set is only 60%, and testing with the training images gives only 56%. I can't see where the problem is.
Training network
def Net(image, labels, classes):
    ## prepare nets ##
    base_model = VGG16(input_tensor=image, include_top=False, weights='imagenet')
    for layer in base_model.layers[:15]:
        layer.trainable = False
    x = base_model.output
    x = Convolution2D(1024, (1, 1), padding='same', name='add_conv', kernel_initializer='TruncatedNormal')(x)
    x = BatchNormalization(name='add_normalization')(x)
    x = Activation('relu', name='add_activation')(x)
    x = GlobalAveragePooling2D()(x)
    x = Dense(1024, activation='relu', name='add_fc1', kernel_initializer='TruncatedNormal')(x)
    x = Dropout(rate=0.5)(x)
    x = Dense(1024, activation='relu', name='add_fc2', kernel_initializer='TruncatedNormal')(x)
    x = Dropout(rate=0.5)(x)
    logits = Dense(classes, activation='softmax', name='predictions')(x)
    feature = Dense(256, name='add_centerloss_fc', kernel_initializer='TruncatedNormal')(x)
    feature = PReLU(name='PRelu')(feature)
    centers = Embedding(classes, 256)(labels)
    l2_loss = Lambda(lambda x: K.sum(K.square(x[0] - x[1][:, 0]), 1, keepdims=True), name='l2_loss')([feature, centers])
    return logits, l2_loss
def build_network(images, labels, classes):
    # build all networks / load weights / plot model #
    logits, l2_loss = Net(images, labels, classes)
    model = Model(inputs=[images, labels], outputs=[logits, l2_loss])
    model.summary()
    try:
        model.load_weights('./simple_center_tmp/.h5', by_name=True)
        print('Successfully loaded initial weights!')
    except:
        print('Could not load model weights!')
    plot_model(model, to_file='./simple_center_log/centerloss_model.png', show_shapes=True)
    return model
images = Input(shape=(224, 224, 3))
labels = Input(shape=(586,))
dummy = np.zeros((C.batch_size, 1))
dummy_test = np.zeros((C.batch_size, 1))
model = build_network(images=images, labels=labels, classes=C.classes)
model.compile(optimizer=Adam(), loss=["categorical_crossentropy", lambda y_true, y_pred: y_pred],
              loss_weights=[1, C.lambda_centerloss], metrics=['accuracy'])
for e in range(C.n_epochs):
    print('Training... Epoch', e)
    for n in range(int(train_temp.shape[0] / C.batch_size)):
        X_batch, Y_batch = imagehelper.generate_one_from_list(temp=train_temp, means=C.means)
        Y_batch_onehot = np_utils.to_categorical(Y_batch, C.classes)
        generated_images.fit(X_batch)
        gen = generated_images.flow(X_batch, Y_batch, batch_size=C.batch_size, shuffle=False)
        X_batch, Y_batch = next(gen)
        centerloss = model.train_on_batch([X_batch, Y_batch_onehot], [Y_batch_onehot, dummy], class_weight=train_weight)
        C.loss.append(centerloss)
        print('Epoch:{} batch:{}/{} total_loss:{} softmax_loss:{} center_loss:{} softmax_acc:{} center_acc:{}'
              .format(e, n, int(train_temp.shape[0] / C.batch_size), centerloss[0], centerloss[1], centerloss[2], centerloss[3], centerloss[4]))
        write_log(callback, train_names, centerloss, len(C.loss))
    if e % 2 == 0 and e != 0:
        model.save_weights('./simple_center_tmp/lambda{} centerloss epoch {} {}.h5'.format(C.lambda_centerloss, e, time.ctime()))
    # validation model #
    if C.train_and_val:
        val_softmaxlosslist = []
        val_centerlosslist = []
        val_totallosslist = []
        val_softmaxacclist = []
        val_centeracclist = []
        for n in range(int(val_temp.shape[0] / C.batch_size)):
            X_val_batch, Y_val_batch = imagehelper.generate_one_from_list(val_temp, means=C.means)
            generated_images.fit(X_val_batch)
            gen = generated_images.flow(X_val_batch, Y_val_batch, batch_size=C.batch_size, shuffle=False)
            X_val_batch, Y_val_batch = next(gen)
            Y_val_batch = np_utils.to_categorical(Y_val_batch, C.classes)
            val_loss1 = model.test_on_batch([X_val_batch, Y_val_batch], [Y_val_batch, dummy_test])
            print('Epoch:{} batch:{}/{} val_totalloss:{} val_softmaxloss:{} val_centerloss:{} val_softmaxacc:{} val_centeracc:{}'.format(e, n, int(val_temp.shape[0] / C.batch_size), val_loss1[0], val_loss1[1], val_loss1[2], val_loss1[3], val_loss1[4]))
            val_softmaxlosslist.append(val_loss1[1])
            val_centerlosslist.append(val_loss1[2])
            val_totallosslist.append(val_loss1[0])
            val_softmaxacclist.append(val_loss1[3])
            val_centeracclist.append(val_loss1[4])
        val_centerloss = np.mean(np.array(val_centerlosslist))
        val_softmaxloss = np.mean(np.array(val_softmaxlosslist))
        val_totalloss = np.mean(np.array(val_totallosslist))
        val_softmaxacc = np.mean(np.array(val_softmaxacclist))
        val_centeracc = np.mean(np.array(val_centeracclist))
        val = np.array([val_totalloss, val_softmaxloss, val_centerloss, val_softmaxacc, val_centeracc])
        print('Epoch:{} val_softmaxloss:{} val_centerloss:{} val_totalloss:{} val_softmaxacc:{} val_centeracc:{}'.format(e, val_softmaxloss, val_centerloss, val_totalloss, val_softmaxacc, val_centeracc))
        write_log(callback, val_names, val, e + 1)
model.save_weights('./simple_center_tmp/centerloss {}.h5'.format(time.ctime()))
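The two tricks used in the training script above — the Lambda layer that measures the squared distance from each feature to its class center (looked up from the Embedding layer), and the dummy zero target fed to the `lambda y_true, y_pred: y_pred` loss — can be illustrated in plain NumPy. This is only a sketch with toy random values, not the actual Keras computation graph:

```python
import numpy as np

batch, dim, classes = 4, 256, 8
rng = np.random.default_rng(0)

features = rng.normal(size=(batch, dim))   # output of the 256-dim add_centerloss_fc head
centers = rng.normal(size=(classes, dim))  # the Embedding layer's weight matrix, one row per class
y = rng.integers(0, classes, size=batch)   # integer (not one-hot) class labels

# What the Lambda layer computes: squared L2 distance from each feature
# to its own class center, kept as a (batch, 1) column vector.
l2_loss = np.sum(np.square(features - centers[y]), axis=1, keepdims=True)
print(l2_loss.shape)  # (4, 1)

# The dummy zero target is ignored: the custom loss returns y_pred unchanged,
# so training minimizes the mean distance, scaled by loss_weights[1].
dummy = np.zeros((batch, 1))
center_term = np.mean(l2_loss)
```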
Testing network
############ define a classification network ##############
def Net(image, labels, classes):
    ## prepare nets ##
    base_model = VGG16(input_tensor=image, include_top=False)
    x = base_model.output
    x = Convolution2D(1024, (1, 1), padding='same', name='add_conv', kernel_initializer='TruncatedNormal')(x)
    x = BatchNormalization(name='add_normalization')(x)
    x = Activation('relu', name='add_activation')(x)
    x = GlobalAveragePooling2D()(x)
    x = Dense(1024, activation='relu', name='add_fc1', kernel_initializer='TruncatedNormal')(x)
    x = Dropout(rate=0.5)(x)
    x = Dense(1024, activation='relu', name='add_fc2', kernel_initializer='TruncatedNormal')(x)
    x = Dropout(rate=0.5)(x)
    logits = Dense(classes, activation='softmax', name='predictions')(x)
    feature = Dense(256, name='add_centerloss_fc', kernel_initializer='TruncatedNormal')(x)
    feature = PReLU(name='PRelu')(feature)
    centers = Embedding(classes, 256)(labels)
    l2_loss = Lambda(lambda x: K.sum(K.square(x[0] - x[1][:, 0]), 1, keepdims=True), name='l2_loss')([feature, centers])
    return logits, l2_loss
def build_network(images, labels, classes):
    # build all networks / load weights / plot model #
    logits, l2_loss = Net(images, labels, classes)
    model = Model(inputs=[images, labels], outputs=[logits, l2_loss])
    model.summary()
    # test model #
    #test_model = Model(inputs=images, outputs=logits)
    #test_model.summary()
    #predict_model = Model(inputs=images, outputs=test_model.get_layer('add_dropout2').output)
    try:
        model.load_weights('./simple_center_tmp/lambda0.1 centerloss epoch 50 Mon May 28 08:53:55 2018.h5')
        print('Successfully loaded initial weights!')
    except:
        print('Could not load model weights!')
    return model
################## define networks end #####################
images = Input(shape=(224, 224, 3))
labels = Input(shape=(586,))
model = build_network(images=images, labels=labels, classes=C.classes)
model.compile(optimizer=Adam(),
              loss=['categorical_crossentropy', lambda y_true, y_pred: y_pred],
              metrics=['accuracy'])
# validation model #
if C.test:
    val_softmaxlosslist = []
    val_softmaxacclist = []
    dummy_test = np.zeros((C.batch_size, 1))
    for n in range(int(val_temp.shape[0] / C.batch_size)):
        X_val_batch, Y_val_batch = Gen.generate_one_from_list(val_temp, means=C.valmeans)
        Y_val_batch = np_utils.to_categorical(Y_val_batch, C.classes)
        #val_loss1 = test_model.test_on_batch(X_val_batch, Y_val_batch)
        val_loss1 = model.test_on_batch([X_val_batch, Y_val_batch], [Y_val_batch, dummy_test])
        print(val_loss1)
        #pre = model.predict_on_batch(X_val_batch)
        print('batch:{}/{} val_softmaxloss:{} val_softmaxacc:{}'.format(n, int(val_temp.shape[0] / C.batch_size), val_loss1[1], val_loss1[3]))
        val_softmaxlosslist.append(val_loss1[1])
        val_softmaxacclist.append(val_loss1[3])
    val_softmaxloss = np.mean(np.array(val_softmaxlosslist))
    val_softmaxacc = np.mean(np.array(val_softmaxacclist))
    print('TEST: val_softmaxloss:{} val_softmaxacc:{}'.format(val_softmaxloss, val_softmaxacc))
@liuchuanloong
Thanks. I'm looking into the problem now; shamangary has also given some suggestions.
And thank you for sharing your code.
Your code is different from mine; I'd like to run my own dataset through your network structure to see whether it improves.
Looking at your code, I'm a bit confused about one thing:
labels = Input(shape=(586,))
Why is this 586?
There are 586 label classes in total, so it is a 586-row, 1-column tensor; the batch_size doesn't need to be written. I'm not sure whether my test code has a problem: the test results differ greatly from the training and validation results. If you solve it, could you share the fix?
@liuchuanloong I think this differs from the code in TYY_mnist.py:
input_target = Input(shape=(1,))  # single value ground truth labels as inputs
centers = Embedding(10, 2)(input_target)
while yours is labels = Input(shape=(586,)), and I think it should be 1.
Not sure whether my understanding is correct.
@shamangary
Thanks.
Since Embedding takes single-value labels rather than one-hot labels, what @alyato said is correct.
https://github.com/keras-team/keras/issues/4981 discusses how to convert one-hot labels into single values; either np.argmax or K.argmax works.
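A minimal sketch of that conversion, using a toy one-hot batch rather than the actual training labels:

```python
import numpy as np

# A toy batch of one-hot labels over 8 classes.
one_hot = np.array([
    [0, 0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
])

# argmax over the class axis recovers one integer label per sample; reshaping
# to (batch, 1) matches the Input(shape=(1,)) the Embedding lookup expects.
labels = np.argmax(one_hot, axis=1).reshape(-1, 1)
print(labels.ravel().tolist())  # [2, 0, 7]
```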
I fixed the mistake you pointed out, but the test result is still only 60%, while the training and validation results both reach 80%; testing directly on the training set also gives 60%. I suspect my test network has a problem, but I can't see where. Please advise.
Sorry, I can't see it either. In theory, if even the validation results are good, testing shouldn't be much worse. I suggest you check whether something is wrong with your dataset before it is fed in, and use t-SNE dimensionality reduction to plot whether the clusters of your testing features overlap with those of your training features. Otherwise, nothing can be seen from the code alone.
OK, thanks for the reminder, I found the problem.
@liuchuanloong
I don't know how you solved it.
I looked at your code and found it differs from mine; I wonder whether that is why my results are poor.
I already listed my code in my question; I'm copying it below now.
Please take a look at what is different.
My codes:
The network structure is as follows
base_model = VGG16(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = BatchNormalization(axis=1)(x)
ip1 = Dense(1024, activation='relu')(x)
predictions = Dense(8, activation='softmax')(ip1)
model = Model(input=base_model.input, output=predictions)
for layer in base_model.layers:
    layer.trainable = True
sgd = SGD(lr=0.0001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
That is the model.compile call.
Then I define the center_loss function, as follows:
isCenterloss = True
if isCenterloss:
    lambda_c = 0.2
    input_target = Input(shape=(1,))
    centers = Embedding(8, 1024)(input_target)
    l2_loss = Lambda(lambda x: K.sum(K.square(x[0] - x[1][:, 0]), 1, keepdims=True), name='l2_loss')([ip1, centers])
    model_centerloss = Model(inputs=[base_model.input, input_target], outputs=[predictions, l2_loss])
    sgd_1 = SGD(lr=0.00001, decay=1e-6, momentum=0.9, nesterov=True)
    model_centerloss.compile(optimizer=sgd_1,
                             loss=["categorical_crossentropy", lambda y_true, y_pred: y_pred],
                             loss_weights=[1, lambda_c], metrics=['accuracy'])
Below that, I defined model_centerloss.compile again.
After that comes the training process:
if isCenterloss:
    random_y_train = np.random.rand(x_train.shape[0], 1)
    random_y_test = np.random.rand(x_test.shape[0], 1)
    model_centerloss.fit([x_train, y_train_value], [y_train, random_y_train],
                         batch_size=batch_size, epochs=nb_epoch, verbose=1,
                         validation_data=([x_test, y_test_value], [y_test, random_y_test]))
I wrote it following the format of TYY_mnist.py.
@shamangary
@liuchuanloong I'd like to ask: since there are two compile calls, model.compile and model_centerloss.compile, could that have an effect? After all, SGD was defined and the learning rate set before model_centerloss.compile.
Thanks.
OK, thanks for the reminder, I found the problem.
Could you please share how you solved it? I'm also running into this problem now.
hi @shamangary I trained on my own dataset. The result on the test set is very poor, only 14%, and I found that dense_2_acc does not improve, reaching at most 19%. If I don't use center_loss, the accuracy on the test set is 50%.
Loading the dataset
this is the center_loss
isCenterloss = True
The prediction code is as follows:
I adjusted learning_rate = 0.00001, but the results are still poor and dense_2_acc does not improve. Could you tell me where the problem might be? Thanks.