I just implemented the loss; maybe you can give it a try.
```python
#!/usr/bin/python
# _*_ coding:utf8 _*_
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import functools

from keras import backend as K
import tensorflow as tf


def _center_loss_func(features, labels, alpha, num_classes):
    feature_dim = features.get_shape()[1]
    # Each output layer uses one independent center: scope/centers
    centers = K.zeros([num_classes, feature_dim])
    labels = K.reshape(labels, [-1])
    labels = tf.to_int32(labels)
    centers_batch = tf.gather(centers, labels)
    diff = (1 - alpha) * (centers_batch - features)
    centers = tf.scatter_sub(centers, labels, diff)
    loss = tf.reduce_mean(K.square(features - centers_batch))
    return loss


def get_center_loss(alpha, num_classes):
    """Center loss based on the paper "A Discriminative
    Feature Learning Approach for Deep Face Recognition"
    (http://ydwen.github.io/papers/WenECCV16.pdf)
    """
    @functools.wraps(_center_loss_func)
    def center_loss(y_true, y_pred):
        return _center_loss_func(y_pred, y_true, alpha, num_classes)
    return center_loss
```

Usage:

```python
center_loss = get_center_loss(0.5, num_classes)
model.compile(optimizer='sgd', loss=center_loss)
...
```
@wangxianliang `K.zeros([num_classes, feature_dim])` always resets the centers to zeros. With

```python
def _center_loss_func(features, labels, alpha, num_classes):
    feature_dim = features.get_shape()[1]
    centers = K.zeros([num_classes, feature_dim])
    centers = centers + 1
    loss = tf.reduce_mean(centers)
    return loss
```

the loss always outputs 1. Even

```python
centers = tf.get_variable('centersl', [num_classes, feature_dim], dtype=tf.float32,
                          initializer=tf.constant_initializer(0), trainable=False)
tf.get_variable_scope().reuse_variables()
```

does not work.
@fchollet How does Keras save a reused variable and let me update it myself according to the inputs and model predictions? I have tried layers (like the OCR example), loss functions, regularizers, and callbacks.
```python
def _center_loss_func(features, labels, alpha, num_classes,
                      centers, feature_dim):
    assert feature_dim == features.get_shape()[1]
    labels = K.reshape(labels, [-1])
    labels = tf.to_int32(labels)
    centers_batch = tf.gather(centers, labels)
    diff = (1 - alpha) * (centers_batch - features)
    centers = tf.scatter_sub(centers, labels, diff)
    loss = tf.reduce_mean(K.square(features - centers_batch))
    return loss


def get_center_loss(alpha, num_classes, feature_dim):
    """Center loss based on the paper "A Discriminative
    Feature Learning Approach for Deep Face Recognition"
    (http://ydwen.github.io/papers/WenECCV16.pdf)
    """
    # Each output layer uses one independent center: scope/centers
    centers = K.zeros([num_classes, feature_dim])

    @functools.wraps(_center_loss_func)
    def center_loss(y_true, y_pred):
        return _center_loss_func(y_pred, y_true, alpha,
                                 num_classes, centers, feature_dim)
    return center_loss
```
@wangxianliang Does it update the centers during testing as well?
The implementation above calculates the center loss on the model's output layer, but how do you get the center loss on a feature layer?
@wyxpku If you want both the center loss and the cross-entropy loss, you can add the feature layer as an output of the model.
```python
# the 'features' layer should be defined in the network
model = Model(X, [y, features], name=name)
```

In this case you have to provide a target for each output, so I cloned the label output:

```python
def clone_y_generator(generator):
    # output: train_gen_X, [train_gen_Y, train_gen_Y]
    while True:
        data = next(generator)
        x = data[0]
        y = [data[1], data[1]]
        yield x, y
```

Usage:

```python
train_gener = train_gen.flow_from_directory(train_dir, ...)
self.model.fit_generator(clone_y_generator(train_gener), ...)
```
Thank you for your answer @JihoonJ
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.
@JihoonJ @wyxpku
Just want to confirm: if I put the center loss on the feature layer, will the centers get updated at every step?
No, it doesn't seem to work. I did some tests and this center-loss implementation doesn't update the centers.
The following op needs to be executed after every update:

```python
centers = tf.scatter_sub(centers, labels, diff)
```

E.g. as in this TensorFlow implementation of the center loss:
https://github.com/EncodeTS/TensorFlow_Center_Loss/blob/master/mnist_sample_code/mnist_with_center_loss.ipynb
It is done like this (see the section "Optimizer"):

```python
with tf.control_dependencies([centers_update_op]):
    train_op = optimizer.minimize(total_loss, global_step=global_step)
```
My question therefore is: Can this be done somehow with keras?
Thank you very much
I've tested the center loss implementation given by @wangxianliang with MNIST and it works somehow, since the results are quite different from the results of a model that uses only cross-entropy loss. It's possible to see in the image below that the different classes are indeed clustered around their corresponding centers in the 2D space.
However, I'm facing problems with this implementation when trying to resume training from a saved checkpoint, because I can't get the values of the centers after training, which are needed to instantiate a center loss function with non-zero centers. Any ideas on how to get the values after training and save them?
Another possibility would be to move the centers variable to a layer or an optimizer, but I have no clue how to do it.
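One possible way to do this (a minimal, untested sketch, not from this thread): under the TensorFlow backend `K.zeros` creates a backend variable, so if the loss factory is modified to also return its `centers` variable, the centers can be read with `K.get_value` and restored with `K.set_value`. The helper name below is made up.

```python
import functools
import numpy as np
from keras import backend as K

def get_center_loss_and_centers(alpha, num_classes, feature_dim):
    """Hypothetical variant of get_center_loss above that also exposes the
    centers variable so it can be saved and restored (untested sketch)."""
    centers = K.zeros([num_classes, feature_dim], dtype='float32')

    @functools.wraps(_center_loss_func)
    def center_loss(y_true, y_pred):
        return _center_loss_func(y_pred, y_true, alpha, num_classes,
                                 centers, feature_dim)
    return center_loss, centers

# usage sketch:
# center_loss, centers_var = get_center_loss_and_centers(0.5, num_classes, feature_dim)
# ... compile and train the model with center_loss ...
# np.save('centers.npy', K.get_value(centers_var))    # save the centers after training
# K.set_value(centers_var, np.load('centers.npy'))    # restore them before resuming
```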
@kutoga this can be solved if you define the loss not as a function but as your own layer, subclassing keras.engine.topology.Layer. In that case, when you call tf.scatter_sub (or K.update_sub if you prefer the more Keras-style approach) you obtain an operation that can be placed in the computational graph using the self.add_update method in your layer implementation.
This will update your centers, but since your loss is now actually a Layer, you need to provide an additional dummy loss function that simply passes the value from the center-loss layer forward, so that the gradients are calculated properly for the whole network.
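For illustration, a rough, untested sketch of that layer-based approach (the names `CenterLossLayer` and `identity_loss` are made up here; it assumes the TensorFlow backend and the Keras 2 `add_weight`/`add_update` API):

```python
import tensorflow as tf
from keras import backend as K
from keras.engine.topology import Layer


class CenterLossLayer(Layer):
    """Takes [features, one_hot_labels] and outputs the per-sample center loss.

    The class centers live in a non-trainable weight; the update op created by
    tf.scatter_sub is registered with self.add_update so Keras runs it on every
    training step.
    """

    def __init__(self, num_classes, alpha=0.5, **kwargs):
        super(CenterLossLayer, self).__init__(**kwargs)
        self.num_classes = num_classes
        self.alpha = alpha

    def build(self, input_shape):
        feature_dim = input_shape[0][-1]
        self.centers = self.add_weight(name='centers',
                                       shape=(self.num_classes, feature_dim),
                                       initializer='zeros',
                                       trainable=False)
        super(CenterLossLayer, self).build(input_shape)

    def call(self, inputs):
        features, labels = inputs
        labels = K.cast(K.argmax(labels, axis=1), 'int32')
        centers_batch = K.gather(self.centers, labels)
        diff = (1 - self.alpha) * (centers_batch - features)
        # register the center update as part of the graph
        self.add_update(tf.scatter_sub(self.centers, labels, diff), inputs)
        return K.mean(K.square(features - centers_batch), axis=1, keepdims=True)

    def compute_output_shape(self, input_shape):
        return (input_shape[0][0], 1)


def identity_loss(y_true, y_pred):
    """Dummy loss that just forwards the value produced by CenterLossLayer."""
    return K.mean(y_pred, axis=-1)
```

The model would then expose this layer (fed with the feature tensor and the label input) as an extra output, compiled with `identity_loss` for that output and trained against a dummy target such as zeros.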
@fabiocapsouza, could you please share the exact code of how you got this working?
@kjanjua26 unfortunately I can't share the exact code because it belongs to the company I worked for at the time. But I used the functions that @wangxianliang gave in the 3rd post of this thread.
@fabiocapsouza, I used the exact same functions, but the issue is that the first code he wrote gives a shape error: 128,10 vs 1280,10. I am not sure how to resolve that.
@kjanjua26 I had the same problem. I found that if you're using categorical labels (one-hot encoding), changing the `labels = K.reshape(labels, [-1])` line to `labels = tf.argmax(labels, axis=1)` fixed it.
Closing as this is resolved
Thank you very much! But I don't know why?
Because `labels` comes from the "Y" target you're using. Since this is a classification problem, it is one-hot encoded. What `_center_loss_func` really receives is:
- features: the output of the layer before the softmax => X dimensions.
- labels: the one-hot encoding of your target; if you have 100 classes it will be a vector 0 ... 1 ... 0 with the 1 in the position of the class corresponding to that sample. For that reason, taking the position of the maximum value (the 1) tells you which class it is.
- alpha: controls the speed at which the centroids are updated.
- num_classes: the number of classes you use (it would be equivalent to labels.get_shape()[1]).
The problem I find with this kind of implementation is that the centers also move during the validation split. I don't know if there is any flag/control to know whether the loss function is being evaluated in the validation step.
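Regarding the validation issue: I'm not aware of a built-in flag, but one possible workaround (an untested sketch, not part of the code in this thread; the function name is made up) is to zero out the update outside the training phase via `K.in_train_phase`, so validation batches leave the centers untouched:

```python
def _center_loss_func_train_only(features, labels, alpha, num_classes, centers, feature_dim):
    """Variant of _center_loss_func that only moves the centers during training
    (untested sketch)."""
    assert feature_dim == features.get_shape()[1]
    labels = K.argmax(labels, axis=1)
    labels = tf.to_int32(labels)
    centers_batch = K.gather(centers, labels)
    diff = (1 - alpha) * (centers_batch - features)
    # zero update outside the training phase, so validation batches
    # do not move the centers
    diff = K.in_train_phase(diff, K.zeros_like(diff))
    centers = tf.scatter_sub(centers, labels, diff)
    centers_batch = K.gather(centers, labels)
    loss = K.mean(K.square(features - centers_batch))
    return loss
```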
I tried this for image scene classification, but the model shows no performance boost compared with the plain softmax loss. Looking forward to your reply!
It could be several reasons. I did the following: I added a line of code to make sure that the center_loss is always computed with the current centers (if I don't do that I have problems with the l2 norm).

```python
def _center_loss_func(features, labels, alpha, num_classes, centers, feature_dim):
    assert feature_dim == features.get_shape()[1]
    labels = K.argmax(labels, axis=1)
    labels = tf.to_int32(labels)
    centers_batch = K.gather(centers, labels)
    diff = (1 - alpha) * (centers_batch - features)
    centers = tf.scatter_sub(centers, labels, diff)
    # re-gather so the loss uses the freshly updated centers
    centers_batch = K.gather(centers, labels)
    loss = K.mean(K.square(features - centers_batch))
    return loss


def get_center_loss(alpha, num_classes, feature_dim):
    """Center loss based on the paper "A Discriminative
    Feature Learning Approach for Deep Face Recognition"
    (http://ydwen.github.io/papers/WenECCV16.pdf)
    """
    # Each output layer uses one independent center: scope/centers
    centers = K.zeros([num_classes, feature_dim], dtype='float32')

    @functools.wraps(_center_loss_func)
    def center_loss(y_true, y_pred):
        return _center_loss_func(y_pred, y_true, alpha, num_classes, centers, feature_dim)
    return center_loss
```

It may also be that you have to tune how much weight it has relative to the softmax loss; that can be done with loss_weights (0.01 for me).
For my part, I have added an l2_norm on the features, so that they are always on the same scale. It also makes the Euclidean distance and the cosine distance proportional (https://stats.stackexchange.com/questions/146221/is-cosine-similarity-identical-to-l2-normalized-euclidean-distance).
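As a concrete illustration of those two points (a sketch only: `image_input`, `x`, `num_classes` and `feature_dim` are assumed to come from whatever network you are building, and the layer names here are illustrative):

```python
from keras import backend as K
from keras.layers import Dense, Lambda
from keras.models import Model

# l2-normalize the feature vector so all features live on the same scale
features = Lambda(lambda t: K.l2_normalize(t, axis=1), name='features')(x)
predictions = Dense(num_classes, activation='softmax', name='predictions')(features)

model = Model(inputs=image_input, outputs=[predictions, features])
model.compile(optimizer='sgd',
              loss={'predictions': 'categorical_crossentropy',
                    'features': get_center_loss(0.5, num_classes, feature_dim)},
              # keep the center loss small relative to the softmax loss
              loss_weights={'predictions': 1.0, 'features': 0.01})
```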
I may have a problem creating the VGG16 model. Computing the center loss requires the output of the fully connected feature layer. My code is written like this, but I don't know where the error is; can you help me?

```python
model = VGG16(input_tensor=image_input, include_top=False, weights='imagenet')
model.summary()
last_layer = model.layers[-1].output
x = Flatten(name='flatten')(last_layer)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(num_classes, activation='softmax', name='predication')(x)
custom_vgg_model = Model(inputs=image_input, outputs=x)
custom_vgg_model.summary()

for layer in custom_vgg_model.layers[:-1]:
    layer.trainable = False
custom_vgg_model.layers[3].trainable

sgd = optimizers.SGD(lr=learn_Rate, decay=decay_Rate, momentum=0.9, nesterov=True)
total_loss = center_loss(alpha=0.5, lambda_c=0.01, num_classes=num_classes)
custom_vgg_model.compile(loss="categorical_crossentropy", optimizer=sgd, metrics=['accuracy'])
```
You must include the center_loss within the losses, otherwise it will not take effect. You must also specify which layer you are interested in, i.e. which one holds your "feature vectors". It would be something similar to this (for my part, I prefer to always add an l2_norm on the features to control the scale):

```python
model = VGG16(input_tensor=image_input, include_top=False, weights='imagenet')
model.summary()
last_layer = model.layers[-1].output
x = Flatten(name='flatten')(last_layer)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
features = x
x = Dense(num_classes, activation='softmax', name='predication')(x)
custom_vgg_model = Model(inputs=image_input, outputs=[x, features])
custom_vgg_model.summary()

for layer in custom_vgg_model.layers[:-1]:
    layer.trainable = False
custom_vgg_model.layers[3].trainable

sgd = optimizers.SGD(lr=learn_Rate, decay=decay_Rate, momentum=0.9, nesterov=True)
total_loss = center_loss(alpha=0.5, lambda_c=0.01, num_classes=num_classes)
custom_vgg_model.compile(loss={'predication': "categorical_crossentropy", 'fc2': total_loss},
                         loss_weights={'fc2': 1, 'predication': 1},
                         optimizer=sgd,
                         metrics={'predication': 'accuracy'})
```

You should play a little with the weights of both losses to control how much the clusters are compressed.
When you train, you will see that the model has two outputs; don't worry, the second one has no use by itself. The solution is to send the replicated Y, that is: if you use fit, call fit(X, [Y, Y]); if you use a generator, yield Xdata, [Ydata, Ydata].
I hope this has been helpful for you.
I have a new problem when calling model.fit(); the error is: ValueError: The model expects 2 target arrays, but only received one array. Found: array with shape (25200, 45). Hope to receive your reply.

```python
model = VGG16(input_tensor=imageinput, include_top=True, weights='imagenet')
last_layer = model.get_layer('fc2').output
feature = last_layer
out = Dense(num_classes, activation='softmax', name='predictions')(last_layer)
custom_vgg_model = Model(inputs=image_input, outputs=[out, feature])

for layer in custom_vgg_model.layers[:-3]:
    layer.trainable = False
custom_vgg_model.layers[3].trainable

sgd = optimizers.SGD(lr=lr, decay=decay, momentum=0.9, nesterov=True)
custom_vgg_model.compile(loss={'predictions': "categorical_crossentropy", 'fc2': "total_loss"},
                         loss_weights={'fc2': 1, 'predictions': 1},
                         optimizer=sgd,
                         metrics={'predictions': 'accuracy'})

hist = custom_vgg_model.fit(x=X_train, y=y_train, batch_size=batch_Sizes, epochs=epoch_Times,
                            verbose=1, validation_data=(X_test, y_test))
```
```python
# losses.py
def get_center_loss(labels, features, alpha, lambda_c, num_classes):
    len_features = features.get_shape()[1]
    try:
        with tf.variable_scope('v_center', reuse=True):
            centers = tf.get_variable('centers', [num_classes, len_features], dtype=tf.float32,
                                      initializer=tf.constant_initializer(0), trainable=False)
    except:
        with tf.variable_scope('v_center', reuse=False):
            centers = tf.get_variable('centers', [num_classes, len_features], dtype=tf.float32,
                                      initializer=tf.constant_initializer(0), trainable=False)
    labels = tf.argmax(labels, axis=1)
    labels = tf.to_int64(labels)
    center_loss = tf.reduce_mean(tf.square(features - centers_batch))
    diffs = (features[:, tf.newaxis] - centers_batch[tf.newaxis, :])
    # update
    centers_update_op = tf.scatter_sub(centers, labels, diff)  # diff is used to get updated centers.
    with tf.control_dependencies([centers_update_op]):
        # combo_loss = value_factor * center_loss + new_factor * git_loss
        combo_loss = lambda_c * center_loss
    return combo_loss


def total_loss(y_true, y_pred):
    center_loss = get_center_loss(y_true, y_pred, alpha=0.5, lambda_c=0.01, num_classes=45)
    return center_loss
```
When applying the center loss, what you are really doing is forcing the network to return 2 targets (with this method):
- the first target of the network refers to the softmax (prediction) and the classification problem;
- the second target is your features (fc2) and the image-retrieval problem that you really want to solve.
When you call .fit, you must send two "Y", because one will work on target 1 and the other on target 2.
What happens is that the center_loss needs to know which class each sample belongs to (to correct only that centroid). For this reason you have to duplicate y_train and y_test:

```python
hist = custom_vgg_model.fit(x=X_train, y=[y_train, y_train], batch_size=batch_Sizes,
                            epochs=epoch_Times, verbose=1,
                            validation_data=(X_test, [y_test, y_test]))
```

Once you want to generate the final model, you only have to do:

```python
final_model = Model(inputs=custom_vgg_model.inputs, outputs=custom_vgg_model.get_layer('fc2').output)
```
I did what you said, but there is a new problem:
ValueError: X (images tensor) and y (labels) should have the same length. Found: X.shape = (6300, 224, 224, 3), y.shape = (2, 6300, 45)
In addition, the features output by the fc2 layer and y_train do not match: y_train holds the labels, while the center-loss target is to be near the class center, which is updated inside losses.py.
Try sending the "Y" in the fit like this:

```python
y = {'fc2': y_train, 'predictions': y_train}
```

and the same in validation, but with y_test.
Is it possible that you are doing final_model.fit(...)? That would not be correct; you should train with custom_vgg_model.fit(...) and then, once the learning is done, convert it so that it has a single output.
Can you provide the code that you currently have?
Try using the following center loss to make sure the error is somewhere else.

```python
# Center Loss
def _center_loss_func(features, labels, alpha, num_classes, centers, feature_dim):
    assert feature_dim == features.get_shape()[1]
    labels = K.argmax(labels, axis=1)
    labels = tf.to_int32(labels)
    centers_batch = K.gather(centers, labels)
    diff = (1 - alpha) * (centers_batch - features)
    centers = tf.scatter_sub(centers, labels, diff)
    centers_batch = K.gather(centers, labels)
    loss = K.mean(K.square(features - centers_batch))
    return loss


def get_center_loss(alpha, num_classes, feature_dim):
    """Center loss based on the paper "A Discriminative
    Feature Learning Approach for Deep Face Recognition"
    (http://ydwen.github.io/papers/WenECCV16.pdf)
    """
    # Each output layer uses one independent center: scope/centers
    centers = K.zeros([num_classes, feature_dim], dtype='float32')

    @functools.wraps(_center_loss_func)
    def center_loss(y_true, y_pred):
        return _center_loss_func(y_pred, y_true, alpha, num_classes, centers, feature_dim)
    return center_loss
```
All the code is here. There are two problems:
1. In custom_vgg_model.fit(y = {'fc2': y_train, 'predictions': y_train}), the 'fc2': y_train entry gives the error
   ValueError: Error when checking target: expected fc2 to have shape (None, 4096) but got array with shape (6300, 45)
   y_train is the labels. If I instead do custom_vgg_model.fit(y = {'fc2': dummy1, 'predictions': y_train}), where dummy1 = np.zeros((y_train.shape[0], 4096)) has the same shape as the 'fc2' output (the features), the model trains successfully, but the accuracy does not improve, so that must be the wrong way to do it.
2. It is not possible to use ImageDataGenerator.flow(x = X_train, y = {'fc2': dummy1, 'predictions': y_train}, batch_size=batch_Sizes), so I can't augment my data.
Code:

```python
image_input = Input(shape=(224, 224, 3))
model = VGG16(input_tensor=image_input, include_top=True, weights='imagenet')
model.summary()
last_layer = model.get_layer('fc2').output
feature = last_layer
out = Dense(num_classes, activation='softmax', name='predictions')(last_layer)
custom_vgg_model = Model(inputs=image_input, outputs=[out, feature])
custom_vgg_model.summary()

for layer in custom_vgg_model.layers[:-3]:
    layer.trainable = False
custom_vgg_model.layers[3].trainable

sgd = optimizers.SGD(lr=learn_Rate, decay=decay_Rate, momentum=0.9, nesterov=True)
center_loss = lossclass.get_center_loss(alpha=0.5, num_classes=45, feature_dim=4096)
custom_vgg_model.compile(loss={'predictions': "categorical_crossentropy", 'fc2': center_loss},
                         loss_weights={'fc2': 1, 'predictions': 1}, optimizer=sgd,
                         metrics={'predictions': 'accuracy'})

t = time.time()
dummy1 = np.zeros((y_train.shape[0], 4096))
dummy2 = np.zeros((y_test.shape[0], 4096))

if not data_Augmentation:
    hist = custom_vgg_model.fit(x=X_train, y={'fc2': y_train, 'predictions': y_train},
                                batch_size=batch_Sizes, epochs=epoch_Times, verbose=1,
                                validation_data=(X_test, {'fc2': y_test, 'predictions': y_test}))
else:
    datagen = ImageDataGenerator(
        featurewise_center=False,
        samplewise_center=False,
        featurewise_std_normalization=False,
        samplewise_std_normalization=False,
        zca_whitening=False,
        rotation_range=20,
        width_shift_range=0.2,
        height_shift_range=0.2,
        horizontal_flip=True,
        vertical_flip=True,
        rescale=None,
        preprocessing_function=None,
        data_format=None)
    print('x_train.shape[0]:{:d}'.format(X_train.shape[0]))
    hist = custom_vgg_model.fit_generator(
        datagen.flow(x=X_train, y={'fc2': dummy1, 'predictions': y_train}, batch_size=batch_Sizes),
        steps_per_epoch=X_train.shape[0] / batch_Sizes, epochs=epoch_Times,
        verbose=1, validation_data=(X_test, {'fc2': y_test, 'predictions': y_test}))
```
```python
# lossclass.py
def _center_loss_func(labels, features, alpha, num_classes, centers, feature_dim):
    assert feature_dim == features.get_shape()[1]
    labels = K.argmax(labels, axis=1)
    labels = tf.to_int32(labels)
    centers_batch = K.gather(centers, labels)
    diff = (1 - alpha) * (centers_batch - features)
    centers = tf.scatter_sub(centers, labels, diff)
    centers_batch = K.gather(centers, labels)
    loss = K.mean(K.square(features - centers_batch))
    return loss


def get_center_loss(alpha, num_classes, feature_dim):
    """Center loss based on the paper "A Discriminative
    Feature Learning Approach for Deep Face Recognition"
    (http://ydwen.github.io/papers/WenECCV16.pdf)
    """
    # Each output layer uses one independent center: scope/centers
    centers = K.zeros([num_classes, feature_dim], dtype='float32')

    @functools.wraps(_center_loss_func)
    def center_loss(y_true, y_pred):
        return _center_loss_func(y_true, y_pred, alpha, num_classes, centers, feature_dim)
    return center_loss
```
My center loss is getting NaN; how should I fix it?
How can I use center loss in Keras?