If you want to use a loss function that is not of the form f(y_true, y_pred), then you have to implement your training routine outside of Keras.
Basically:
1) define your model (typically using the functional API)
2) define your custom cost
3) instantiate an optimizer, get weight updates via:
`updates = optimizer.get_updates(model.trainable_weights, model.constraints, cost)`
4) take care manually of regularizers and batchnorm updates
5) create your own Keras functions based on the inputs, outputs, and updates
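To make the recipe concrete, here is a minimal sketch under the API quoted above (the Keras-1-era `get_updates(params, constraints, loss)` signature); the toy model, shapes, learning rate, and batch arrays are placeholder assumptions:

```python
import keras.backend as K
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import SGD

# 1) a toy model built with the functional API
inp = Input(shape=(10,))
out = Dense(1, activation='sigmoid')(inp)
model = Model(inp, out)

# 2) a custom scalar cost; it can depend on any tensors,
#    not just a (y_true, y_pred) pair
y_true = K.placeholder(shape=(None, 1))
cost = K.mean(K.square(model.output - y_true))

# 3) ask an optimizer for the weight updates
optimizer = SGD(lr=0.01)
updates = optimizer.get_updates(model.trainable_weights, model.constraints, cost)

# 4) fold in stateful layer updates (e.g. batchnorm moving averages)
updates += model.updates

# 5) a callable training step: feeds inputs, returns the cost, applies updates
train_step = K.function([model.input, y_true], [cost], updates=updates)

# hypothetical batch arrays; each call performs one gradient step
# loss_value = train_step([x_batch, y_batch])
```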
Thank you for your quick reply. I'll try (3), (4), and (5).
> 3) instantiate an optimizer, get weight updates via: `updates = optimizer.get_updates(model.trainable_weights, model.constraints, cost)`

> 5) create your own Keras functions based on the inputs, outputs, and updates

I found that I should do it like https://github.com/fchollet/keras/blob/master/keras/engine/training.py#L649-670.

> 4) take care manually of regularizers and batchnorm updates

I can't understand what you mean. Please give me some examples.
If you don't have regularizers or batchnorm layers you can ignore this. Otherwise, you need to fold the regularization losses into your cost (see the `compile` method for details) and add `model.updates` to your optimizer-generated updates. Again, `compile` covers everything you need to know.
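A hedged sketch of what that amounts to, continuing from the update-building snippet above (`model.losses` is the Keras 2 attribute name and is an assumption here; older versions expose regularizers differently):

```python
# fold per-layer regularization penalties into the custom cost,
# mirroring what Model.compile does internally
for reg_loss in model.losses:
    cost = cost + reg_loss

# batchnorm layers keep their moving-average updates in model.updates,
# which must be appended to the optimizer-generated updates
updates += model.updates
```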
Thank you! I'll check `compile`.
@henry0312 Have you figured it out? I'm currently facing the same problem and haven't found a way to implement a cost function that takes multiple arguments (with different shapes) as inputs. Could you please provide an example of how you did it?
@ScartleRoy No, it's difficult for me to achieve this with `fit` or `fit_generator` 😔 I'll try more.
@henry0312 Maybe we need some help from additional official documentation. I think there must be many people who also need to design cost functions that differ from the default ones in Keras. @fchollet Sorry to disturb you again, but would it be possible for Keras to provide some documentation about how to write custom loss functions with different forms?
Any update to this?
(Copying the answer I posted on Stack Overflow: http://stackoverflow.com/questions/33859864/how-to-create-custom-objective-function-in-keras/40622302#40622302)
Here is my small snippet to write new loss functions and test them before using them:
```python
import numpy as np
from keras import backend as K

_EPSILON = K.epsilon()

def _loss_tensor(y_true, y_pred):
    # backend (tensor) version of binary cross entropy
    y_pred = K.clip(y_pred, _EPSILON, 1.0 - _EPSILON)
    out = -(y_true * K.log(y_pred) + (1.0 - y_true) * K.log(1.0 - y_pred))
    return K.mean(out, axis=-1)

def _loss_np(y_true, y_pred):
    # NumPy reference version of the same loss
    y_pred = np.clip(y_pred, _EPSILON, 1.0 - _EPSILON)
    out = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return np.mean(out, axis=-1)

def check_loss(_shape):
    if _shape == '2d':
        shape = (6, 7)
    elif _shape == '3d':
        shape = (5, 6, 7)
    elif _shape == '4d':
        shape = (8, 5, 6, 7)
    elif _shape == '5d':
        shape = (9, 8, 5, 6, 7)

    y_a = np.random.random(shape)
    y_b = np.random.random(shape)

    out1 = K.eval(_loss_tensor(K.variable(y_a), K.variable(y_b)))
    out2 = _loss_np(y_a, y_b)

    # both versions must agree in shape and (almost exactly) in value
    assert out1.shape == out2.shape
    assert out1.shape == shape[:-1]
    print(np.linalg.norm(out1))
    print(np.linalg.norm(out2))
    print(np.linalg.norm(out1 - out2))

def test_loss():
    shape_list = ['2d', '3d', '4d', '5d']
    for _shape in shape_list:
        check_loss(_shape)
        print('======================')

if __name__ == '__main__':
    test_loss()
```
Here, as you can see, I am testing the binary_crossentropy loss with two separate implementations: a NumPy version (`_loss_np`) and a tensor version (`_loss_tensor`). (Note: if you stick to the Keras backend functions, the loss will work with both Theano and TensorFlow; but if you depend on one of them, you can also reference them by K.theano.tensor.function or K.tf.function.)
Then I compare the output shapes, the L2 norms of the two outputs (which should be almost equal), and the L2 norm of the difference (which should be close to 0).
Once you are satisfied that your loss function is working properly, you can use it as:
`model.compile(loss=_loss_tensor, optimizer=sgd)`
@indraforyou thanks for the snippet. Your function gives loss values, but how can we specify the gradients of a custom loss function for backpropagation?
Edit: I got the answer from keras-users group. Thanks "Klemen Grm" https://groups.google.com/forum/#!searchin/keras-users/loss$20gradients|sort:relevance/keras-users/9KHTdpQ_Rno/0p3tH_-FEgAJ
If you look at the source file for builtin objective functions ( https://github.com/fchollet/keras/blob/master/keras/objectives.py ), notice they're all implemented as Theano functions, which enables automatic gradient calculation. This must also be the case for any custom objective function you implement yourself.
@janiteja: Yes, that's one of the benefits of using Theano/TensorFlow and libraries built on top of them: they give you automatic gradient calculation for mathematical functions and operations.
Keras gets them by calling:

```python
# keras/theano_backend.py
def gradients(loss, variables):
    return T.grad(loss, variables)

# keras/tensorflow_backend.py
def gradients(loss, variables):
    '''Returns the gradients of `variables` (list of tensor variables)
    with regard to `loss`.
    '''
    return tf.gradients(loss, variables, colocate_gradients_with_ops=True)
```
which are in turn called by the optimizers (keras/optimizers.py) to get the update rules for the tensor graph.
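As a rough illustration of how an optimizer consumes them, here is a plain-SGD sketch (not the actual keras/optimizers.py code; `loss` and `params` are assumed from context, and `K.update_sub` is the Keras 2 backend helper):

```python
# one gradient tensor per trainable parameter
lr = 0.01
grads = K.gradients(loss, params)

# plain SGD update rule: p <- p - lr * g
updates = [K.update_sub(p, lr * g) for p, g in zip(params, grads)]
```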
The only time you need to write a new gradient is when you are defining a new basic mathematical operation/function; see these links for that: http://deeplearning.net/software/theano/extending/extending_theano.html https://www.tensorflow.org/versions/r0.12/how_tos/adding_an_op/index.html
@henry0312 Have you figured out how to code this? If you have, can you please post a code snippet? I have a similar need for a custom loss that takes three inputs.
There is an example that covers multiple additional arguments in image_ocr.py; link to a relevant part of it (actual loss function is defined here though)
For anyone else who arrives here by searching for "keras ranknet", you don't need to use a custom loss function to implement RankNet in Keras. The cost function as described in the paper is simply the binary cross entropy where the predicted probability is the probability that the more relevant document will be ranked higher than the less relevant document. The "trick" for implementing RankNet in Keras is making the input to the final sigmoid layer (which generates the predicted probability) the difference between the scores of the two documents (scores that are generated by the same net). My (slightly modified) Keras implementation of RankNet can be found here.
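A minimal sketch of that trick (the layer sizes and the 50-dimensional feature input are made up for illustration, and the `Subtract` layer assumes Keras >= 2.0.7; see the linked implementation for the real thing):

```python
from keras.layers import Activation, Dense, Input, Subtract
from keras.models import Model

# shared scoring network, applied to both documents in a pair
doc_input = Input(shape=(50,))
hidden = Dense(64, activation='relu')(doc_input)
score = Dense(1)(hidden)
scorer = Model(doc_input, score)

# the predicted probability comes from the difference of the two scores
relevant = Input(shape=(50,))
irrelevant = Input(shape=(50,))
diff = Subtract()([scorer(relevant), scorer(irrelevant)])
prob = Activation('sigmoid')(diff)  # P(relevant ranked above irrelevant)

ranknet = Model([relevant, irrelevant], prob)
# plain binary cross entropy; targets are all ones for (relevant, irrelevant) pairs
ranknet.compile(optimizer='adam', loss='binary_crossentropy')
```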
I read through the available loss functions in Keras:
https://github.com/fchollet/keras/blob/master/keras/losses.py
But I am not sure how to extend a loss function, for example by adding a regularization term or additional elements.
I checked the model compile definition as below:
https://github.com/fchollet/keras/blob/master/keras/models.py#L742
But I am not sure how Keras handles the loss function...
Could anyone provide further references?
Here is what I am thinking of:

```python
def myloss(y_true, y_pred, weights):
    return K.mean(K.square(y_pred - y_true), axis=-1) + K.sum(l2 * K.square(W))
```
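One hedged way to get that effect while keeping the (y_true, y_pred) signature Keras expects is to close over the weight tensor; `dense_layer`, the `l2` coefficient, and the `.kernel` attribute (from Keras 2 Dense layers) are assumptions for illustration:

```python
from keras import backend as K

def make_loss(weight_tensor, l2=0.01):
    # the returned closure has the (y_true, y_pred) signature Keras expects,
    # but also sees the captured weight tensor
    def loss(y_true, y_pred):
        mse = K.mean(K.square(y_pred - y_true), axis=-1)
        return mse + l2 * K.sum(K.square(weight_tensor))
    return loss

# e.g. penalize the kernel of one Dense layer (hypothetical layer name)
# model.compile(optimizer='sgd', loss=make_loss(dense_layer.kernel))
```

Note that for a plain L2 penalty, passing `kernel_regularizer` to the layer itself is the more idiomatic route.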
@xiaoleihuang Did you find a solution for your custom loss implementation? I am trying to implement a similar custom loss and I am not sure how.
Hi all, I have been reading through this and other similar issues and still haven't been able to implement my custom loss function.
What I have is a multilabel problem with 4 input time series and 7 possible labels at each time step. To attempt to solve it, I stacked a couple of LSTM layers followed by a TimeDistributed(Dense) layer, so that there is a classification for each time step. Input dimensions: (timesteps=200, features=4). Output dimensions: (timesteps=200, n_outputs=7).
I want to implement the loss function used in this article, where the loss is a convex combination of the final loss (time step = 200) and the average of the losses over all steps. I've tried quite a few approaches, but none have worked:
Essentially I would like something like this to work:
```python
def custom_loss(y_true, y_pred):
    # tensors have shape (batch, timesteps=200, n_outputs=7)
    alpha = 0.1
    # average of the per-step losses over all time steps
    loss1 = K.mean(K.binary_crossentropy(y_true, y_pred))
    # loss at the final time step only (index -1, i.e. step 200)
    loss2 = K.mean(K.binary_crossentropy(y_true[:, -1, :], y_pred[:, -1, :]))
    return alpha * loss1 + (1 - alpha) * loss2
```
I appreciate any help I can get.
As @airalcorn2 noted above, RankNet can be implemented with vanilla binary cross entropy in Keras. See my example: http://www.eggie5.com/130-learning-to-rank-siamese-network-pairwise-data
@indraforyou @fchollet in the loss, you take K.mean on axis=-1. This mean is taken across what?
It means you average over the last axis (K.mean takes the mean, not the sum).
@xiaoleihuang Did you solve your issue? I have the same issue as you.
@indraforyou thanks for the snippet. What about losses.categorical_crossentropy for multi-class classification rather than binary cross entropy? Thanks.
This issue is the only thing that is keeping me from switching from low-level TensorFlow to Keras. But I see that upcoming TF versions will move to Keras anyway, so I'm looking for a way to implement complex models without Keras assuming I'm doing plain classification (like WGAN, or GANs that use other networks in their loss, etc.).
I found this article that could be helpful to some of you: https://towardsdatascience.com/advanced-keras-constructing-complex-custom-losses-and-metrics-c07ca130a618
Basically, you wrap your loss(true, pred) function in a function with an arbitrary number of parameters (which, I suppose, may be tensors or whatever you want) and return a loss with the required signature.
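For example, a minimal sketch of the wrapper pattern the article describes (the loss and the `alpha` hyperparameter are illustrative):

```python
from keras import backend as K

def weighted_mse(alpha):
    # the outer function takes arbitrary extra parameters...
    def loss(y_true, y_pred):
        # ...while the inner function has the (y_true, y_pred)
        # signature that Keras requires
        return alpha * K.mean(K.square(y_pred - y_true), axis=-1)
    return loss

# model.compile(optimizer='adam', loss=weighted_mse(0.5))
```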
I can't understand why Keras focuses its API only on "classical" classification problems. Just think of a WGAN discriminator, which has an unbounded output that you try to maximize. In the standard GAN formulation the target would be either zero or one, but what would the target be in the WGAN case?
I'm sure there is a workaround, but I cannot see why things should be complicated only to have a "black box" training function with a simple signature.
I'm going to implement RankNet, and I find I'll need my own loss function (cf. eq. (1) in the paper).
The loss function takes two arguments (neither y_true nor y_pred; cf. http://keras.io/objectives/) and returns one output. How do I implement it? Can I do it in the first place?
References
http://research.microsoft.com/en-us/um/people/cburges/papers/ICML_ranking.pdf (cf. eq. (1))
http://research.microsoft.com/pubs/132652/MSR-TR-2010-82.pdf (cf. section 2)