keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

How to use a custom objective function for a model? #369

Closed: log0 closed this issue 9 years ago

log0 commented 9 years ago

New to Keras and DL, so I may be asking really basic questions... I'd appreciate it if someone could explain in simple terms. Thanks!

How do I use a custom objective function for a model? Is there any sample code I could use as a reference?

It seems models can only accept the string name of one of the pre-defined objective functions here: https://github.com/fchollet/keras/blob/master/keras/objectives.py

However, the docs (http://keras.io/models/) sound as if I could just plug in a custom function like score(true, pred): "loss: str (name of objective function) or objective function."

The source code also seems to suggest only strings are accepted: https://github.com/fchollet/keras/blob/master/keras/models.py#L240

mthrok commented 9 years ago

You almost answered the question yourself. Create your objective function like the ones in https://github.com/fchollet/keras/blob/master/keras/objectives.py, for example:

import theano
import theano.tensor as T

epsilon = 1.0e-9
def custom_objective(y_true, y_pred):
    '''Just another crossentropy'''
    y_pred = T.clip(y_pred, epsilon, 1.0 - epsilon)
    y_pred /= y_pred.sum(axis=-1, keepdims=True)
    cce = T.nnet.categorical_crossentropy(y_pred, y_true)
    return cce

and pass it to the loss argument of compile:

model.compile(loss=custom_objective, optimizer='adadelta')
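
For reference, the same loss can also be written against the keras.backend API so it runs on either backend; a minimal sketch, assuming Keras 2 (where K.categorical_crossentropy takes (target, output)):

from keras import backend as K

def custom_objective(y_true, y_pred):
    # clip away from 0 and 1, renormalize, then apply crossentropy, as above
    y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
    y_pred = y_pred / K.sum(y_pred, axis=-1, keepdims=True)
    return K.categorical_crossentropy(y_true, y_pred)

model.compile(loss=custom_objective, optimizer='adadelta')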

M-Taha commented 9 years ago

I am trying to write a new objective function like the following:

def expert_loss(y_true, y_pred):
    # y_pred is n-dimensional, y_true is n+1 dimensional.
    return T.sqr(T.dot(y_true[:-1].T, y_pred) - y_true[-1]).mean(axis=-1)

In this objective function the dimensions of y_true and y_pred are different, and Keras gives a shape-mismatch error. Is this a bug? Is there a way around it?

BrianMiner commented 8 years ago

@mthrok, this may be a naive question... but is creating a loss function all you need to do (although it looks like that is not simple if you are not familiar with Theano)? In backprop, isn't the gradient needed?

RagMeh11 commented 8 years ago

Hi, can you tell me how to check the dimensions of y_true and y_pred? I want to define an objective function that depends on the Dice coefficient instead of accuracy, since we are using the model for segmentation.
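
For reference: inside a loss, y_true and y_pred are symbolic tensors shaped like the model output, and in Keras 2 K.int_shape(y_pred) returns the static shape (with None for the batch dimension). A Dice-based objective might look like the following sketch, where the smoothing constant is an assumption to avoid division by zero on empty masks:

from keras import backend as K

smooth = 1.0  # hypothetical smoothing constant

def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_coef_loss(y_true, y_pred):
    # minimize 1 - Dice so better overlap gives a lower loss
    return 1. - dice_coef(y_true, y_pred)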

rgalhama commented 8 years ago

Hey, I also need to create a custom objective function, but in this case it should use the weights of the layer (instead of the output and prediction). Is there any workaround to access the weights? Thanks!

vadirajmkulkarni commented 8 years ago

@rgalhama To access the weights of any layer after compilation of the entire model you can use:
weights = model.layers[id].get_weights()  # list of numpy arrays

Not sure whether this will work at run time, though. If your goal is to add a loss component based on the weights of the neural network, you can also check the regularization options.
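
To use the weights symbolically inside the loss (rather than as numpy arrays), one option is a closure over the layer's weight tensor; a sketch, where the layer index and penalty are hypothetical:

from keras import backend as K

def make_weight_aware_loss(layer, penalty=0.01):
    W = layer.weights[0]  # the layer's kernel, as a symbolic tensor
    def loss(y_true, y_pred):
        return K.mean(K.square(y_pred - y_true), axis=-1) + penalty * K.sum(K.square(W))
    return loss

model.compile(loss=make_weight_aware_loss(model.layers[1]), optimizer='adam')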

gideonite commented 8 years ago

Is there a way of accessing the rest of the batch? I'm doing unsupervised learning and have a custom objective function which is "batch-wise": it depends on every example in the batch.
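
For reference, y_true and y_pred arrive in the loss as whole-batch tensors, so batch-wise statistics are available directly; a sketch of a loss built from the batch mean:

from keras import backend as K

def batchwise_loss(y_true, y_pred):
    # mean over the batch axis, broadcast back against every example
    batch_mean = K.mean(y_pred, axis=0, keepdims=True)
    return K.mean(K.square(y_pred - batch_mean), axis=-1)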

wddabc commented 8 years ago

Hi, it is great that the objective-function abstraction takes (y_true, y_pred) as input. I'm wondering whether there is a clean way to define the objective as a whole? It feels a little awkward to explicitly decompose my objective into y_pred and y_true just to suit the objective interface. In my case, I only need to pass an X into my model and directly optimize the objective, without considering y. A workaround is to define a function that ignores y_true and to pass a vector of ones during training, but I'm wondering whether there is a better way.
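
A sketch of the dummy-target workaround described above (objective_on is a hypothetical function of y_pred alone):

import numpy as np

def x_only_objective(y_true, y_pred):
    # y_true is ignored entirely; the objective depends only on y_pred
    return objective_on(y_pred)

model.compile(loss=x_only_objective, optimizer='adam')
model.fit(X, np.zeros((len(X), 1)))  # dummy targets, never used by the loss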

LeZhengThu commented 8 years ago

@mthrok If I want to write a custom objective function myself, where should I put it: in Keras's objectives.py, or directly in my own Python file? I tried your example code and put it in my Python file, but an error occurred: 'Exception: Invalid objective: custom_objective'

LeZhengThu commented 8 years ago

@mthrok I used compile(loss='custom_objective', ....). And if I use compile(loss=custom_objective, ...) just as in your example, Keras throws an error: 'NameError: global name 'custom_objective' is not defined'

luthfianto commented 8 years ago

@LeZhengThu loss=custom_objective should be okay. Are you sure you haven't misplaced or mistyped your custom objective function?

hadi-ds commented 8 years ago

This is slightly off topic, but how can I get the shape of the Theano tensor variables that go through a cost function?

I am trying to write a custom cost function for an auto-encoder I built. It is basically a generalization of 'mean_squared_error' to the case where, for each input vector, the output is also a vector rather than a single value. Mathematically, it is the following:

cost = sum_(i=1,...,N) {||x_i - x'_i||^2} / N, where x_i is a vector. I modified mean_squared_error as follows:

def vector_mse(X_true, X_pred):
    from keras import backend as K
    n_features = K.shape(X_true)[1]
    # 'axis=None' makes 'mean' go over all elements of the tensor,
    # so the result is multiplied by n_features to get the mean defined in the equation above
    return n_features * K.mean(K.square(X_pred - X_true), axis=None)

It fails on 'n_features = K.shape(X_true)[1]' because K.shape(X_true) returns 'Shape.0' instead of a tuple with the number of rows/columns.

I am not very familiar with the Theano backend, but I tried evaluating it as 'K.eval(K.shape(X_true))' and it doesn't work. Another thing I tried was to convert the tensors to numpy arrays, 'X_true_numpy = K.eval(X_true)', and go from there, but that also fails with the following error: raise MissingInputError("Undeclared input", variable) theano.gof.fg.MissingInputError: ('Undeclared input', sequential_2_target), as if X_true is not really assigned.

I appreciate any ideas on how to make this work.

shamidreza commented 7 years ago

I want to design a customized loss function that uses layer outputs in the loss calculation. For a hypothetical example, consider a 3-layer DNN: x -> h_1 -> h_2 -> y. Suppose that in addition to minimizing mse(y, y_pred) we want to minimize mse(h_1, h_2) (a crazy hypothetical). In Theano it is straightforward: cost = mse(y, y_pred) + mse(h_1, h_2). But with Keras, how do I access h_1/h_2 so I can work with them?

One solution that comes to mind is to create a new input/output pair, call them X/Y. Then define X = [y, h_1] and Y = [y_pred, h_2] by Merge (concatenate), and then build a new cost function that decouples the merged symbols and computes mse on each of them. Is this something that might work? Thanks, Hamid

17shasvatj commented 7 years ago

@M-Taha Did you ever resolve this issue involving shape mismatch? I have the same problem.

curiale commented 7 years ago

@shamidreza Have you found a cool way to do it? (I mean, avoiding the merge option)

shamidreza commented 7 years ago

@curiale Unfortunately, no. I had to go back to my beloved theano for that.

curiale commented 7 years ago

@shamidreza Thanks. I think that you should open a new issue about this.
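
For reference, later Keras versions expose model.add_loss, which attaches an extra symbolic loss term without merging outputs; a minimal sketch with hypothetical layer sizes:

from keras import backend as K
from keras.layers import Input, Dense
from keras.models import Model

inp = Input(shape=(10,))
h1 = Dense(8, activation='relu')(inp)
h2 = Dense(8, activation='relu')(h1)
out = Dense(1)(h2)

model = Model(inp, out)
model.add_loss(K.mean(K.square(h1 - h2)))    # the extra mse(h_1, h_2) term
model.compile(loss='mse', optimizer='adam')  # total loss = mse(y, y_pred) + mse(h_1, h_2)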

kobeee commented 7 years ago

@hadi-ds I ran into the same problem as yours; have you solved it?

hadi-ds commented 7 years ago

@kobeee I ended up writing that objective function in theano as follows:

def vector_mse(y_true, y_pred):
    from theano import tensor as T
    diff2 = (y_true - y_pred)**2
    return T.mean(T.sum(diff2, axis = -1))

hope this helps.
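
An equivalent backend-agnostic sketch of the same objective:

from keras import backend as K

def vector_mse(y_true, y_pred):
    # sum the squared differences over the feature axis, then average over the batch
    return K.mean(K.sum(K.square(y_true - y_pred), axis=-1))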

kobeee commented 7 years ago

@hadi-ds Thank you! I have solved the problem.

joetigger commented 7 years ago

I have a similar problem to @hadi-ds's, except that mine can't easily be solved with tensor functions. In my problem, the model output y_pred has shape (num_samples, 2), where y_pred[:,0] is the mean and y_pred[:,1] is the sigma. Since Keras doesn't have split, I followed the suggestion to write my objective function using Lambda:

def myloss(y_true, y_pred):
    mu = Lambda(lambda x: x[:, 0], output_shape=input_shape[:-1] + (1,))(y_pred)
    sigma = Lambda(lambda x: x[:, 1], output_shape=input_shape[:-1] + (1,))(y_pred)
    p = K.exp(-K.square((y_true - mu) / sigma) * 0.5) / sigma / np.sqrt(2 * np.pi) + K.epsilon()
    return K.mean(-K.log(p))

However, unlike a Layer, a custom loss function doesn't know the shape of its tensors, and based on issue #2801 it seems that Keras doesn't support getting a tensor's shape or splitting a tensor, so how can I implement my objective function?
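
For what it's worth, plain tensor slicing inside the loss may sidestep the Lambda machinery entirely; a sketch, assuming y_true carries one target value per sample:

import numpy as np
from keras import backend as K

def myloss(y_true, y_pred):
    mu = y_pred[:, 0]     # predicted mean
    sigma = y_pred[:, 1]  # predicted standard deviation
    p = K.exp(-K.square((y_true[:, 0] - mu) / sigma) * 0.5) / (sigma * np.sqrt(2 * np.pi)) + K.epsilon()
    return K.mean(-K.log(p))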

hkmztrk commented 7 years ago

Hello,

I'm trying to define my own metric,

def cindex_score(y_true, y_pred):
    sum = 0
    pair = 0
    for i in range(1, len(y_true)):
        for j in range(0, i):
            if i != j:
                if y_true[i] > y_true[j]:
                    pair += 1
                    sum += 1 * (y_pred[i] > y_pred[j]) + 0.5 * (y_pred[i] == y_pred[j])
    if pair != 0:
        return sum / pair
    else:
        return 0

But I get the following error: "object of type 'Tensor' has no len()". I know the Tensor object does not have len(), but the shape attribute does not work either.

For instance, y_true is represented as such Tensor("dense_4_target:0", shape=(?, ?), dtype=float32) and its shape is Tensor("strided_slice:0", shape=(), dtype=int32).

Could you please help me turn the code above into a runnable form?

danmoller commented 7 years ago

@joetigger, I managed to work around your problem (though I'm not sure whether your Lambdas will work; my case was solved with this). It's probably not the most expected solution, but it's the only thing I could do so far, and it works great :D

I needed the loss function to carry the result of some Keras layers, so I created those layers as an independent model and appended them to the end of the original model. The idea was to train the model using the already-processed output instead of the original output:

Creating the model for calculating the loss:

#it will be used also for processing your training and validation outputs

def createLossModel():

    inLay = Input(...)  # shape of your original output

    #use your lambdas here.....
    lay = AnyKerasLayer(...)(inLay)
    lay = AnyOtherKerasLayer(...)(lay)
    ...
    output = OneMoreLayer(...)(lay)

    m = Model(inLay, output)

    #it's important to make this model not trainable if it has weights
    #(you should probably set these weights manually if that's the case)
    m.trainable = False
    for l in m.layers: l.trainable = False

    return m

Now, let's create a function to join this model to the original model:

def appendLossModel(lossModel, appendToModel, loss=None):

    #create an input layer that matches your original model
    inLay = Input(...)  # shape of your original input
    origOut = appendToModel(inLay)
    lossOut = lossModel(origOut)

    m = Model(inLay, lossOut)

    #this is the model you're going to train, so it needs to be compiled
    if loss is not None:
        m.compile(optimizer='adam', loss=loss)

    return m

Now let's manage preparing our models for training:

originalModel = createYourOriginalModelHere()  # not necessary to compile
lossModel = createLossModel()                  # not necessary to compile

#joining models:
#here you can use the rest of your "myloss", the part with K.exp(-K.square((y_true-m.....
trainingModel = appendLossModel(lossModel, originalModel, loss=myloss)
#compiled inside the appendLossModel function

Training - Important: you're going to compare against different targets now, so process your training Y:

newY = lossModel.predict(originalY)
newValidationY = lossModel.predict(originalValidationY)

#there we go:
trainingModel.fit(originalX, newY, ..., validation_data=[originalValidationX, newValidationY], ...)

And finally, for the results:

results =  originalModel.predict(originalX)

I hope it helps :D

joetigger commented 7 years ago

@danmoller Thanks for the tip! Yes, your solution would help solve my problem. :+1:

InderpreetSinghChhabra01 commented 7 years ago

Hi, I need to define my own loss function. I am using a GAN model, and my loss should include both an adversarial loss and an L1 loss between the true and generated images. I tried to write a function but got the following error:

ValueError: ('Could not interpret loss function identifier:', Elemwise{add,no_inplace}.0)

my loss function is

def loss_function(y_true, y_pred, y_true1, y_pred1):
    bce = 0
    batch_size = 64

    for i in range(64):
        a = y_pred1[i]
        b = y_true1[i]
        x = K.log(a)
        bce = bce - x
    bce /= 64
    print('bce = ', bce)

    for i in zip(y_pred, y_true):
        img = i[0]
        image = np.zeros((64, 64), dtype=y_pred.dtype)
        image = img[0, :, :]
        image = image * 127.5 + 127.5
        imgfinal = Image.fromarray(image.astype(np.uint8))

        img1 = i[1]
        image1 = np.zeros((64, 64), dtype=y_true.dtype)
        image1 = img1[0, :, :]
        image1 = image1 * 127.5 + 127.5
        imgfinal1 = Image.fromarray(image1.astype(np.uint8))

        diff = ImageChops.difference(imgfinal, imgfinal1)

        h = diff.histogram()
        sq = (value * ((idx % 256) ** 2) for idx, value in enumerate(h))
        #sq = (value * (idx ** 2) for idx, value in enumerate(h))
        sum_of_squares = sum(sq)
        lossr = math.sqrt(sum_of_squares / float(im1.size[0] * im1.size[1]))
        loss = loss + lossr

    loss /= (64 * 127)
    print('loss = ', loss)

    return x + loss

Thanks in advance
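
For reference, compile() expects loss to be a callable taking exactly (y_true, y_pred) and returning a tensor; passing the result of calling the function (an Elemwise add node, as in the error above), or a four-argument function, cannot be interpreted. Extra tensors can be bound with a closure; a sketch with hypothetical names:

from keras import backend as K

def make_gan_loss(disc_out, l1_weight=1.0):
    # disc_out is a hypothetical discriminator-output tensor, closed over so the
    # returned function keeps the (y_true, y_pred) signature Keras expects
    def loss(y_true, y_pred):
        adversarial = -K.mean(K.log(disc_out + K.epsilon()))
        l1 = K.mean(K.abs(y_true - y_pred))
        return adversarial + l1_weight * l1
    return loss

model.compile(loss=make_gan_loss(disc_out), optimizer='adam')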

gaopinghai commented 7 years ago

@hkmztrk Have you solved this problem? It bothers me too!

hkmztrk commented 7 years ago

Hey @PingHGao, sorry for my late reply! Yeah, check this out! https://stackoverflow.com/questions/43576922/keras-custom-metric-iteration
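
The linked answer vectorizes the pairwise loops with tensor operations; a sketch along those lines (the small constant guards against division by zero when no usable pairs exist):

import tensorflow as tf

def cindex_score(y_true, y_pred):
    # flatten to 1-D so pairwise comparisons are well defined
    y_true = tf.reshape(y_true, [-1])
    y_pred = tf.reshape(y_pred, [-1])
    # entry (i, j) holds value[i] - value[j]
    true_diff = tf.expand_dims(y_true, -1) - tf.expand_dims(y_true, 0)
    pred_diff = tf.expand_dims(y_pred, -1) - tf.expand_dims(y_pred, 0)
    # usable pairs are those with y_true[i] > y_true[j]
    pairs = tf.cast(true_diff > 0, tf.float32)
    # a concordant pair counts 1, a tie on the prediction counts 0.5
    concordant = tf.cast(pred_diff > 0, tf.float32) + 0.5 * tf.cast(tf.equal(pred_diff, 0), tf.float32)
    return tf.reduce_sum(pairs * concordant) / (tf.reduce_sum(pairs) + 1e-8)

# usable as a metric: model.compile(..., metrics=[cindex_score])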

sajjo79 commented 6 years ago

Hi, I am trying to write a loss function that includes statements like this:

tmp = tf.zeros([20, 5, 240, 240])
ind = tf.where(y_true == 1)  # y_true is a tensor of the same shape as tmp
tmp = tf.assign(tmp[ind], 1)

but it is giving me an error. How can I do this?

hgaiser commented 6 years ago

tf.equal
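
Expanding on the hint: Python's == on a tensor doesn't build an elementwise comparison, and tensors can't be assigned into; a sketch, assuming y_true has the same shape as tmp:

import tensorflow as tf

tmp = tf.zeros([20, 5, 240, 240])
mask = tf.equal(y_true, 1)                    # elementwise comparison, unlike ==
tmp = tf.where(mask, tf.ones_like(tmp), tmp)  # 1 where y_true == 1, unchanged elsewhere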

mverzett commented 6 years ago

Dear experts,

I've been trying to find the answer online without success: what is the expected output shape of the tensor returned by a custom loss? Is it a single scalar value or a vector?

Thank you

Edit: I also noticed the following inconsistency in Keras 1 (I know, an old set-up):

>>> import numpy as np
>>> from keras import losses
>>> import tensorflow as tf
>>> from keras import backend as K
>>> s = tf.Session()
>>> x = np.random.rand(10,1)
>>> y = (np.random.rand(10,1) > 0.5).astype(float)
>>> losses.binary_crossentropy(tf.convert_to_tensor(y), tf.convert_to_tensor(x)).eval(session=s)
array([ 2.78712478,  1.3038832 ,  1.34750736,  1.20378335,  0.8925575 ,
        1.65690233,  0.54149211,  1.28658053,  0.30892665,  0.68163989])
>>> K.binary_crossentropy(tf.convert_to_tensor(y), tf.convert_to_tensor(x)).eval(session=s)
array([[ 15.1252521 ],
       [ 11.7424268 ],
       [ 11.92920796],
       [ 11.28175078],
       [  9.51601343],
       [ 13.04390931],
       [  6.7393083 ],
       [ 11.66605725],
       [  4.28363182],
       [  7.96577442]])

Can somebody explain the difference in behaviour and, most importantly, the reason for the different values?

Again, thanks!

immuno121 commented 6 years ago

@mverzett, I think there is an implementation-level difference. If we look at the code for both:

  1. losses.binary_crossentropy(y_true, y_pred) calls the same underlying function, K.binary_crossentropy, but takes the mean of the output along the last axis: https://github.com/keras-team/keras/blob/master/keras/losses.py#L76. Here is the link to K.binary_crossentropy: https://github.com/tensorflow/tensorflow/blob/r1.7/tensorflow/python/keras/_impl/keras/backend.py#L3413

I do not understand why this should make a difference in your case, though, as the last axis has only a single dimension.

emnajaoua commented 6 years ago

I am having trouble converting this function to Keras in order to calculate a custom loss:

def detect_blocks(x):
    outputs = []
    for i, row in enumerate(x):
        last_ele = row[0]
        for j, val in enumerate(row[1:]):
            if val == last_ele:
                continue
            outputs.append([i, j, last_ele])
            last_ele = val
        outputs.append([i, len(row) - 1, last_ele])
    return outputs

and this function as well:

def calculate_accuracy(l1, l2):  # should be rescaled !!!
    acc = 0
    cmp = 0
    j = 0
    i = 0
    len_l1 = len(l1)
    len_l2 = len(l2)
    initial_length = len_l1
    while i < len_l1:
        while j < len_l2:
            if np.array_equal(l1[i], l2[j]):
                cmp += 1
                l1.remove(l1[i])
                l2.remove(l2[j])
                len_l1 = len(l1)
                len_l2 = len(l2)
            elif abs(l1[i][2] - l2[j][2]) < neighborhood_constant:
                if (l1[i][0] == l2[j][0]) and (l1[i][1] == l2[j][1]):
                    cmp += 1
                    l1.remove(l1[i])
                    l2.remove(l2[j])
                    len_l1 = len(l1)
                    len_l2 = len(l2)
                    j = 0
            j += 1
        i += 1
    acc = cmp / initial_length
    return acc

and I want to call those functions to calculate the loss using this code:

#additional loss
x_test_normalized = tf.round(36 * inputs)
x_decoded_normalized = tf.round(36 * outputs)
acc_rectangles = 0

elem_1 = tf.map_fn(lambda x: (x), x_test_normalized, dtype=(tf.float32))
elem_2 = tf.map_fn(lambda x: (x), x_decoded_normalized, dtype=(tf.float32))

rectanglesL1 = detect_blocks(tf.reshape(x_test_normalized, [6, 6]))
rectanglesL2 = detect_blocks(tf.reshape(x_decoded_normalized, [6, 6]))
acc = calculate_accuracy(rectanglesL1, rectanglesL2)
acc_rectangles += acc
additional_loss = acc_rectangles / len(inputs)

So basically, I am using a VAE and want to integrate this additional loss, but I get this error:

---> 14 rectanglesL1 = detect_blocks(tf.reshape(x_test_normalized, [6,6]))
     15 rectanglesL2 = detect_blocks(tf.reshape(x_decoded_normalized,[6,6]))
     16 acc = calculate_accuracy(rectanglesL1,rectanglesL2)

in detect_blocks(x)
      1 def detect_blocks(x):
      2     outputs=[]
----> 3     for i, row in enumerate(x):
      4         last_ele=row[0]
      5         for j, val in enumerate(row[1:]):

/opt/aiml4it/anaconda/3-5.2.0-generic/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in __iter__(self)
    434     if not context.executing_eagerly():
    435         raise TypeError(
--> 436             "Tensor objects are not iterable when eager execution is not "
    437             "enabled. To iterate over this tensor use tf.map_fn.")
    438     shape = self._shape_tuple()

TypeError: Tensor objects are not iterable when eager execution is not enabled. To iterate over this tensor use tf.map_fn.

MingleiLI commented 5 years ago

(quoting @mthrok's answer above)

@mthrok What is the requirement on the inputs y_pred and y_true? Should they come from logits, or from a softmax that behaves as a probability, as in https://github.com/tensorflow/tensorflow/blob/r1.11/tensorflow/python/keras/backend.py?

MatthiasWinkelmann commented 5 years ago

@emnajaoua

Your question (and, tbh, all the others) is better suited for Stack Overflow, considering it isn't a bug in Keras but a question regarding its use.

You are also more likely to get useful answers by investing a modicum of time to at least properly format your post. It's currently lacking line breaks and respect for the reader.

That being said, I'll give you the hint that Tensorflow was giving you a hint:

TypeError: Tensor objects are not iterable when eager execution is not enabled. To iterate over this tensor use tf.map_fn.

That's a pretty useful error message. Did you try tf.map_fn? Because it would do what you're trying to do.

The larger problem, however, is that you haven't grokked how tensorflow actually works. https://www.tensorflow.org/guide/graphs might be a good introduction. Specifically, getting a tensor's length does not make much sense because a tensor's length/shape is fixed at compile time.

The process of implementing a custom loss isn't fundamentally different from doing so in standard Python. It's just that you cannot manipulate tensors with the Python stdlib you are used to. See the TensorFlow documentation for the operations that are defined on tensors (such as tf.reduce_mean()). That's the toolbox you are working with. It also helps to read some of the Keras source code to get a feel for how such things are done.