longzeyilang opened this issue 6 years ago
Hey, Longzeyilang! You are right! Dunno how I missed that. Thank you for pointing that out! I am out of town for a few days, so I won't be able to make changes for now. Could you please raise a PR for this once you test the code?
@akarshzingade Thank you! I still have a question about the objective function from the original paper. You are missing the λ||W||₂² term in the objective function, but I still don't know what W is or where it comes from?
@longzeyilang That is the regularisation term added to the loss function. 'W' is the weights of the whole network (formally, all the parameters of the network).
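If I remember the paper correctly, the full objective is roughly the per-triplet hinge loss plus that regularisation term. A sketch (here f is the embedding network, D the distance between embeddings, g the margin, and q/p/n the query, positive and negative images):

```latex
\min_{W} \;\; \sum_{i} \max\!\Bigl(0,\; g + D\bigl(f(q_i), f(p_i)\bigr) - D\bigl(f(q_i), f(n_i)\bigr)\Bigr) \;+\; \lambda \lVert W \rVert_2^2
```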
@akarshzingade So that means you forgot to add the regularisation to the loss function? Or do I add it as an extra parameter, like this:

```python
first_conv = Conv2D(96, kernel_size=(8, 8), strides=(16, 16), padding='same',
                    kernel_regularizer=l2(0.01))(first_input)
```

Or how should I add it?
I didn't forget. I was experimenting with how the model works without regularisation back then. It performed pretty decently.
Ideally, you would add it in the loss. But in Keras, you add it to the layers using the `kernel_regularizer` parameter, like you've mentioned. As far as I remember, the authors use the squared L2 norm for the regularisation; please check the paper once, I am still travelling and can't check it right now. Keras offers L2 regularisation.
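Something like this should work. A minimal sketch; the input shape and the 0.01 weight are placeholders I picked for illustration, not values from the repo or the paper:

```python
from keras.layers import Input, Conv2D
from keras.regularizers import l2

# Placeholder input shape, just for illustration.
first_input = Input(shape=(224, 224, 3))

# keras.regularizers.l2(0.01) adds 0.01 * sum(W ** 2) for this layer's
# kernel to the total loss, i.e. the lambda * ||W||_2^2 term from the
# paper, applied per regularised layer.
first_conv = Conv2D(96, kernel_size=(8, 8), strides=(16, 16), padding='same',
                    kernel_regularizer=l2(0.01))(first_input)
```

Keras collects each layer's term automatically and adds it to whatever loss you compile the model with, so nothing changes in the custom loss function itself.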
@akarshzingade Thank you! Have a nice trip
@longzeyilang will you be making a pull request with your version of the loss function? I have found that yours produces better results. Thank you both!
@cesarandreslopez please see above
Hi @akarshzingade, so which one is the correct loss? And is `K.clip(y_pred, _EPSILON, 1.0 - _EPSILON)` necessary?
Hi, I think there is some problem with your triplet loss function. Here is your function:

```python
import tensorflow as tf
from keras import backend as K

_EPSILON = K.epsilon()

def _loss_tensor(y_true, y_pred):
    y_pred = K.clip(y_pred, _EPSILON, 1.0 - _EPSILON)
    loss = tf.convert_to_tensor(0, dtype=tf.float32)
    g = tf.constant(1.0, shape=[1], dtype=tf.float32)
    # batch_size is a global; the batch is laid out as consecutive
    # (query, positive, negative) triplets
    for i in range(0, batch_size, 3):
        try:
            q_embedding = y_pred[i + 0]
            p_embedding = y_pred[i + 1]
            n_embedding = y_pred[i + 2]
            D_q_p = K.sqrt(K.sum((q_embedding - p_embedding) ** 2))
            D_q_n = K.sqrt(K.sum((q_embedding - n_embedding) ** 2))
            loss = loss + g + D_q_p - D_q_n
        except:
            continue
    loss = loss / (batch_size / 3)
    zero = tf.constant(0.0, shape=[1], dtype=tf.float32)
    return tf.maximum(loss, zero)
```

I think D_q_p - D_q_n may be less than 0, which makes the total loss incorrect: because the maximum is only taken after summing, triplets with a negative margin cancel out triplets with a positive one. The hinge should be applied per triplet instead, so it should be defined as follows:

```python
def triplet_loss(y_true, y_pred):
    total_loss = tf.convert_to_tensor(0, dtype=tf.float32)
    g = tf.constant(1.0, shape=[1], dtype=tf.float32)
    zero = tf.constant(0.0, shape=[1], dtype=tf.float32)
    for i in range(0, batch_size, 3):
        try:
            q_embedding = y_pred[i + 0]
            p_embedding = y_pred[i + 1]
            n_embedding = y_pred[i + 2]
            D_q_p = K.sqrt(K.sum((q_embedding - p_embedding) ** 2))
            D_q_n = K.sqrt(K.sum((q_embedding - n_embedding) ** 2))
            # hinge per triplet: a negative margin contributes 0
            loss = tf.maximum(g + D_q_p - D_q_n, zero)
            total_loss = total_loss + loss
        except:
            continue
    total_loss = total_loss / (batch_size / 3)
    return total_loss
```
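By the way, the same per-triplet hinge can be written without the Python loop and the try/except. A minimal vectorised sketch, assuming the batch really is laid out as consecutive (query, positive, negative) rows and reusing g = 1.0 as the margin:

```python
import tensorflow as tf
from keras import backend as K

def triplet_loss_vectorised(y_true, y_pred, g=1.0):
    # Slice the batch into query, positive and negative embeddings.
    q = y_pred[0::3]
    p = y_pred[1::3]
    n = y_pred[2::3]
    # Euclidean distance per triplet.
    D_q_p = K.sqrt(K.sum(K.square(q - p), axis=1))
    D_q_n = K.sqrt(K.sum(K.square(q - n), axis=1))
    # Hinge per triplet, then mean over all triplets in the batch.
    return K.mean(K.maximum(g + D_q_p - D_q_n, 0.0))
```

`triplet_loss_vectorised` is just an illustrative name; the batch size must still be a multiple of 3 for the slicing to line up.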