longzeyilang opened this issue 6 years ago
Hey, Longzeyilang! You are right! Dunno how I missed that. Thank you for pointing that out! I am out of town for a few days, so I won't be able to make changes for now. Could you please raise a PR for this once you test the code?
@akarshzingade Thank you! I still have a question about the objective function from the original paper. You are missing the λ||W||₂² term in the objective function, but I still don't know what W is or where it comes from?
@longzeyilang That is the regularisation term added to the loss function. 'W' is the weights of the whole network (formally, all the parameters of the network).
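If I remember the paper correctly, the full objective is roughly the per-triplet hinge loss plus that regularisation term. A sketch (here f is the embedding network, D the distance between embeddings, g the margin, and q/p/n the query, positive and negative images):

```latex
\min_{W} \;\; \sum_{i} \max\!\Bigl(0,\; g + D\bigl(f(q_i), f(p_i)\bigr) - D\bigl(f(q_i), f(n_i)\bigr)\Bigr) \;+\; \lambda \lVert W \rVert_2^2
```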
@akarshzingade So that means you forgot to add the regularisation to the loss function? Or do I add it as an extra parameter, like this:

```python
first_conv = Conv2D(96, kernel_size=(8, 8), strides=(16, 16), padding='same',
                    kernel_regularizer=l2(0.01))(first_input)
```

Or how should I add it?
I didn't forget. I was experimenting with how the model works without regularisation back then. It performed pretty decently.
Ideally, you would add it in the loss. But in Keras, you add it to the layers using the `kernel_regularizer` parameter, like you've mentioned. As far as I remember, the authors use the squared L2 norm for the regularisation; please check the paper once, I am still travelling and can't check it right now. Keras offers L2 regularisation.
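Something like this should work. A minimal sketch; the input shape and the 0.01 weight are placeholders I picked for illustration, not values from the repo or the paper:

```python
from keras.layers import Input, Conv2D
from keras.regularizers import l2

# Placeholder input shape, just for illustration.
first_input = Input(shape=(224, 224, 3))

# keras.regularizers.l2(0.01) adds 0.01 * sum(W ** 2) for this layer's
# kernel to the total loss, i.e. the lambda * ||W||_2^2 term from the
# paper, applied per regularised layer.
first_conv = Conv2D(96, kernel_size=(8, 8), strides=(16, 16), padding='same',
                    kernel_regularizer=l2(0.01))(first_input)
```

Keras collects each layer's term automatically and adds it to whatever loss you compile the model with, so nothing changes in the custom loss function itself.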
@akarshzingade Thank you! Have a nice trip
@longzeyilang will you be making a pull request with your version of the loss function? I have found that yours produces better results. Thank you both!
@cesarandreslopez please see above
Hi @akarshzingade, so which one is the correct loss? And is `K.clip(y_pred, _EPSILON, 1.0 - _EPSILON)` necessary?
Hi, I think there is some problem with your triplet loss function. Here is your function:

```python
import tensorflow as tf
from keras import backend as K

_EPSILON = K.epsilon()

def _loss_tensor(y_true, y_pred):
    y_pred = K.clip(y_pred, _EPSILON, 1.0 - _EPSILON)
    loss = tf.convert_to_tensor(0, dtype=tf.float32)
    g = tf.constant(1.0, shape=[1], dtype=tf.float32)
    # batch_size is a global; the batch is laid out as consecutive
    # (query, positive, negative) triplets
    for i in range(0, batch_size, 3):
        try:
            q_embedding = y_pred[i + 0]
            p_embedding = y_pred[i + 1]
            n_embedding = y_pred[i + 2]
            D_q_p = K.sqrt(K.sum((q_embedding - p_embedding) ** 2))
            D_q_n = K.sqrt(K.sum((q_embedding - n_embedding) ** 2))
            loss = loss + g + D_q_p - D_q_n
        except:
            continue
    loss = loss / (batch_size / 3)
    zero = tf.constant(0.0, shape=[1], dtype=tf.float32)
    return tf.maximum(loss, zero)
```

I think D_q_p - D_q_n may be less than 0, which makes the total loss incorrect: because the maximum is only taken after summing, triplets with a negative margin cancel out triplets with a positive one. The hinge should be applied per triplet instead, so it should be defined as follows:

```python
def triplet_loss(y_true, y_pred):
    total_loss = tf.convert_to_tensor(0, dtype=tf.float32)
    g = tf.constant(1.0, shape=[1], dtype=tf.float32)
    zero = tf.constant(0.0, shape=[1], dtype=tf.float32)
    for i in range(0, batch_size, 3):
        try:
            q_embedding = y_pred[i + 0]
            p_embedding = y_pred[i + 1]
            n_embedding = y_pred[i + 2]
            D_q_p = K.sqrt(K.sum((q_embedding - p_embedding) ** 2))
            D_q_n = K.sqrt(K.sum((q_embedding - n_embedding) ** 2))
            # hinge per triplet: a negative margin contributes 0
            loss = tf.maximum(g + D_q_p - D_q_n, zero)
            total_loss = total_loss + loss
        except:
            continue
    total_loss = total_loss / (batch_size / 3)
    return total_loss
```
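By the way, the same per-triplet hinge can be written without the Python loop and the try/except. A minimal vectorised sketch, assuming the batch really is laid out as consecutive (query, positive, negative) rows and reusing g = 1.0 as the margin:

```python
import tensorflow as tf
from keras import backend as K

def triplet_loss_vectorised(y_true, y_pred, g=1.0):
    # Slice the batch into query, positive and negative embeddings.
    q = y_pred[0::3]
    p = y_pred[1::3]
    n = y_pred[2::3]
    # Euclidean distance per triplet.
    D_q_p = K.sqrt(K.sum(K.square(q - p), axis=1))
    D_q_n = K.sqrt(K.sum(K.square(q - n), axis=1))
    # Hinge per triplet, then mean over all triplets in the batch.
    return K.mean(K.maximum(g + D_q_p - D_q_n, 0.0))
```

`triplet_loss_vectorised` is just an illustrative name; the batch size must still be a multiple of 3 for the slicing to line up.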