questions on the loss value and dice_coeff during the training process

EdwardTyantov / ultrasound-nerve-segmentation

Kaggle Ultrasound Nerve Segmentation competition [Keras]


questions on the loss value and dice_coeff during the training process #8

Closed wenouyang closed 7 years ago

wenouyang commented 7 years ago

Hi Edward,

When training on some datasets, I find that the loss computed for each epoch is negative and keeps decreasing from epoch to epoch, while dice_coef keeps increasing. Is this the right trend? For instance:

Epoch 5/50
5250/5250 [==============================] - 284s - loss: -0.7539 - dice_coef: 0.7539
Epoch 6/50
5250/5250 [==============================] - 283s - loss: -0.7808 - dice_coef: 0.7808
Epoch 7/50
5250/5250 [==============================] - 283s - loss: -0.7891 - dice_coef: 0.7891
Epoch 8/50
5250/5250 [==============================] - 283s - loss: -0.8074 - dice_coef: 0.8074
Epoch 9/50

At the same time, for some other datasets, the loss value is positive. I am curious why the loss is negative for some datasets but positive for others. What is the best way to judge whether training is going well, based on how loss and dice_coef evolve from epoch to epoch?

Thank you very much.

ouyang

EdwardTyantov commented 7 years ago

Hi, ouyang. The answer is simple: the loss function is the negative of the Dice coefficient. The motivation is that gradient descent minimizes a loss function, so maximizing the Dice coefficient means minimizing its negative. You can see more details in metric.py:

    def dice_coef_loss(y_true, y_pred):
        return -dice_coef(y_true, y_pred)

Dice coefficient cannot be negative by definition.
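The relationship Edward describes can be sketched in plain NumPy. This is only an illustration, not the repository's metric.py: the real code operates on Keras tensors, and the `smooth` term and its default of 1.0 are assumptions added here to avoid division by zero on empty masks.

```python
import numpy as np

def dice_coef(y_true, y_pred, smooth=1.0):
    """Dice overlap between two masks; `smooth` is an assumed stabilizer."""
    y_true_f = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred_f = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (
        np.sum(y_true_f) + np.sum(y_pred_f) + smooth
    )

def dice_coef_loss(y_true, y_pred):
    # Minimizing the negative Dice coefficient maximizes the overlap.
    return -dice_coef(y_true, y_pred)

mask = np.array([[0, 1, 1],
                 [0, 1, 0]])
perfect = mask.copy()   # prediction identical to the ground truth
disjoint = 1 - mask     # prediction with no overlap at all

print(dice_coef(mask, perfect), dice_coef_loss(mask, perfect))
print(dice_coef(mask, disjoint), dice_coef_loss(mask, disjoint))
```

With non-negative masks and predictions, every term in the ratio is non-negative, so the coefficient stays at or above zero and this loss stays at or below zero. A loss that moves toward -1 while dice_coef moves toward 1, as in the log above, is what this definition produces.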

wenouyang commented 7 years ago

Hi Edward, thanks a lot. I missed that part.

I have a follow-up question about the same training process. In the prediction phase, I have 600 images. Instead of grouping them into a single npy file as the program does, I run prediction on each image separately by iterating over the image folders. The prediction code is the same: imgs_mask_predict = model.predict(imgs, verbose=1). Here imgs has shape (1,1,128,128), representing a single image. After each prediction step, I print the type of imgs_mask_predict using print(imgs_mask_predict.dtype). However, I noticed that for some predicted images the type is float32, while for others it is float64. Why is there this difference?

EdwardTyantov commented 7 years ago

Some magic involved, obviously :) I have no idea what happened. Are all input arrays in the same format? I used only built-in layers, so you should redirect this question to the Keras GitHub.
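One way to answer the question about input formats is to print the dtype of each array before it reaches model.predict and cast everything to one fixed type. The sketch below uses only NumPy. The helper name `to_model_input` is hypothetical, the choice of float32 is an assumption, and the two sample arrays are made up for illustration. Nothing in this thread establishes that mixed input dtypes cause the mixed output dtypes, so treat it as a diagnostic step rather than a fix.

```python
import numpy as np

def to_model_input(img, dtype=np.float32):
    """Hypothetical helper: return a 2-D image as a (1, 1, H, W) array of one dtype."""
    arr = np.asarray(img)
    if arr.ndim != 2:
        raise ValueError("expected a 2-D image, got shape %r" % (arr.shape,))
    return arr.astype(dtype)[np.newaxis, np.newaxis, :, :]

# Made-up stand-ins for images loaded from different files.
images = [
    np.zeros((128, 128), dtype=np.uint8),
    np.ones((128, 128), dtype=np.float64),
]

# Inspect what the loader actually produced.
raw_dtypes = sorted({str(im.dtype) for im in images})

# Normalize every image to the same shape and dtype.
batches = [to_model_input(im) for im in images]
batch_dtypes = {str(b.dtype) for b in batches}

print(raw_dtypes, batch_dtypes)
```

Each element of `batches` could then be passed to model.predict in place of the raw array, and the dtype of the result printed again to see whether the difference persists.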

wenouyang commented 7 years ago

Thanks.