Cheng-Lin-Li / SegCaps

A Clone version from Original SegCaps source code with enhancements on MS COCO dataset.
Apache License 2.0

Training Performance Do Not Improve #8

Open AuliaRizky opened 5 years ago

AuliaRizky commented 5 years ago

Is the training result I got reasonable, and should I proceed to the end of the epochs? It looks like dice_hard does not improve and the optimizer has reached a local minimum.

I use an MRI dataset from ISLES 2017 and have adjusted the data-loading process to skip K-fold cross-validation.
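For reference, the dice_hard metric in the logs below is, in the usual definition, the Dice coefficient on binarized predictions. A minimal numpy sketch (the repo's Keras metric may differ in smoothing and reduction axes):

```python
import numpy as np

def dice_hard(y_true, y_pred, threshold=0.5, eps=1e-7):
    """Hard Dice: binarize predictions at `threshold`, then
    compute 2*|A intersect B| / (|A| + |B|)."""
    y_pred_bin = (y_pred >= threshold).astype(np.float64)
    y_true = y_true.astype(np.float64)
    intersection = np.sum(y_true * y_pred_bin)
    return (2.0 * intersection + eps) / (np.sum(y_true) + np.sum(y_pred_bin) + eps)
```

A value near 0.03, as in epoch 1 below, means the thresholded prediction barely overlaps the ground-truth mask.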

Epoch 1/50 369/369 [==============================] - 1370s 4s/step - loss: 1.4192 - out_seg_loss: 1.2236 - out_recon_loss: 0.1956 - out_seg_dice_hard: 0.0746 - val_loss: 1.1187 - val_out_seg_loss: 1.0050 - val_out_recon_loss: 0.1137 - val_out_seg_dice_hard: 0.0304 Epoch 00001: val_out_seg_dice_hard improved from -inf to 0.03035, saving model to [my folder]

Epoch 2/50 369/369 [==============================] - 1371s 4s/step - loss: 1.1169 - out_seg_loss: 1.0563 - out_recon_loss: 0.0605 - out_seg_dice_hard: 0.0995 - val_loss: 1.0128 - val_out_seg_loss: 1.0010 - val_out_recon_loss: 0.0118 - val_out_seg_dice_hard: 0.0522 Epoch 00002: val_out_seg_dice_hard improved from 0.03035 to 0.05218, saving model to [my folder]

Epoch 3/50 369/369 [==============================] - 1365s 4s/step - loss: 1.0673 - out_seg_loss: 1.0443 - out_recon_loss: 0.0229 - out_seg_dice_hard: 0.1156 - val_loss: 1.0043 - val_out_seg_loss: 0.9994 - val_out_recon_loss: 0.0049 - val_out_seg_dice_hard: 0.0485 Epoch 00003: val_out_seg_dice_hard did not improve from 0.05218

Epoch 4/50 369/369 [==============================] - 1365s 4s/step - loss: 1.0133 - out_seg_loss: 0.9917 - out_recon_loss: 0.0216 - out_seg_dice_hard: 0.0873 - val_loss: 0.9998 - val_out_seg_loss: 0.9957 - val_out_recon_loss: 0.0041 - val_out_seg_dice_hard: 9.4607e-09 Epoch 00004: val_out_seg_dice_hard did not improve from 0.05218

Epoch 5/50 369/369 [==============================] - 1370s 4s/step - loss: 1.0076 - out_seg_loss: 0.9868 - out_recon_loss: 0.0207 - out_seg_dice_hard: 0.0623 - val_loss: 0.9991 - val_out_seg_loss: 0.9952 - val_out_recon_loss: 0.0039 - val_out_seg_dice_hard: 9.4697e-09 .....

Epoch 14/50 369/369 [==============================] - 1373s 4s/step - loss: 1.0047 - out_seg_loss: 0.9830 - out_recon_loss: 0.0217 - out_seg_dice_hard: 0.0644 - val_loss: 0.9982 - val_out_seg_loss: 0.9945 - val_out_recon_loss: 0.0038 - val_out_seg_dice_hard: 9.9502e-09 Epoch 00014: val_out_seg_dice_hard did not improve from 0.05218

Cheng-Lin-Li commented 5 years ago

Hi AuliaRizky,

You may want to try overfitting the model on a single image first. If that works, you know the model is complex enough for your task; otherwise you may need to make the model more powerful.
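The idea behind this sanity check: any model with enough capacity should be able to drive its training loss toward zero on one example; if it cannot, the data pipeline, loss, or capacity is suspect. A toy illustration (not SegCaps itself; a one-parameter per-pixel classifier with illustrative shapes):

```python
import numpy as np

# One tiny 1-D "image" and its "mask": the model p = sigmoid(w * (x - 0.5))
# should be able to memorize this single sample perfectly.
x = np.linspace(0.0, 1.0, 8)
y = (x > 0.5).astype(np.float64)

w = 0.0
for _ in range(20000):
    p = 1.0 / (1.0 + np.exp(-w * (x - 0.5)))
    grad = np.mean((p - y) * (x - 0.5))   # dBCE/dw, full batch
    w -= 5.0 * grad

p = 1.0 / (1.0 + np.exp(-w * (x - 0.5)))
bce = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
# bce has dropped from ~0.69 (w = 0) to near zero: the sample is memorized.
```

With a real Keras model the equivalent check is calling fit() on one image/mask pair for many epochs and watching whether the loss keeps shrinking.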

With kind regards, Cheng-Lin Li

AuliaRizky commented 5 years ago

Hi @Cheng-Lin-Li Do you mean that I should feed only 1 image for training?

Thanks for your help

Update: Now I understand, I'll try it.

Update 5 February: I've tested it with a single image. The metric value reached 0.2020, out_seg_loss = 0.878, and out_recon_loss is close to 0. The problem is that the model could not learn any further: it seems to stop learning, constrained by a metric value (or loss) that no longer changes.

Even when I fed the ground-truth image itself as the training set, the result showed the same performance, and adding more layers did not change the results. Do you have any recommendation on what part I should check?

Also, can you explain what the ConvCapsLayer does? I've read the original paper, but I still don't understand the implementation of the algorithm.

Thank you

Cheng-Lin-Li commented 5 years ago

Hi AuliaRizky,

  1. My implementation revises the loss_type from 'sorensen' to 'jaccard' in custom_losses.py, function dice_soft. However, in my earlier discussion with the original author, he said he prefers 'sorensen' over 'jaccard'.
  2. You may try your own loss function to fit your data.
  3. From my understanding, the ConvCapsLayer combines the advantages of the capsule and convolutional structures so that the capsule network can be made deep (and therefore more powerful/complex). You may want to consult the original author of the paper for more details.
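To make the 'sorensen' vs. 'jaccard' distinction in point 1 concrete, here is a numpy sketch of the two soft Dice variants (the repo's dice_soft operates on Keras tensors; smoothing constant and reduction axes here are illustrative):

```python
import numpy as np

def dice_soft(y_true, y_pred, loss_type='sorensen', smooth=1e-5):
    """Soft Dice coefficient in two variants:
    'sorensen' uses |A| + |B| in the denominator,
    'jaccard'  uses |A|^2 + |B|^2."""
    inse = np.sum(y_true * y_pred)
    if loss_type == 'jaccard':
        l, r = np.sum(y_pred * y_pred), np.sum(y_true * y_true)
    elif loss_type == 'sorensen':
        l, r = np.sum(y_pred), np.sum(y_true)
    else:
        raise ValueError("loss_type must be 'sorensen' or 'jaccard'")
    return (2.0 * inse + smooth) / (l + r + smooth)
```

For binary predictions the two variants coincide; they differ only on soft (probabilistic) outputs, where 'jaccard' shrinks the denominator for low-confidence predictions.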

AuliaRizky commented 5 years ago

Hi @Cheng-Lin-Li, thank you for your response. I've managed to overfit a single image; I found some preprocessing mistakes in the image feeding. The raw segmentation output (without Otsu thresholding) from testing looks good. But the pixels that are supposed to be 0 show values around 0.475 (there are no 0 values at all), while the ROI pixels that are supposed to be 1 only show values above 0.6.

And I think this is the problem when training on the full dataset: the model does not output values with a significant difference between pixels that should be 1 and pixels that should be 0, so it does not learn well.

Do you have any suggestion to make the output show a significant distinction between the background (0 region) and the ROI (1 region)?
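One way to diagnose a situation like this (a sketch with hypothetical names, not part of the repo): compare the predicted-score distributions inside and outside the ground-truth ROI. If the two distributions are clearly separated, as the 0.475-vs-0.6 values above suggest, the model ranks pixels correctly and only the default 0.5 threshold is miscalibrated; if they overlap, the model genuinely has not learned to separate the classes.

```python
import numpy as np

def separation_report(pred, mask):
    """Summarize predicted scores for background (mask == 0) vs. ROI
    (mask == 1) pixels, and propose a midpoint threshold."""
    bg, fg = pred[mask == 0], pred[mask == 1]
    return {
        'bg_mean': float(bg.mean()),
        'fg_mean': float(fg.mean()),
        'midpoint': float((bg.mean() + fg.mean()) / 2.0),  # candidate threshold
    }
```

The midpoint (or an Otsu threshold on the raw scores) can then replace the fixed 0.5 cutoff when binarizing the output.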