hamidriasat / BASNet

Code for Boundary-Aware Segmentation Network for Mobile and Web Applications
MIT License

Why do you monitor val_activation_46_mae in the training? #2

Closed murdav closed 1 year ago

murdav commented 1 year ago

Dear @hamidriasat,

Many thanks for this great project and for the Keras.io tutorial.

Why did you monitor the val_activation_46_mae [1] instead of val_activation_53_mae? In the paper, I read:

Therefore, given an input image, our predict module produces seven saliency maps in the training process. Although every saliency map is upsampled to the same size with the input image, the last one has the highest accuracy and hence is taken as the final output of the predict module. This output is passed to the refinement module.

Would it be better to monitor val_activation_53_mae or val_activation_53_loss?

Also, why not the val_loss?

Thank you once again.

D

[1] From https://github.com/hamidriasat/BASNet/blob/basnet_keras/basnet_training.ipynb

checkpoint_callback = keras.callbacks.ModelCheckpoint(
    filepath=WEIGHTS_PATH,
    monitor=f'val_{basnet_model.output_names[0]}_mae',
    mode='min',
    save_weights_only=True,
    save_best_only=True,
    verbose=1
)
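
For context, the monitor string simply resolves to the first output's name, which can be checked directly (a minimal sketch, assuming basnet_model is built as in the notebook):

    # Minimal check (assumes `basnet_model` from the notebook): the monitor
    # string resolves to the per-output MAE metric of the first output.
    print(basnet_model.output_names)                   # e.g. ['activation_46', ..., 'activation_53']
    print(f"val_{basnet_model.output_names[0]}_mae")   # -> 'val_activation_46_mae'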
hamidriasat commented 1 year ago

Dear @murdav

Thanks for appreciating my work.

Regarding monitoring val_activation_46_mae: please note that the model has two parts, a prediction module and a refinement module. The paper passage you quote describes the prediction module, but you are comparing it to the final model output, which comes from the refinement module. In my implementation of the prediction module, all auxiliary outputs are kept in reverse order (an implementation requirement), i.e. from the last output of the prediction module back to the first. That first entry (which is the last output of the prediction module if you look at the architecture diagram) is passed to the refinement module, and when the final model is created by combining both modules, I place the refinement module's output first, as described in the paper. So the order is correct, and output_names[0] is the final refined output.
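
To illustrate that ordering, here is a toy sketch (hypothetical stand-in layers, not the notebook's actual code): the seven prediction-module maps are kept in reverse order, and the refinement output is placed first when the combined model is built.

    from tensorflow import keras
    from tensorflow.keras import layers

    inputs = keras.Input(shape=(256, 256, 3))  # arbitrary size, for illustration only

    # stand-ins for the seven prediction-module saliency maps, already reversed
    # (deepest/last stage first)
    pred_maps = [
        layers.Conv2D(1, 3, padding="same", activation="sigmoid")(inputs)
        for _ in range(7)
    ]

    # stand-in for the refinement module applied to the best (last-stage) map
    refined = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(pred_maps[0])

    # refinement output first, then the prediction outputs
    toy_model = keras.Model(inputs, [refined] + pred_maps)

    # output_names[0] is therefore the refinement output, so
    # f"val_{toy_model.output_names[0]}_mae" tracks the final saliency map.
    print(toy_model.output_names[0])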

Regarding your second concern, why not val_loss: in the paper they did not use a validation set at all, so using one, and choosing which metric to monitor on it, was my personal choice. In the paper, after training, they evaluated the results against multiple evaluation metrics, one of which was mean absolute error. I monitor it here so I do not have to measure validation accuracy separately.
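
To make that concrete, a minimal sketch (the loss and optimizer below are stand-ins, not the notebook's exact compile call): passing MeanAbsoluteError as a metric makes Keras report a per-output val_<output_name>_mae on the validation data, which is exactly what the checkpoint callback monitors.

    # Stand-in compile call; only the MAE metric matters for this point.
    basnet_model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-4),
        loss=keras.losses.BinaryCrossentropy(),
        metrics=[keras.metrics.MeanAbsoluteError()],
    )
    # With validation data, Keras then logs e.g. 'val_activation_46_mae' for
    # the first output, matching the monitor string in the checkpoint callback.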

Hope this helps you understand :smiley:. Feel free to ask if anything is still unclear.

murdav commented 1 year ago

@hamidriasat, thank you for your prompt and precise answer. I really appreciate it.

D