Closed: MicheleCancilla closed this issue 3 years ago.
In some cases the metric on the training set can be lower than on the validation set. This usually happens when strong data augmentation is applied or the training set is noisy. It could be interesting, for instance, to also compute the Dice on the original (un-augmented) training set after training, to see what is happening.
Also remember that the metric reported while you are training is an average over all the batches of the epoch, and the first batches are normally very low, so the training metric tends to be pessimistic. The metric from the evaluation pass over the validation dataset, however, is computed with a better (fully trained) model and without any data augmentation, as sketched below.
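As a rough illustration of those two effects, here is a self-contained Python sketch; the `dice` and `noisy_prediction` helpers are hypothetical stand-ins for illustration only and are not part of use_case_pipeline. The running average mixes the early, poor batches into the training score, while the post-epoch evaluation only uses the final predictions:

```python
import numpy as np

def dice(pred, target, eps=1e-6):
    """Dice coefficient between two binary masks."""
    inter = np.sum(pred * target)
    return (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def noisy_prediction(mask, error_rate, rng):
    """Flip a fraction of pixels to mimic a weaker model / harder (augmented) inputs."""
    flip = rng.random(mask.shape) < error_rate
    return np.where(flip, 1 - mask, mask)

rng = np.random.default_rng(0)
target = (rng.random((128, 128)) > 0.5).astype(float)

# 1) Metric reported while training: a running average over the whole epoch,
#    so the early batches (high error rate) drag the score down.
error_per_batch = np.linspace(0.5, 0.05, 10)  # model improves batch by batch
per_batch_dice = [dice(noisy_prediction(target, e, rng), target) for e in error_per_batch]
print(f"running-average training Dice: {np.mean(per_batch_dice):.3f}")

# 2) Metric from the evaluation pass: final weights only, no augmentation.
print(f"final-model Dice:              {dice(noisy_prediction(target, 0.05, rng), target):.3f}")
```

Even with no real training involved, the running average lands well below the final-model score, which mirrors the gap described in this issue.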
@MicheleCancilla, can you please run this?:
Thanks
Describe the bug
The Dice metric always shows poor results on the training set. On the other hand, the Dice computed by use_case_pipeline gives good results on the validation set.
To Reproduce
Steps to reproduce the behavior: run the MS_SEGMENTATION_TRAINING script.

Expected behavior
The Dice on training should be greater than or equal to the one on validation.