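While training my models on tumor-stroma tissue segmentation data, I noticed that the Dice score for the background class looked like this during training:

![image](https://user-images.githubusercontent.com/26798611/221015402-89984fc3-ed5a-420f-856f-df5e1644ad56.png)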
I had assumed that the entire region inside each ROI box was annotated, so no pixel inside an ROI should be mapped to the "background" class. On closer inspection, however, I found many areas inside the ROI boxes that are unannotated (example slide: TCGA-D8-A1JL-01Z-00-DX1.FE3F0C6B-F98A-4036-BF9A-25A8CC66B1FD). As a result, many tiles from inside the ROIs are shown to the model with the "background" label, which gives it misleading supervision.
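To get a rough sense of how widespread this is, one could measure the share of in-ROI pixels that end up labelled as background for each tile. The following is only a minimal sketch: the function name `background_fraction_in_roi`, the label id `BACKGROUND = 0`, and the toy arrays are assumptions for illustration, not part of the actual pipeline.

```python
import numpy as np

BACKGROUND = 0  # assumed label id for unannotated/background pixels


def background_fraction_in_roi(label_mask, roi_mask):
    """Return the fraction of ROI pixels whose label is BACKGROUND."""
    roi = roi_mask.astype(bool)
    n_roi = int(roi.sum())
    if n_roi == 0:
        return 0.0  # tile does not intersect the ROI at all
    n_background = int((label_mask[roi] == BACKGROUND).sum())
    return n_background / n_roi


# Toy 4x4 tile: 1 = tumor, 2 = stroma, 0 = unannotated.
labels = np.array([
    [1, 1, 2, 2],
    [1, 0, 0, 2],
    [1, 1, 2, 2],
    [0, 0, 0, 0],
])
# The ROI covers the top three rows; the last row is outside the ROI.
roi = np.zeros((4, 4), dtype=bool)
roi[:3, :] = True

frac = background_fraction_in_roi(labels, roi)
print(f"background fraction inside ROI: {frac:.3f}")
```

Tiles with a high fraction would be the first candidates to inspect by hand.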
Moreover, the ROI appears to overlap to some degree with regions that are treated as background in the tile. These overlaps may be:

- gaps between two polygons in the annotations, which are not assigned a class during one-hot encoding, or
- edge artefacts caused by human error while annotating.
Both introduce noise into the training signal. The ROI needs to be corrected accordingly.
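One possible workaround, until the annotations themselves are fixed, is to relabel in-ROI background pixels with an ignore index and exclude them from the metric (and the loss). This is a sketch under assumptions, not the fix used in this project: `IGNORE = 255`, `BACKGROUND = 0`, and both helper names are hypothetical.

```python
import numpy as np

BACKGROUND = 0  # assumed label id for unannotated/background pixels
IGNORE = 255    # assumed sentinel for pixels excluded from loss and metrics


def mark_unannotated_as_ignore(label_mask, roi_mask):
    """Relabel background pixels that fall inside the ROI as IGNORE."""
    out = label_mask.copy()
    out[roi_mask.astype(bool) & (label_mask == BACKGROUND)] = IGNORE
    return out


def dice_ignoring(pred, target, cls, ignore=IGNORE):
    """Dice score for one class, skipping pixels whose target is `ignore`."""
    valid = target != ignore
    p = (pred == cls) & valid
    t = (target == cls) & valid
    denom = int(p.sum()) + int(t.sum())
    if denom == 0:
        return float("nan")  # class absent from both prediction and target
    return 2.0 * int((p & t).sum()) / denom


# Toy 4x4 tile: 1 = tumor, 2 = stroma, 0 = unannotated.
labels = np.array([
    [1, 1, 2, 2],
    [1, 0, 0, 2],
    [1, 1, 2, 2],
    [0, 0, 0, 0],
])
roi = np.zeros((4, 4), dtype=bool)
roi[:3, :] = True  # the last row lies outside the ROI

cleaned = mark_unannotated_as_ignore(labels, roi)

# A prediction that assigns tissue classes to the two unannotated gap pixels.
pred = labels.copy()
pred[1, 1] = 1
pred[1, 2] = 2

dice_raw = dice_ignoring(pred, labels, cls=1)     # gap pixel counts as an error
dice_clean = dice_ignoring(pred, cleaned, cls=1)  # gap pixel is skipped
```

Background outside the ROI is left untouched, so only the ambiguous pixels are removed from the score.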