nyukat / GMIC

An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization
https://doi.org/10.1016/j.media.2020.101908
GNU Affero General Public License v3.0

How to interpret GMIC's prediction results on CBIS-DDSM? #26

Closed HongjianLi closed 1 year ago

HongjianLi commented 2 years ago

Hi. We tried to reproduce the results described in the GMIC paper on CBIS-DDSM, but were not able to. Below is what we did:

image_index benign_pred malignant_pred benign_label malignant_label
0_L-CC 0.1356 0.0081 0 0
0_R-CC 0.1747 0.0323 1 0
0_L-MLO 0.2368 0.0335 0 0
0_R-MLO 0.0696 0.0104 1 0
1_L-CC 0.0508 0.0144 0 0
1_R-CC 0.0515 0.0087 0 1
1_L-MLO 0.0545 0.0154 0 0
1_R-MLO 0.1115 0.0149 0 1
2_L-CC 0.0746 0.0160 0 0
2_R-CC 0.0809 0.0228 1 0
2_L-MLO 0.0953 0.0086 0 0
2_R-MLO 0.1155 0.0168 1 0
3_L-CC 0.2134 0.0407 0 1
3_R-CC 0.2945 0.2116 0 0
3_L-MLO 0.1639 0.0165 0 1
3_R-MLO 0.0722 0.0303 0 0
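One way to put a number on how these predictions relate to the labels is a per-class AUC. The sketch below is only an illustration: it assumes AUC is the metric of interest, uses a hypothetical helper named `auc`, and covers only the 16 rows listed above, which is far too few to draw conclusions about the full dataset.

```python
# Each tuple: (benign_pred, malignant_pred, benign_label, malignant_label),
# copied from the table above in the same row order.
rows = [
    (0.1356, 0.0081, 0, 0), (0.1747, 0.0323, 1, 0),
    (0.2368, 0.0335, 0, 0), (0.0696, 0.0104, 1, 0),
    (0.0508, 0.0144, 0, 0), (0.0515, 0.0087, 0, 1),
    (0.0545, 0.0154, 0, 0), (0.1115, 0.0149, 0, 1),
    (0.0746, 0.0160, 0, 0), (0.0809, 0.0228, 1, 0),
    (0.0953, 0.0086, 0, 0), (0.1155, 0.0168, 1, 0),
    (0.2134, 0.0407, 0, 1), (0.2945, 0.2116, 0, 0),
    (0.1639, 0.0165, 0, 1), (0.0722, 0.0303, 0, 0),
]


def auc(scores, labels):
    """Hypothetical helper: pairwise AUC.

    Returns the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


benign_auc = auc([r[0] for r in rows], [r[2] for r in rows])
malignant_auc = auc([r[1] for r in rows], [r[3] for r in rows])
print(f"benign AUC = {benign_auc:.3f}, malignant AUC = {malignant_auc:.3f}")
# benign AUC = 0.500, malignant AUC = 0.479
```

On these rows both values sit at or near 0.5, which is what a classifier with no discriminative signal would produce.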

We wonder which part went wrong.

The five pretrained models provided in the models directory were trained on the NYUCBS dataset, which is proprietary and thus unavailable to us. Do we have to retrain GMIC on CBIS-DDSM in order to get good results on CBIS-DDSM? If so, how should the retraining be performed, and where can we find the code for it?

Thank you.

aisosalo commented 1 year ago

I have found the closed issues to answer most of my questions regarding how to use GMIC. I didn't look at the snapshot, but what I can say in general is that the pre-processing steps are important, e.g., the cropping etc.

seyiqi commented 1 year ago

Hi @HongjianLi,

Thanks for your interest in GMIC. The preprocessing steps you took on CBIS-DDSM seem reasonable. However, for CBIS-DDSM we applied a different preprocessing strategy from the one provided in this repo. This paragraph describes our preprocessing steps for the CBIS-DDSM dataset: "To preprocess mammography images in CBIS-DDSM, we first found the largest connected component containing only non-zero pixels to locate the breast. We then applied erosion and dilation to refine the breast margin. Lastly, we re-oriented all mammography images so that the breasts are always on the left side of the image." You can find more information about preprocessing in section 3.1 of our paper: https://www.sciencedirect.com/science/article/pii/S1361841520302723.
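The three steps quoted above could be sketched roughly as follows. This is not the authors' code: the function name `preprocess_cbis_ddsm`, the `morph_iterations` parameter and its default, the default cross-shaped structuring element, and the intensity-sum rule used to decide which side the breast is on are all assumptions made for illustration, using NumPy and `scipy.ndimage`.

```python
import numpy as np
from scipy import ndimage


def preprocess_cbis_ddsm(image, morph_iterations=5):
    """Illustrative sketch of the quoted steps (all parameters are assumptions)."""
    # Step 1: keep the largest connected component of non-zero pixels.
    foreground = image > 0
    labels, num_components = ndimage.label(foreground)
    if num_components == 0:
        return image.copy()
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0  # label 0 is the background, never pick it
    breast_mask = labels == sizes.argmax()

    # Step 2: erosion followed by dilation to smooth the breast margin.
    breast_mask = ndimage.binary_erosion(breast_mask, iterations=morph_iterations)
    breast_mask = ndimage.binary_dilation(breast_mask, iterations=morph_iterations)
    cleaned = np.where(breast_mask, image, 0)

    # Step 3: flip horizontally if most of the intensity is in the right half,
    # so that the breast ends up on the left side of the image.
    half = cleaned.shape[1] // 2
    if cleaned[:, half:].sum() > cleaned[:, :half].sum():
        cleaned = cleaned[:, ::-1]
    return cleaned


# Toy example: a large "breast" block on the right plus a small artifact.
toy = np.zeros((40, 60), dtype=np.uint16)
toy[5:35, 30:58] = 100  # breast region
toy[2:5, 2:5] = 50      # small label-like artifact
result = preprocess_cbis_ddsm(toy, morph_iterations=2)
```

In the toy example the small artifact is dropped because it is not part of the largest component, and the remaining region is mirrored to the left half. The right number of erosion and dilation iterations for real scans would have to be tuned; the quoted description does not state it.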

Regarding the predictions, we never applied GMIC with weights trained on the NYU dataset to the DDSM dataset. There are several notable differences between the images in the CBIS-DDSM dataset and modern FFDM images.

Hope this helps.

Best,