How to interpret GMIC's prediction result on CBIS-DDSM?

HongjianLi commented 2 years ago

Hi. We tried to reproduce the results reported in the GMIC paper on CBIS-DDSM, but were unable to. Below is what we did:

We downloaded CBIS-DDSM from https://wiki.cancerimagingarchive.net/display/Public/CBIS-DDSM, along with the four CSV files (Mass/Calc, Training/Test).
The paper mentions that GMIC was evaluated on only a subset of CBIS-DDSM, which contains 188 exams defined by Shen et al. We identified and extracted this subset.
The sample_data/images directory contains 4 exams, each of which includes 4 mammography images (L-CC, L-MLO, R-CC, R-MLO). Specifically, 0_R-CC, 0_R-MLO, 2_R-CC, 2_R-MLO have a benign_label of 1; 1_R-CC, 1_R-MLO, 3_L-CC, 3_L-MLO have a malignant_label of 1. To match this configuration, we selected four exams from the 188-exam subset with the same labels: P_02409, P_00146, P_01678, and P_01669. The images were in DICOM format.
We used the Python code snippet from the metarepository's README to convert DICOM to PNG, with the bitdepth parameter set to 16: https://github.com/nyukat/mammography_metarepository#images
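One way to confirm that the conversion step really produced 16-bit greyscale PNGs is to read the IHDR chunk at the start of each converted file. The following is a minimal sketch using only the standard library; `read_png_header` and `check_png_file` are hypothetical helper names, not part of GMIC or the metarepository.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(data: bytes) -> dict:
    """Parse width, height, bit depth and colour type from the first 26 bytes of a PNG."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # The first chunk must be IHDR, whose data section is always 13 bytes long.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing IHDR chunk")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "greyscale": color_type == 0,  # PNG colour type 0 means greyscale without alpha
    }


def check_png_file(path: str) -> dict:
    """Read only the header bytes of the file at `path` (hypothetical usage helper)."""
    with open(path, "rb") as f:
        return read_png_header(f.read(26))
```

Calling `check_png_file` on one of the converted files should report a `bit_depth` of 16 and `greyscale` set to `True` if the conversion behaved as intended.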
After the DICOM-to-PNG conversion, we replaced the corresponding PNG files in sample_data/images with the converted PNG files of the four selected CBIS-DDSM exams:
- 0_L-CC: Unaltered
- 0_L-MLO: Unaltered
- 0_R-CC: Replaced by CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Calc-Training_P_02409_RIGHT_CC/08-07-2016-DDSM-41108/1.000000-full mammogram images-67359/1-1.png
- 0_R-MLO: Replaced by CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Calc-Training_P_02409_RIGHT_MLO/08-07-2016-DDSM-46691/1.000000-full mammogram images-54510/1-1.png
- 1_L-CC: Unaltered
- 1_L-MLO: Unaltered
- 1_R-CC: Replaced by P_00146 CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Mass-Training_P_00146_RIGHT_CC/07-20-2016-DDSM-61365/1.000000-full mammogram images-07790/1-1.png
- 1_R-MLO: Replaced by P_00146 CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Mass-Training_P_00146_RIGHT_MLO/07-20-2016-DDSM-90212/1.000000-full mammogram images-33341/1-1.png
- 2_L-CC: Unaltered
- 2_L-MLO: Unaltered
- 2_R-CC: Replaced by P_01678 CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Calc-Training_P_01678_RIGHT_CC/08-07-2016-DDSM-63063/1.000000-full mammogram images-39590/1-1.png
- 2_R-MLO: Replaced by P_01678 CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Calc-Training_P_01678_RIGHT_MLO/08-07-2016-DDSM-33342/1.000000-full mammogram images-59283/1-1.png
- 3_L-CC: Replaced by P_01669 CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Mass-Training_P_01669_LEFT_CC/07-20-2016-DDSM-68732/1.000000-full mammogram images-80465/1-1.png
- 3_L-MLO: Replaced by P_01669 CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Mass-Training_P_01669_LEFT_MLO/07-20-2016-DDSM-14752/1.000000-full mammogram images-57568/1-1.png
- 3_R-CC: Unaltered
- 3_R-MLO: Unaltered

Note that eight files remained unaltered because their benign_label and malignant_label are both 0, and CBIS-DDSM has no normal images to substitute. Here is a snapshot of the 16 input images: https://freeimage.host/i/irlidu
We executed run.sh and obtained the following predictions.csv:

image_index	benign_pred	malignant_pred	benign_label	malignant_label
0_L-CC	0.1356	0.0081	0	0
0_R-CC	0.1747	0.0323	1	0
0_L-MLO	0.2368	0.0335	0	0
0_R-MLO	0.0696	0.0104	1	0
1_L-CC	0.0508	0.0144	0	0
1_R-CC	0.0515	0.0087	0	1
1_L-MLO	0.0545	0.0154	0	0
1_R-MLO	0.1115	0.0149	0	1
2_L-CC	0.0746	0.0160	0	0
2_R-CC	0.0809	0.0228	1	0
2_L-MLO	0.0953	0.0086	0	0
2_R-MLO	0.1155	0.0168	1	0
3_L-CC	0.2134	0.0407	0	1
3_R-CC	0.2945	0.2116	0	0
3_L-MLO	0.1639	0.0165	0	1
3_R-MLO	0.0722	0.0303	0	0

We were confused by this result. The eight images substituted from CBIS-DDSM received very low probabilities for both benign_pred and malignant_pred. For instance:
- 0_R-CC and 0_R-MLO have a benign_label of 1, but their benign_pred values are just 0.1747 and 0.0696.
- 3_L-CC and 3_L-MLO have a malignant_label of 1, but their malignant_pred values are just 0.0407 and 0.0165.
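Because the raw probabilities are small across the board, a ranking metric such as AUC says more than any single threshold. The sketch below computes a per-image AUC for each label from the 16 rows above, using the pairwise definition (the fraction of positive/negative pairs that are ordered correctly). It is only an illustration: 16 images are far too few for a meaningful estimate, and eight of them are the repository's own sample images rather than CBIS-DDSM.

```python
PREDICTIONS = """
image_index benign_pred malignant_pred benign_label malignant_label
0_L-CC  0.1356 0.0081 0 0
0_R-CC  0.1747 0.0323 1 0
0_L-MLO 0.2368 0.0335 0 0
0_R-MLO 0.0696 0.0104 1 0
1_L-CC  0.0508 0.0144 0 0
1_R-CC  0.0515 0.0087 0 1
1_L-MLO 0.0545 0.0154 0 0
1_R-MLO 0.1115 0.0149 0 1
2_L-CC  0.0746 0.0160 0 0
2_R-CC  0.0809 0.0228 1 0
2_L-MLO 0.0953 0.0086 0 0
2_R-MLO 0.1155 0.0168 1 0
3_L-CC  0.2134 0.0407 0 1
3_R-CC  0.2945 0.2116 0 0
3_L-MLO 0.1639 0.0165 0 1
3_R-MLO 0.0722 0.0303 0 0
"""


def auc(scores, labels):
    """Fraction of (positive, negative) pairs where the positive scores higher; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Skip the header line and split each row on whitespace.
rows = [line.split() for line in PREDICTIONS.strip().splitlines()[1:]]
benign_auc = auc([float(r[1]) for r in rows], [int(r[3]) for r in rows])
malignant_auc = auc([float(r[2]) for r in rows], [int(r[4]) for r in rows])

print(f"benign AUC:    {benign_auc:.3f}")
print(f"malignant AUC: {malignant_auc:.3f}")
```

On these rows, 24 of 48 benign pairs and 23 of 48 malignant pairs are ordered correctly, so both values sit at about 0.5, which is chance level for a ranking.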

Which part went wrong?

The five pretrained models provided in the models directory were trained on the NYUCBS dataset, which is proprietary and thus unavailable to us. Do we have to retrain GMIC on CBIS-DDSM in order to get good results on CBIS-DDSM? If so, how should we retrain it, and where can we find the training code?

Thank you.

aisosalo commented 2 years ago

I have found the closed issues to answer most of my questions regarding how to use GMIC. I didn't look at the snapshot, but what I can say in general is that the pre-processing steps are important, e.g., the cropping etc.

seyiqi commented 2 years ago

Hi @HongjianLi,

Thanks for your interest in GMIC. The preprocessing steps you took on CBIS-DDSM seem reasonable. However, we applied a different preprocessing strategy for DDSM than the one provided in this repo. This paragraph describes our preprocessing steps for the CBIS-DDSM dataset: "To preprocess mammography images in CBIS-DDSM, we first found the largest connected component containing only non-zero pixels to locate the breast. We then applied erosion and dilation to refine the breast margin. Lastly, we re-oriented all mammography images so that the breasts are always on the left side of the image." You can find more information about preprocessing in section 3.1 of our paper: https://www.sciencedirect.com/science/article/pii/S1361841520302723.
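The three steps quoted above can be sketched on a toy image as follows. This is not the authors' code: the 4-connectivity, the 3x3 structuring element, and the "compare left and right half intensity" rule for orientation are all assumptions made for illustration, and every function name is hypothetical. Plain Python lists are used so the example runs without dependencies; on real mammograms one would more likely use `scipy.ndimage.label`, `binary_erosion`, and `binary_dilation` on NumPy arrays.

```python
from collections import deque


def largest_component(mask):
    """Keep only the largest 4-connected region of True cells (assumed connectivity)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, comp = deque([(r, c)]), []
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                comp.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(comp) > len(best):
                best = comp
    out = [[False] * w for _ in range(h)]
    for y, x in best:
        out[y][x] = True
    return out


def _morph(mask, combine):
    """Apply `combine` (all = erosion, any = dilation) over each 3x3 neighbourhood."""
    h, w = len(mask), len(mask[0])

    def at(y, x):  # cells outside the image count as background
        return mask[y][x] if 0 <= y < h and 0 <= x < w else False

    return [[combine(at(y + dy, x + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(w)] for y in range(h)]


def preprocess(image):
    """Locate the breast, smooth its margin, and flip so it lies on the left."""
    mask = largest_component([[px > 0 for px in row] for row in image])
    mask = _morph(_morph(mask, all), any)  # erosion followed by dilation
    cleaned = [[px if keep else 0 for px, keep in zip(row, mrow)]
               for row, mrow in zip(image, mask)]
    half = len(image[0]) // 2
    left = sum(px for row in cleaned for px in row[:half])
    right = sum(px for row in cleaned for px in row[len(row) - half:])
    if right > left:  # assumed orientation rule: the brighter half holds the breast
        cleaned = [row[::-1] for row in cleaned]
    return cleaned


toy = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [9, 0, 0, 0, 0, 5, 5, 5],  # the lone 9 stands in for a scanner label artifact
    [0, 0, 0, 0, 0, 5, 5, 5],
    [0, 0, 0, 5, 5, 5, 5, 5],  # thin spur attached to the breast region
    [0, 0, 0, 0, 0, 5, 5, 5],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
result = preprocess(toy)
```

In the toy example, the isolated artifact is dropped by the connected-component step, the one-pixel spur is removed by the erosion and dilation, and the remaining block is mirrored to the left edge.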

Regarding the predictions, we did not evaluate GMIC with the weights trained on the NYU dataset on the DDSM dataset. There are several substantial differences between the images in the CBIS-DDSM dataset and modern FFDM images.

Hope this helps.

Best,

nyukat / GMIC, issue #26