nyukat / GMIC

An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization
https://doi.org/10.1016/j.media.2020.101908
GNU Affero General Public License v3.0

How to create exams for the data list? #19

Closed emma-sjwang closed 3 years ago

emma-sjwang commented 3 years ago

As in the title.

emma-sjwang commented 3 years ago

Could you provide an exam list for the public dataset CBIS-DDSM? The original data format of CBIS-DDSM is '.dcm' rather than '.png'.

seyiqi commented 3 years ago

Hi @emma-sjwang ,

Here is an example of creating a data_list. It is essentially a list of dictionaries, each of which contains the metadata for a single mammogram.

mammogram_1 = {
    'horizontal_flip': 'NO',  # 'YES' or 'NO'
    'L-CC': ['3_L-CC'],  # file name of the LCC image
    'L-MLO': ['3_L-MLO'],
    'R-MLO': ['3_R-MLO'],
    'R-CC': ['3_R-CC'],
    'cancer_label': {
        'benign': 0,
        'right_benign': 0,
        'malignant': 1,
        'left_benign': 0,
        'unknown': 0,
        'right_malignant': 0,
        'left_malignant': 1,
    },
    # file name of the segmentation, only for visualization purposes;
    # supply None if you don't need visualization
    'L-CC_benign_seg': ['3_L-CC_benign'],
    'L-CC_malignant_seg': ['3_L-CC_malignant'],
    'L-MLO_benign_seg': ['3_L-MLO_benign'],
    'L-MLO_malignant_seg': ['3_L-MLO_malignant'],
    'R-MLO_benign_seg': ['3_R-MLO_benign'],
    'R-MLO_malignant_seg': ['3_R-MLO_malignant'],
    'R-CC_benign_seg': ['3_R-CC_benign'],
    'R-CC_malignant_seg': ['3_R-CC_malignant'],
}

data_list = [mammogram_1, mammogram_2, ...]
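If you have many exams, writing each dictionary by hand gets tedious. Below is a rough sketch of how the entries could be generated and saved. The helper names `make_exam` and `save_data_list` are made up for this illustration and are not part of the repository. It assumes your files follow the `<prefix>_<view>` naming used in the example above, and that the list is stored as a pickle file.

```python
import pickle

# The four standard views, in the order used in the example above.
VIEWS = ("L-CC", "L-MLO", "R-MLO", "R-CC")


def make_exam(prefix, cancer_label, with_segmentation=False):
    """Build one data_list entry (hypothetical helper, not part of GMIC).

    Assumes image files are named '<prefix>_<view>' and segmentation
    files '<prefix>_<view>_benign' / '<prefix>_<view>_malignant'.
    """
    exam = {"horizontal_flip": "NO"}
    for view in VIEWS:
        exam[view] = [f"{prefix}_{view}"]
        for kind in ("benign", "malignant"):
            # Segmentations are only for visualization; None means "skip".
            seg = [f"{prefix}_{view}_{kind}"] if with_segmentation else None
            exam[f"{view}_{kind}_seg"] = seg
    exam["cancer_label"] = dict(cancer_label)
    return exam


def save_data_list(data_list, path):
    """Write the list of exam dictionaries to a pickle file."""
    with open(path, "wb") as f:
        pickle.dump(data_list, f)


label = {"benign": 0, "right_benign": 0, "malignant": 1, "left_benign": 0,
         "unknown": 0, "right_malignant": 0, "left_malignant": 1}
data_list = [make_exam("3", label, with_segmentation=True)]
```

Calling `save_data_list(data_list, "my_exam_list.pkl")` would then write the file; the file name here is only a placeholder.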

For CBIS-DDSM, you need to extract the image matrix from each DICOM file and convert it into a PNG file.
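The conversion could look roughly like the sketch below. It is not the script we used; it assumes the `pydicom`, `numpy` and `Pillow` packages are installed, that the pixel data is unsigned and fits in 16 bits, and that no extra handling (for example inverting MONOCHROME1 images) is needed. The function names are made up for this example.

```python
from pathlib import Path


def png_path_for(dcm_path, out_dir):
    """Return the PNG path in out_dir that corresponds to a DICOM file."""
    return Path(out_dir) / (Path(dcm_path).stem + ".png")


def dicom_to_png(dcm_path, out_dir):
    """Read one DICOM file and write its pixel matrix as a 16-bit PNG."""
    # Third-party imports are local so the path helper above can be used
    # even when these packages are not installed.
    import numpy as np
    import pydicom
    from PIL import Image

    ds = pydicom.dcmread(dcm_path)
    pixels = ds.pixel_array  # image matrix stored in the DICOM file

    # Assumption: values are unsigned and fit in 16 bits.
    image = Image.fromarray(pixels.astype(np.uint16))

    out_path = png_path_for(dcm_path, out_dir)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    image.save(out_path)
    return out_path
```

The output name is taken from the DICOM file's own name, so if files in different folders share a name you would need to choose unique names yourself, for example ones matching the entries in your data_list.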

emma-sjwang commented 3 years ago

Thank you so much for your answer. I find that an exam in the public DDSM dataset sometimes contains multiple images of the same breast in the same view. The data structure in the released code seems to cover only the four standard views.

So how do you handle a patient who has more or fewer than four screening images?

emma-sjwang commented 3 years ago

Could you please release the pkl file for the public DDSM dataset?