nyukat / breast_cancer_classifier

Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening
https://ieeexplore.ieee.org/document/8861376
GNU Affero General Public License v3.0

Predict on DDSM #13

Closed: Chunlwu closed this issue 5 years ago

Chunlwu commented 5 years ago

Firstly, thanks for your great work on mammogram classification. Recently, I tried running your model on a public dataset (DDSM and CBIS-DDSM), but the result is always predicted as BENIGN. Below is a sample case for your reference.

Predicted by the model (image-only):

left_benign  right_benign  left_malignant  right_malignant
0.2456       0.3804        0.0131          0.0716
0.1495       0.5369        0.0180          0.1072
0.1644       0.1658        0.0338          0.0284
0.1821       0.3101        0.0121          0.0585

Ground truth:

left_benign  right_benign  left_malignant  right_malignant
1            1             0               0
0            0             1               1
0            0             1               1
0            0             0               0

I know about the imbalance issue described in #9, so I selected 2 obvious MALIGNANT cases and 1 obvious BENIGN case, and did all the preprocessing described in #9 and the dataset report. But the result is still predicted as BENIGN.

So, have you evaluated the released model (in your code: breast_cancer_classifier/models/sample_image_model.p) on DDSM or INbreast? The other question I would like to ask is: what is different between DDSM and your private dataset? Thanks.

kjgeras commented 5 years ago

@Chunlwu, thanks for your kind words.

First of all, did you notice that we updated the code a few weeks ago? There was a small bug in presenting the output. Please make sure that you use the code after that fix.

Regarding compatibility with DDSM, we didn't do such an experiment. I would be very curious to see your results when you are finished. Assuming that you did all of the preprocessing the same way as we did and you are using the code after the fix, there are a few things to consider.

1) How the learning task is defined: in our formulation, benign and malignant labels don't compete with each other. A breast can have both a benign and a malignant finding, so we predict two things: "is it benign or not" and "is it malignant or not", rather than "is it benign, malignant, or normal". Benign cases are much more common in our dataset than malignant ones, so the network will be biased towards predicting larger probabilities for benign findings. You need to evaluate the network's ability to predict benign and malignant findings separately. Please refer to our paper for details (see also the evaluation sketch after this list).

2) Images in DDSM are scanned films. They will have different statistics than the images in our dataset. My guess is that our network should still produce non-random predictions, but it is difficult to say how accurate they will be. This data is just quite different, and there is no guarantee that the network will work well for data that doesn't come from the same distribution as the training data. For example, it is possible that the gray noise in the background of DDSM images might fool our network.

3) How the labels are defined: I'm not sure exactly what the labels in DDSM mean, but they might not necessarily have the same meaning as ours. For example, I'm not sure whether the findings we consider "high-risk benign" wouldn't be considered malignant in DDSM. There might be many small differences like that.
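To illustrate point 1, here is a minimal sketch of evaluating benign and malignant predictions as two separate binary tasks (AUC per label). The way the per-breast predictions and labels are assembled into arrays below is just for illustration (using the numbers from your example), not part of our released code:

```python
# Minimal sketch: treat benign and malignant as two independent binary tasks
# and compute one AUC per label, rather than comparing the two probabilities
# against each other.
import numpy as np
from sklearn.metrics import roc_auc_score

# One row per breast: [p_benign, p_malignant], taken from the exams above.
predictions = np.array([
    [0.2456, 0.0131],   # exam 1, left breast
    [0.3804, 0.0716],   # exam 1, right breast
    [0.1495, 0.0180],   # exam 2, left breast
    [0.5369, 0.1072],   # exam 2, right breast
])

# Matching ground truth: [is_benign, is_malignant].
labels = np.array([
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
])

benign_auc = roc_auc_score(labels[:, 0], predictions[:, 0])
malignant_auc = roc_auc_score(labels[:, 1], predictions[:, 1])
print(f"benign AUC: {benign_auc:.3f}, malignant AUC: {malignant_auc:.3f}")
```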

I think probably the best way to use our network on DDSM would be to fine-tune it using DDSM to make sure that the network is adjusted to the shift in the distribution and to the shift in the definition of the labels.
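A rough sketch of what such a fine-tuning loop could look like is below. `load_pretrained_model` and `DDSMDataset` are hypothetical placeholders you would have to implement yourself, and the two-logit output shape is an assumption for illustration, not the exact interface of the released model:

```python
# Rough fine-tuning sketch in PyTorch. The dataset/model loading helpers are
# hypothetical; wire them up to the released weights
# (models/sample_image_model.p) and your preprocessed DDSM images.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def fine_tune(model: nn.Module, train_dataset, epochs: int = 5, lr: float = 1e-5):
    loader = DataLoader(train_dataset, batch_size=4, shuffle=True)
    # Two independent binary tasks per breast (benign / malignant), so use
    # binary cross-entropy on each output rather than a softmax over classes.
    criterion = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    model.train()
    for epoch in range(epochs):
        for images, labels in loader:          # labels: shape (batch, 2)
            optimizer.zero_grad()
            logits = model(images)             # assumed shape (batch, 2)
            loss = criterion(logits, labels.float())
            loss.backward()
            optimizer.step()
        print(f"epoch {epoch}: last batch loss {loss.item():.4f}")
    return model

# Usage (hypothetical helpers):
# model = load_pretrained_model("models/sample_image_model.p")
# fine_tune(model, DDSMDataset("path/to/preprocessed_ddsm"))
```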

Chunlwu commented 5 years ago

@kjgeras Thanks for your kind reply! I will fine-tune the model on DDSM and try again.

Chunlwu commented 5 years ago

@WajeehaAnsar You should convert the LJPEG images to PNG, and then add new records to exam_list_before_cropping.pkl. That's all you need for prediction.
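Roughly, adding a record looks like the sketch below. The field names mirror the sample exam_list_before_cropping.pkl that ships with the repo (double-check them against sample_data/), and the DDSM file names are placeholders:

```python
# Rough sketch of appending a new DDSM exam to exam_list_before_cropping.pkl.
# Check whether the repo expects image names with or without the .png extension.
import pickle

new_exam = {
    "horizontal_flip": "NO",
    "L-CC":  ["ddsm_case_0001_L_CC"],   # placeholder PNG file names
    "R-CC":  ["ddsm_case_0001_R_CC"],
    "L-MLO": ["ddsm_case_0001_L_MLO"],
    "R-MLO": ["ddsm_case_0001_R_MLO"],
}

with open("exam_list_before_cropping.pkl", "rb") as f:
    exam_list = pickle.load(f)

exam_list.append(new_exam)

with open("exam_list_before_cropping.pkl", "wb") as f:
    pickle.dump(exam_list, f)
```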

picEmily commented 4 years ago

@Chunlwu Did you fine-tune the model on DDSM? What results did your model get?

bhosalems commented 2 years ago

@Chunlwu Hey, did you fine-tune your model for DDSM?