nyukat / GMIC

An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization
https://doi.org/10.1016/j.media.2020.101908
GNU Affero General Public License v3.0

Discrepancies between readme and example files #15

joaco18 closed 3 years ago

joaco18 commented 3 years ago

Hi! I was trying to reproduce the results for the shared example images, but it seems the "sample_data" directory wasn't updated properly. The readme says: "As a part of this repository, we provide 4 sample exams (in sample_data/cropped_images directory and exam list stored in sample_data/data.pkl), each of which includes 2 CC view images and 2 MLO view images.", but there is no cropped_images folder in sample_data.

Moreover, the images stored in "sample_data/images" appear to be the cropped ones rather than the originals, yet the exam_list_before_cropping.pkl file doesn't include the "best center" coordinates. I tried using them anyway as if they were the originals, since the cropping and best-center steps only use the breast region of the image, but I couldn't reproduce your results. Maybe applying the erosion and dilation steps to an already cropped image prevents me from getting the rightmost and bottommost pixel positions correctly, so my best-center extraction differs from your computation.

If you could please upload the complete versions of the images in sample_data/images, or the exam_list.pkl file with the best_centers coordinates included, I would really appreciate it. Thank you in advance!
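To make the concern concrete, here is a minimal sketch of how the bottommost and rightmost foreground positions depend on the frame in which they are measured. It is a toy illustration and not the repository's code: `mask_extents`, `crop`, and the small mask are all hypothetical, and plain lists stand in for image arrays.

```python
# Toy illustration only; this is not the repository's code.
# A mask is a list of rows, with 1 marking breast pixels and 0 background.


def mask_extents(mask):
    """Return (bottommost_row, rightmost_col) of the nonzero pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    return max(rows), max(cols)


def crop(mask, top, left, bottom, right):
    """Crop to the half-open window [top:bottom, left:right]."""
    return [row[left:right] for row in mask[top:bottom]]


original = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
# Simulate an image that was already cropped to the breast region.
cropped = crop(original, top=1, left=2, bottom=4, right=5)

extents_original = mask_extents(original)
extents_cropped = mask_extents(cropped)
print(extents_original, extents_cropped)
```

Here the original mask gives (3, 4) while the cropped one gives (2, 2); the difference is exactly the crop offset (top=1, left=2), so any center derived from these positions shifts with it.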

seyiqi commented 3 years ago

Hi @joaco18 ,

Sorry, we forgot to update the documentation. We changed the repository to include the original images (sample_data/original) instead of the cropped ones. We also added the preprocessing code that crops each image (src/cropping/crop_mammogram.py) and finds the optimal center for the crops (src/optimal_centers/get_optimal_centers.py).

So for now the pipeline works as follows: the cropped images are saved to sample_output/cropped_imgs, and the best-center info is saved to sample_output/data.pkl. You can run bash ./run.sh, which goes through the entire workflow.
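After running the workflow, one quick way to confirm that the output exam list carries the center information is to load the pickle and look for the field. The sketch below is only an assumption about the file layout: it presumes the pickle holds a list of dictionaries and that the field is named `best_center`. The helper names and the stand-in records (including the `L-CC` key) are hypothetical.

```python
import pickle


def load_exam_list(path):
    """Load a pickled exam list from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def exams_missing_key(exam_list, key="best_center"):
    """Return the indices of exams (dicts) that do not contain `key`.

    The key name "best_center" is an assumption, not confirmed by the thread.
    """
    return [i for i, exam in enumerate(exam_list) if key not in exam]


# Stand-in records with a made-up layout; with real output one would use
# something like: exam_list = load_exam_list("sample_output/data.pkl")
exam_list = [
    {"L-CC": ["0_L_CC"], "best_center": {"L-CC": [(1000, 500)]}},
    {"L-CC": ["1_L_CC"]},
]

missing = exams_missing_key(exam_list)
print("Exams without center info:", missing)
```

An empty list means every exam has the field; any index printed points to an exam for which the optimal-center step did not write its result.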

Best,