nyukat / mammography_metarepository

Meta-repository of screening mammography classifiers
https://arxiv.org/abs/2108.04800
BSD 2-Clause "Simplified" License

Pre-processing method #20

Closed cyh-0 closed 2 years ago

cyh-0 commented 2 years ago

Hi Guys,

While testing the pre-processing algorithm on my dataset, I found that in some cases the MLO mammograms are not well represented. Just wondering, have you guys considered how to avoid these cases?

Cheers

Screenshot from 2021-12-05 18-03-53

chledowski commented 2 years ago

Hi!

Could you give some details, i.e. which images have this problem?

cyh-0 commented 2 years ago

It is a few MLO images in my dataset. I think the muscle plane at the bottom causes the trouble.

image

chledowski commented 2 years ago

Hi.

Looks like there might be several problems with these images:

1) Having a strictly 0-value background is a must for our algorithm to work. It seems the background is not strictly 0 on those images.
2) The images are possibly too large. You might need to either downscale the original image or use a bigger cropping window.
3) The number of iterations in the morphological operations must increase depending on whether you downscaled the original image or used a bigger cropping window. See the --num-iterations parameter in https://github.com/nyukat/GMIC/blob/master/src/cropping/crop_mammogram.py.

The morphological operations should get rid of occasional artifacts in the background as well as the stomach, but the default value won't be enough for images with higher resolutions.

We will discuss adjusting the preprocessing pipeline to work in such cases.
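The checks above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not code from the repository: the function names, the `threshold` value, and the linear scaling of the iteration count are all assumptions. The scaling heuristic assumes each morphological iteration moves the mask boundary by about one pixel, so an image that is `scale` times larger per side needs about `scale` times as many iterations. On real images the same logic would normally be written with vectorised array operations.

```python
import math


def background_summary(image):
    """Return (min_value, zero_fraction) for a 2D image given as rows of pixels.

    A minimum above 0 means the background cannot be strictly 0-valued.
    """
    total = 0
    zeros = 0
    min_value = None
    for row in image:
        for pixel in row:
            total += 1
            if pixel == 0:
                zeros += 1
            if min_value is None or pixel < min_value:
                min_value = pixel
    if total == 0:
        raise ValueError("image is empty")
    return min_value, zeros / total


def zero_background(image, threshold):
    """Return a copy with every pixel <= threshold set to 0.

    `threshold` is a hypothetical, dataset-specific value; it has to be
    chosen by inspecting the images, and it alters the pixel data.
    """
    return [[0 if pixel <= threshold else pixel for pixel in row] for row in image]


def scaled_iterations(base_iterations, scale):
    """Scale an iteration count tuned at one resolution (assumed heuristic).

    `scale` is the ratio of the new image side length to the side length
    the base count was tuned for.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return max(1, math.ceil(base_iterations * scale))


# Toy 3x4 "image" whose background sits at 2-3 instead of 0.
toy = [
    [3, 2, 90, 120],
    [2, 3, 80, 110],
    [2, 2, 70, 100],
]
min_value, zero_fraction = background_summary(toy)
cleaned = zero_background(toy, threshold=5)
```

If `min_value` is above 0 or `zero_fraction` is close to 0, the background is not strictly 0 and needs to be cleaned before cropping. The value returned by `scaled_iterations` could then be passed as --num-iterations, but whether linear scaling is sufficient for a given dataset would need to be checked visually.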

jwitos commented 2 years ago

I'll just add that it would be easier for us to troubleshoot if you could share at least one de-identified original DICOM, so that we have a minimal, complete, reproducible example. Without it we can make a few guesses, as you can see above, but it's difficult to find the underlying problem.

cyh-0 commented 2 years ago

Hi @jwitos and @chledowski, thanks for reaching out. I am afraid I cannot send you guys de-identified DICOM files: I have neither access to the original files nor permission to share them. I guess I will just follow the advice suggested by @chledowski. Thank you guys so much!