vbookshelf / Mammogram-Mass-Analyzer

A desktop breast cancer detection app. This is a free desktop computer aided diagnosis (CAD) tool that uses computer vision to detect and localize masses on full field digital mammograms.
MIT License
7 stars 3 forks

BIRADS usage #1

Open blazespinnaker opened 1 year ago

blazespinnaker commented 1 year ago

Hey vbookshelf, really great work. Thanks for sharing.

I was wondering - why don't you use BIRADS more in training? You seem to use 'Mass' instead, which is very infrequently an indicator of malignancy (10%?), and in about a third of BIRADS 5 cases the finding doesn't include a Mass at all.

Also, you have nc=2 classes in the yolo model, but only really train for one. Was there a reason for that? May or may not be a problem, not sure how Yolo would deal with that.

Anyways, be great to see you more on Kaggle and in the RSNA comp, at least in the discussion area! cheers

vbookshelf commented 1 year ago

Hi @blazespinnaker. Thanks.

1- My main aim for this project was to create a desktop app rather than a good cancer detector. There's too little data to create a robust solution so I wanted to create something that would guide the eye of the radiologist to a region of interest rather than something that would produce a definitive diagnosis. I also wanted to use Yolov5.

I chose to use only mass because that was the class for which the most data was available and it was something that I knew Yolov5 would have a good chance of detecting. My later experiments showed that Yolov5 struggles to detect calcifications. This is probably because there's too little data and the calcifications are very small.

2- I used two classes because this is what I used when I originally set up my yolov5 workflow. That was a while ago when I was learning to use Yolo. I've used the same workflow on other problems and it didn't appear to cause any problems.
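For anyone who wants to check how many of the configured classes actually show up in the training labels, here is a minimal sketch. It assumes the standard YOLO label format (`class x_center y_center width height`, normalized); the function name and the example label lines are made up for illustration and are not taken from this repo.

```python
from collections import Counter


def count_class_ids(label_lines):
    """Count how often each class id appears in YOLO-format label lines."""
    counts = Counter()
    for line in label_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        counts[int(parts[0])] += 1  # first field is the class id
    return counts


# Hypothetical label lines: "class x_center y_center width height" (normalized).
example_labels = [
    "0 0.512 0.430 0.120 0.095",
    "0 0.301 0.655 0.080 0.072",
    "",
]

class_counts = count_class_ids(example_labels)
# nc=2 as discussed in this thread; list the class ids that never occur.
unused = [c for c in range(2) if class_counts[c] == 0]
print(dict(class_counts), "unused class ids:", unused)
```

In practice the lines would be read from the label `.txt` files of the training set; a class id that never occurs simply gets no positive examples during training.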

Thanks for your interest in this project.

blazespinnaker commented 1 year ago

Is there too little data? I'm not so sure. If you use BIRADS 4 and 5, that comes to over 1200 images, I believe. With appropriate augmentation and thresholding you can probably get reasonably interesting results, albeit with some false positives. Ideally, the detected patches would then be classified further by a second-stage model.
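To make the thresholding and patch idea concrete, here is a rough sketch, not something from the repo. It assumes detections come as pixel boxes with a confidence score; the function name, the 0.25 threshold and the 32 px padding are all illustrative choices.

```python
def select_patches(detections, image_width, image_height,
                   conf_threshold=0.25, pad=32):
    """Keep detections at or above conf_threshold and return padded crop boxes.

    Each detection is (xmin, ymin, xmax, ymax, confidence) in pixels.
    Returned boxes are (xmin, ymin, xmax, ymax), clamped to the image bounds.
    """
    patches = []
    for xmin, ymin, xmax, ymax, conf in detections:
        if conf < conf_threshold:
            continue  # drop low-confidence detections
        patches.append((
            max(0, xmin - pad),
            max(0, ymin - pad),
            min(image_width, xmax + pad),
            min(image_height, ymax + pad),
        ))
    return patches


# Hypothetical detections on a 2000 x 3000 image.
example_detections = [
    (100, 200, 300, 400, 0.80),
    (10, 10, 50, 60, 0.10),          # below threshold, dropped
    (1900, 2900, 2000, 3000, 0.40),  # padding clamped at the image edge
]
patches = select_patches(example_detections, image_width=2000, image_height=3000)
print(patches)
```

The resulting boxes could be cropped from the mammogram and passed to a second-stage classifier; a lower threshold keeps more candidates at the cost of more false positives for that stage to sort out.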

Your note about suspicious calcifications being too small is interesting. Doing a quick test, I see that their mean area is the same as the mean area of the entire set, about 112066.518372 sq pixels.
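A quick test like that could look roughly like the sketch below. The record layout (`category`, `xmin`, `ymin`, `xmax`, `ymax`) and the example numbers are assumptions for illustration, not the dataset's real schema or values.

```python
from collections import defaultdict


def mean_box_area_by_category(annotations):
    """Return the mean bounding-box area (sq pixels) per finding category."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ann in annotations:
        area = (ann["xmax"] - ann["xmin"]) * (ann["ymax"] - ann["ymin"])
        sums[ann["category"]] += area
        counts[ann["category"]] += 1
    return {cat: sums[cat] / counts[cat] for cat in sums}


# Hypothetical annotation records with pixel coordinates.
example_annotations = [
    {"category": "Mass", "xmin": 0, "ymin": 0, "xmax": 100, "ymax": 200},
    {"category": "Mass", "xmin": 50, "ymin": 50, "xmax": 350, "ymax": 150},
    {"category": "Suspicious Calcification",
     "xmin": 10, "ymin": 10, "xmax": 60, "ymax": 50},
]
mean_areas = mean_box_area_by_category(example_annotations)
print(mean_areas)
```

If a per-category mean comes out identical to the overall mean, it may be worth double-checking that the category filter was actually applied before averaging.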

I guess what piqued my curiosity is that I think your model ignores BIRADS 5 findings in favor of some BIRADS 3 masses, which seems to be the opposite of what you'd want.

Certainly in the literature, yolo is being used quite a bit, and it seems fairly successful.

vbookshelf commented 1 year ago

I'm not a domain expert, but I think BIRADS classifications reflect a combination of factors: mass, calcifications, architectural distortion and a fair bit of judgement from the radiologist. So I'm guessing that there aren't enough positive samples in the data for the model to capture these complexities.

But, I think your idea is a very interesting experiment to try. I didn't consider using patches. When I tried using Yolo to detect calcifications, it didn't detect any. When using Yolov5l, reducing the number of false positives seemed to be the only way to improve the score. The score can be improved by using Yolov5x, but this doubles the CPU inference time.

Yolov5 also has the capability to use focal loss. It may help improve the results on high density breast images.
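For reference, here is a minimal sketch of what focal loss does, using the standard binary formulation rather than Yolov5's internal code. It scales cross-entropy by `(1 - p_t) ** gamma`, so well-classified examples contribute less. The `gamma` and `alpha` values below are illustrative; in Yolov5 the behaviour is switched on through the `fl_gamma` hyperparameter, which defaults to 0 (off).

```python
import math


def binary_focal_loss(p, y, gamma=1.5, alpha=0.25):
    """Focal loss for one prediction.

    p: predicted probability of the positive class, y: true label (0 or 1).
    With gamma=0 this reduces to alpha-weighted binary cross-entropy.
    """
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-7), 1.0)            # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.95, 1)  # confident and correct: heavily down-weighted
hard = binary_focal_loss(0.30, 1)  # poorly classified positive: keeps most of its loss
print(easy, hard)
```

The intent is that the many easy background regions stop dominating the loss, leaving more of the gradient for the hard cases.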

I haven't reviewed the literature to see how Yolo is being used to classify x-rays. My main aim was to develop the simplest working model possible - that could be easily deployed, and perform CPU inference within a reasonable time.

blazespinnaker commented 1 year ago

Thanks for the feedback, vbook. Really great stuff. I still think you'd really be huge in the RSNA competition, even if you're not going for gold. Really hope you'll join us, your ideas and efforts need more exposure, imho.