Inquiry on Modifying MedYOLO for Pulmonary Embolism Detection

Hello,

I hope this message finds you well. I am currently working on a project that involves the detection of pulmonary embolism, which is a small target in medical images. I am interested in using MedYOLO for this purpose, but I would like to optimize the network structure to better suit my specific needs.

Here are my questions:

1. How can I modify the network architecture of MedYOLO to enhance its performance for detecting small targets like pulmonary embolisms?
2. Can I apply general optimization methods from YOLOv5 to MedYOLO, or are there specific considerations I should be aware of when adapting it for medical imaging?

I believe that with the right adjustments, MedYOLO could be a powerful tool for this application. Any guidance or suggestions you could provide would be greatly appreciated.

Thank you for your time and expertise.

From some discussion with other users, I think the best strategy is to modify the data rather than the network. The problem with small objects seems to come from the downsampling stages, which prevent small objects from having a strong signal deep in the network, but those same stages are also a big part of what makes the network work. Patchifying the data, so that small objects become relatively larger objects, has a decent chance of working, but I haven't tested it, and I haven't heard back from the others who were going to try it. Unfortunately, the pipeline becomes a bit more complex: you could use MedYOLO to detect the lung volume, crop the image to the detected volume, pass the cropped image and your labels through a script that converts them into patches with corresponding labels, and then train the pulmonary embolism MedYOLO on those patches.
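As a rough illustration of the patch step, here is a minimal NumPy sketch. It is not part of MedYOLO: the function names, the `(cls, z1, y1, x1, z2, y2, x2)` voxel-coordinate box format, the patch size, the stride, and the `min_keep` threshold are all assumptions made for the example. Converting the resulting boxes to whatever label format your training data uses is left out.

```python
import itertools
import numpy as np

def patch_starts(length, patch, stride):
    """Start indices along one axis; the last patch sits flush with the end."""
    if length <= patch:
        return [0]
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] != length - patch:
        starts.append(length - patch)
    return starts

def patchify(volume, boxes, patch_size=(64, 64, 64), stride=(32, 32, 32), min_keep=0.5):
    """Yield (origin, patch, patch_boxes) for a (Z, Y, X) volume.

    boxes: iterable of (cls, z1, y1, x1, z2, y2, x2) in voxel coordinates
    (hypothetical format for this sketch). A box is kept in a patch only if
    at least `min_keep` of its volume lies inside that patch; kept boxes are
    clipped to the patch and expressed relative to the patch origin.
    """
    boxes = np.asarray(list(boxes), dtype=float).reshape(-1, 7)
    axis_starts = [patch_starts(n, p, s)
                   for n, p, s in zip(volume.shape, patch_size, stride)]
    for z0, y0, x0 in itertools.product(*axis_starts):
        patch = volume[z0:z0 + patch_size[0],
                       y0:y0 + patch_size[1],
                       x0:x0 + patch_size[2]]
        origin = np.array([z0, y0, x0], dtype=float)
        extent = np.array(patch.shape, dtype=float)
        kept = []
        for row in boxes:
            lo = row[1:4] - origin            # box corners relative to the patch
            hi = row[4:7] - origin
            clipped_lo = np.clip(lo, 0, extent)
            clipped_hi = np.clip(hi, 0, extent)
            full = np.prod(hi - lo)
            inside = np.prod(np.maximum(clipped_hi - clipped_lo, 0))
            if full > 0 and inside / full >= min_keep:
                kept.append((int(row[0]), *clipped_lo.tolist(), *clipped_hi.tolist()))
        yield (z0, y0, x0), patch, kept
```

Overlapping patches (stride smaller than patch size) make it less likely that an object is cut in half in every patch that contains it. Whether to keep patches that contain no labels is a separate choice that this sketch does not make for you.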

The biggest issue with trying to modify the network itself is that a lot of the code the model interfaces with, like the anchor and NMS code and even the Detect module itself, is hard to follow. I don't think I would bother trying to modify the architecture (for the amount of effort it would take, you'd be better off designing one from scratch and publishing a paper on that), but the model hyperparameters should be relatively straightforward to modify in the model yaml files and training configuration files.

What I have tried to do is encapsulate some of the more common, non-architectural code that people might want to modify. If you want to change how data is processed before being sent to the model (for example, by windowing it), you can modify the functions called by this dataset method, and if you want to change the way images are normalized, you can implement custom normalization. The dataset also labels the primary place where you would implement new augmentations (though YOLOv5 has that too), so it's more obvious where to change the augmentation routines.
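For reference, windowing followed by normalization can be as small as the sketch below. The function name and the default center and width are illustrative assumptions, not values or functions taken from MedYOLO, and the sketch assumes the volume is stored in Hounsfield units.

```python
import numpy as np

def apply_window(volume, center=100.0, width=700.0):
    """Clip intensities to [center - width/2, center + width/2], then scale to [0, 1].

    The default center/width are placeholders for illustration only;
    choose values appropriate for your own scans.
    """
    lo = center - width / 2.0
    hi = center + width / 2.0
    clipped = np.clip(np.asarray(volume, dtype=np.float32), lo, hi)
    return (clipped - lo) / (hi - lo)
```

A function like this would be called from wherever your data is prepared before it reaches the model, so that training and inference see identically processed images.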

One thing to watch out for is that most of YOLOv5's augmentations are done using OpenCV, and, at least back when I put MedYOLO together, OpenCV only supported PNG-like images. Reproducing those operations with a 3-D compatible method, and duplicating them on the labels (e.g. for rotation) in a way that translates the changes correctly and also runs quickly, is non-trivial.
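To show what "duplicating the operation on the labels" means in the simplest case, here is a sketch of a flip along one axis of a 3-D volume. The label layout `(cls, z_center, y_center, x_center, depth, height, width)`, normalized to [0, 1], is an assumption for this example and may not match the ordering your labels use; the function name is hypothetical as well. Flips are the easy case, since rotations additionally need interpolation of the volume and re-fitting of axis-aligned boxes.

```python
import numpy as np

def flip_volume_and_labels(volume, labels, axis):
    """Flip a (Z, Y, X) volume along `axis` (0, 1, or 2) and mirror the labels.

    labels: rows of (cls, z_center, y_center, x_center, depth, height, width),
    normalized to [0, 1] (assumed layout for this sketch).
    Only the center coordinate on the flipped axis changes; box sizes do not.
    """
    flipped = np.flip(volume, axis=axis).copy()   # copy to get a contiguous array
    out = np.array(labels, dtype=float).reshape(-1, 7)
    out[:, 1 + axis] = 1.0 - out[:, 1 + axis]
    return flipped, out
```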

I would probably focus on the parts of the framework that are easy to modify and on patchifying the data. Those should take less effort to try, and I'd at least make sure those "easy" things gave me some sign the network was detecting PE before trying anything more fundamental.

JDSobek / MedYOLO

Inquiry on Modifying MedYOLO for Pulmonary Embolism Detection #25