FrancescVP closed this 1 year ago
We should take this into consideration: we are exclusively using the segmented organ to predict injury. Other organs, and the full picture, could play an important role in this prediction, which we are currently dismissing. Once we have the evaluation framework, it would be a good idea to run an experiment to test whether this information plays a role.
@FrancescVP I will need a brief explanation of what `get_args` in the preprocessor class is exactly doing.
Some organs, such as the kidney, present two segmentations (one for the right side, another for the left). We still have to decide how to deal with this situation.
Organs can present one of the following sets of labels:

- `organ_healthy`, `organ_low`, `organ_high`
- `organ_healthy`, `organ_injury`
We will also have to adapt the trainer so that the number of classes matches the number of labels returned by the data loader.
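A minimal sketch of that adaptation, assuming the data loader can report the label names it returns (`infer_num_classes` is a hypothetical helper, not something in this PR):

```python
def infer_num_classes(label_names):
    """Number of classes = number of distinct labels for this organ.

    The trainer would call this before building the model head,
    so three-level organs get 3 outputs and binary organs get 2.
    """
    return len(set(label_names))

# Example label sets, following the two cases listed above:
three_level = ["organ_healthy", "organ_low", "organ_high"]
binary = ["organ_healthy", "organ_injury"]
```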
But amazing PR bro, really cool stuff, looking forward to you explaining it to me in more detail.
The first thing we should do is get a better understanding of the images, since this will answer these kinds of questions. We should do a proper study of what injured organs look like and how they affect their environment. It would be useful to analyze whether injured organs are larger than healthy ones; we can do this by extracting the volume of the organs from the segmentation files and then normalizing it by some healthy organ, like the liver, to extract the expected variability of the organ volume.
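A quick sketch of the volume extraction, assuming integer label masks and known voxel spacing (function names and the spacing default are illustrative):

```python
import numpy as np

def organ_volume_ml(mask, voxel_spacing_mm=(1.0, 1.0, 1.0), label=1):
    """Volume of one segmented organ in millilitres.

    mask: integer label volume from the segmentation file.
    voxel_spacing_mm: per-axis voxel size in mm.
    """
    voxel_ml = np.prod(voxel_spacing_mm) / 1000.0  # mm^3 -> ml
    return float(np.count_nonzero(mask == label) * voxel_ml)

def normalized_volume(organ_mask, liver_mask, spacing=(1.0, 1.0, 1.0)):
    """Organ volume relative to the liver, to factor out patient size."""
    return organ_volume_ml(organ_mask, spacing) / organ_volume_ml(liver_mask, spacing)
```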
The `get_args` function just loads the arguments from the JSON file in an efficient way when it is called multiple times. But it's true that since we only load the file once, it's a bit useless jeje :)
Not a problem: single-organ models will have only one channel (their image) while two-organ models will have two channels (both kidneys) --> (1, 112, 112, 112) vs (2, 112, 112, 112)
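The channel layout above can be sketched like this (shapes taken from the comment; the function name is illustrative):

```python
import numpy as np

def to_model_input(*organ_volumes):
    """Stack one 3D crop per channel into a (C, D, H, W) array.

    One crop -> (1, 112, 112, 112); left + right kidney -> (2, 112, 112, 112).
    """
    return np.stack(organ_volumes, axis=0)

left = np.zeros((112, 112, 112), dtype=np.float32)
right = np.zeros((112, 112, 112), dtype=np.float32)
```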
When building the model we'll specify the number of classes that it should expect. We can handle this with a dict variable.
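Something like this, where the organ names and class counts are illustrative placeholders (the values just mirror the two label sets discussed above):

```python
# Hypothetical mapping from organ to the number of output classes
# its model head should have. Entries here are examples, not the
# project's final list.
ORGAN_NUM_CLASSES = {
    "liver": 3,   # organ_healthy / organ_low / organ_high
    "kidney": 3,
    "bowel": 2,   # organ_healthy / organ_injury
}

def head_size(organ):
    """Look up how many classes the model for this organ needs."""
    return ORGAN_NUM_CLASSES[organ]
```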
This is a multi-centric study where images come from multiple sites applying different acquisition protocols, resulting in a variety of image intensities. Apart from that, some centers acquire larger images where all organs are covered, while others do not. This issue has been addressed in the following notebook.
One approach we can adopt as a first measure is, in the `dataloader`'s transformation section, to use the `IntensityNormalization` function, providing the mean HU and the std HU to normalize the images.
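The normalization itself is just a z-score with fixed dataset statistics; a minimal sketch (the `MEAN_HU`/`STD_HU` values are placeholders, to be replaced by the real dataset-level statistics):

```python
import numpy as np

MEAN_HU, STD_HU = 50.0, 150.0  # assumed dataset statistics, not measured

def normalize_intensity(volume, mean=MEAN_HU, std=STD_HU):
    """Z-score normalize a CT volume with fixed HU mean/std."""
    return (volume.astype(np.float32) - mean) / std
```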
When the preprocessing is finished, we should consider saving the images in `.npy` format instead of `.nii.gz`, since it's more computationally efficient for model training. The NIfTI format is better during preprocessing; once that is finished, it's better to migrate to another format.
Upgrading data processing by: