Query regarding preprocessing

Kaiseem / DAR-UNet

[JBHI2022] A novel 3D unsupervised domain adaptation framework for cross-modality medical image segmentation

Apache License 2.0


Query regarding preprocessing #11

Closed SumayyaInayat closed 8 months ago

SumayyaInayat commented 1 year ago

Hi Kaiseem, first of all, you have done a great job, congrats! Thank you so much for sharing such clean code; it saves a lot of time when the code is this easy to understand. My question: the README says to normalize each volume to [0,1]. Should the min and max for this normalization be taken over the whole dataset, or only over the particular volume being processed?

Please verify this.

Thanks a lot.

Kaiseem commented 1 year ago

Thank you for your interest in our work. To clarify, our min-max normalization is applied to each volume individually, because the test set may be fed in one volume at a time (we cannot assume the test set contains multiple volumes).
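
A minimal sketch of the per-volume min-max normalization described above, assuming each volume is a NumPy array. The function name `minmax_normalize` and the `eps` guard against constant volumes are illustrative assumptions, not code taken from the DAR-UNet repository.

```python
import numpy as np


def minmax_normalize(volume: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Rescale a single volume to [0, 1] using only its own min and max.

    Hypothetical helper for illustration; `eps` avoids division by zero
    when the volume is constant.
    """
    volume = volume.astype(np.float32)
    vmin, vmax = volume.min(), volume.max()
    return (volume - vmin) / max(vmax - vmin, eps)
```

Because the statistics come from the volume itself, the function behaves the same whether one volume or many are available.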

SumayyaInayat commented 1 year ago

Thanks for your response! I am new to the field, so just trying to understand.

Shouldn't the normalization use the statistics of the whole dataset? With per-volume normalization, the maximum of a single volume, which can be lower than the maximum of the whole dataset, is mapped to the same value as the dataset maximum. Is that correct?

Kaiseem commented 1 year ago

What you said is correct in some situations. For natural images (RGB, with a value range of 0-255), we can use the mean and standard deviation of the entire dataset to normalize both the training and test sets because their value range is relatively well-defined. However, for medical images (grayscale, with a value range of 0-65535), the value range of each volume can vary significantly. Some volumes may have a maximum value of 1300, while others may have a maximum value of 700, depending on the parameters set by the doctor during the imaging process.

Therefore, I suggest performing volume-wise normalization for medical images. Additionally, I recommend using percentile-based normalization for each volume (i.e., vmax = np.percentile(volume.flatten(), 99.8)) to avoid issues with outliers.
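
A rough sketch of the percentile-based variant, built around the `np.percentile(volume.flatten(), 99.8)` call quoted above. Only the upper bound comes from the thread; using the volume minimum as the lower bound, clipping before rescaling, and the names `percentile_normalize`, `upper_pct`, and `eps` are assumptions made for illustration.

```python
import numpy as np


def percentile_normalize(
    volume: np.ndarray, upper_pct: float = 99.8, eps: float = 1e-8
) -> np.ndarray:
    """Rescale one volume to [0, 1], capping intensities at a high percentile.

    Hypothetical helper: the lower bound (volume minimum) and the clipping
    step are assumptions, not details confirmed in the thread.
    """
    volume = volume.astype(np.float32)
    vmin = volume.min()
    # Upper bound taken from the percentile, so a few very bright voxels
    # do not compress the rest of the intensity range.
    vmax = np.percentile(volume.flatten(), upper_pct)
    volume = np.clip(volume, vmin, vmax)
    return (volume - vmin) / max(vmax - vmin, eps)
```

With this version, voxels brighter than the chosen percentile are all mapped to 1, while the remaining intensities still span most of the [0, 1] range.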

SumayyaInayat commented 1 year ago

Thanks a lot, Kaiseem! Now I get it!

SumayyaInayat commented 1 year ago

Hi Kaiseem, hope you are doing well. I just saw that you have provided a data preparation file for the abdominal dataset. Could you please provide one for the CrossMoDA data as well? I would be very thankful.

One more question: is the normalization described above also valid for training data, where there are many volumes rather than just one as in the test case?

Thanks!

Kaiseem commented 1 year ago

Hi, I have updated the preparation code, which is consistent with the text description in the preprocessing instructions.

Also, volume-wise normalization should work for both training and test data. It is a fairly general trick, so feel free to use it.