JJGO / UniverSeg

UniverSeg: Universal Medical Image Segmentation
Apache License 2.0

[Question] Data Preprocessing Protocol - Mid slice, axes extraction, and aspect ratio change for training and evaluation #28

Closed xk-huang closed 2 months ago

xk-huang commented 2 months ago

Thanks for open-sourcing the code!

I am trying to reimplement UniverSeg, but I have some questions about data preprocessing.

1) Mid slice

In the supplementary material, I found that only mid-slices are used for training and evaluation. However, in some datasets (e.g., BTCV), the mid-slice is not optimal because it misses some organs.

Here is an example of a mid-slice vs. a later slice:

[Images: mid-slice, latter-slice]

How would this protocol choice affect the evaluation?

2) Three-axis extraction

In the paper, mid-slices from all 3 axes are extracted for training and evaluation. But some slices from the x-z and y-z planes could be distorted. Do you still compute the metrics on them?

Here are some sample mid-slices for the x-z and y-z planes from BTCV:

[Image: `x-z` and `y-z` planes]

3) Aspect ratio change

Specifically, the resolution of BTCV is (512,512,147), which means the x-z or y-z plane images are resized from (512,147) to (128,128). This raises another question: how does the resize distortion affect the final performance?

Thanks in advance for your time and reply!

VictorButoi commented 2 months ago

Greetings! To provide answers to your three questions:

  1. It's true that the midslice of the volume could potentially miss some organs, and that perhaps one could consider evaluating on all slices of the 3D volume. We only chose to do midslices because we didn't want to bias our evaluation.

  2. This comes down to an issue in the voxel scaling of the original NIfTI files. We resample the NIfTIs using their header information so that the voxel spacing is (1,1,1) for both the image and label volume.

  3. Before we resize our images/volumes, we first pad them to be square to avoid distortions. Thus, for your BTCV example, we would first pad the volume to be (512,512,512) and then resize it to (128,128,128).
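For readers trying to reproduce this, the steps in answers 2 and 3 can be sketched roughly as follows. This is a minimal NumPy-only illustration, not the UniverSeg preprocessing code. The function names (`resample_nearest`, `to_isotropic`, `pad_to_cube`, `mid_slices`, `preprocess`) are hypothetical. The sketch assumes the voxel spacing has already been read from the NIfTI header, that padding is zero-valued and symmetric, and that nearest-neighbour resampling is acceptable. The actual pipeline may use a different interpolation, padding value, or library.

```python
import numpy as np


def resample_nearest(vol, new_shape):
    """Nearest-neighbour resize of a 3D array to new_shape (assumed interpolation)."""
    idx = [
        # Map each output voxel centre back to a source index along this axis.
        np.minimum((np.arange(n) + 0.5) * (s / n), s - 1).astype(int)
        for s, n in zip(vol.shape, new_shape)
    ]
    return vol[np.ix_(*idx)]


def to_isotropic(vol, spacing):
    """Resample so the voxel spacing becomes (1, 1, 1); spacing is per-axis, from the header."""
    new_shape = tuple(max(1, int(round(s * sp))) for s, sp in zip(vol.shape, spacing))
    return resample_nearest(vol, new_shape)


def pad_to_cube(vol, fill=0):
    """Pad symmetrically (assumed) so all three axes match the longest one."""
    side = max(vol.shape)
    pads = [((side - s) // 2, side - s - (side - s) // 2) for s in vol.shape]
    return np.pad(vol, pads, mode="constant", constant_values=fill)


def mid_slices(vol):
    """Return the middle 2D slice along each of the three axes."""
    return [np.take(vol, vol.shape[ax] // 2, axis=ax) for ax in range(3)]


def preprocess(vol, spacing, size=128):
    """Isotropic resampling, then pad to a cube, then resize, then take mid-slices."""
    vol = to_isotropic(vol, spacing)
    vol = pad_to_cube(vol)
    vol = resample_nearest(vol, (size, size, size))
    return mid_slices(vol)


# Small stand-in with the same aspect ratio problem as a (512, 512, 147) volume.
demo = np.zeros((64, 64, 18), dtype=np.uint8)
slices = preprocess(demo, spacing=(1.0, 1.0, 1.0), size=16)
```

Because the padding happens before the resize, every axis is scaled by the same factor, so the x-z and y-z mid-slices keep their original proportions instead of being stretched. The same transform should be applied to the image and label volumes so they stay aligned.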

xk-huang commented 2 months ago

Thanks again for the speedy reply!