yhygao / UTNet

Official implementation of UTNet: A Hybrid Transformer Architecture for Medical Image Segmentation
MIT License

data processing #1

Closed xiaoiker closed 2 years ago

xiaoiker commented 2 years ago

Hi,

I come here from your comments on nnFormer (https://github.com/282857341/nnFormer/issues/18). I found that there is not a big difference between your method and nnFormer, so I'd like to learn more from your project.

But you did not provide much detail about the data processing. Does that mean the preprocessing is very simple? Also, you said that you tested on the ACDC dataset. Could I ask what kind of data processing you used there? The preprocessing provided there seems quite complicated to me, and they also mentioned that their work cannot be applied to 2D input.

Best, Wei

yhygao commented 2 years ago

Hi Wei,

Thank you for your interest in our work.

We follow the resampling setting of nnUNet. As our method is a 2D method, the z-axis spacing is left unchanged. The xy spacing is resampled to the median spacing of the training set. For ACDC, all other preprocessing is the same as what we did for the M&M dataset; please see dataset_domain.py.
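
For reference, a minimal sketch of that kind of in-plane resampling (this is only an illustration with SimpleITK, not the actual code in dataset_domain.py; the helper name and interpolator choices are my own assumptions):

```python
# Sketch: resample the xy spacing of a volume to a target spacing while
# leaving the z spacing unchanged (nnUNet-style 2D resampling).
import SimpleITK as sitk

def resample_xy(img, target_xy_spacing, is_label=False):
    """img: sitk.Image with spacing ordered (x, y, z)."""
    old_spacing = img.GetSpacing()
    old_size = img.GetSize()
    new_spacing = (target_xy_spacing[0], target_xy_spacing[1], old_spacing[2])
    new_size = [int(round(old_size[i] * old_spacing[i] / new_spacing[i])) for i in range(3)]

    resampler = sitk.ResampleImageFilter()
    resampler.SetOutputSpacing(new_spacing)
    resampler.SetSize(new_size)
    resampler.SetOutputOrigin(img.GetOrigin())
    resampler.SetOutputDirection(img.GetDirection())
    # nearest neighbour for label maps, linear for intensity images
    resampler.SetInterpolator(sitk.sitkNearestNeighbor if is_label else sitk.sitkLinear)
    return resampler.Execute(img)

# target_xy_spacing would be the median in-plane spacing computed over the training set.
```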

Although nnFormer is a 3D method, it's not a big deal: we can still compare performance at the patient level.

Best, Yunhe

xiaoiker commented 2 years ago

Thanks for your kind reply. This helps a lot.

Best, Wei

xiaoiker commented 2 years ago

Sorry to bother again.

I am trying to run your code. It seems the current dataset from the MM challenge 2020 is different from your dataset. From data_domain.py, I learned that your dataset contains many samples, so you first get the id for each sub_folder (e.g., ), but now there is only one sample under each subfolder and the data size is (25, 13, 256, 216). So does this 25 mean 25 slices? And therefore z, y, x = 13, 256, 216 should be the ordering in your code, right? Many thanks.

yhygao commented 2 years ago

The MM dataset is different from ACDC. The data provided in MM are raw cine-MRI volumes that contain frames along time. If you take a look at the ground-truth image, you'll find that only two frames have labels, which are ES and ED. So for MM, I extract the ES and ED frames as two single .nii files. For your example, a sample of size (25, 13, 256, 256): 25 is the time axis, 13 is the z-axis, and 256 are the xy-axes. I first find the ES and ED frames along the time axis from the ground-truth label, then extract those two frames as XXX_sa_1.nii.gz and XXX_SA_2.nii.gz, with their corresponding labels as XXX_sa_gt_1.nii.gz and XXX_sa_gt_2.nii.gz.
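
A rough sketch of that extraction step (purely illustrative, not the exact script I used; the SimpleITK calls and the XXX_sa file names are placeholders following the description above):

```python
# Sketch: split a 4D MM cine-MRI volume into the two annotated frames (ED, ES)
# by finding the time points whose ground-truth label is non-empty.
import numpy as np
import SimpleITK as sitk

img = sitk.GetArrayFromImage(sitk.ReadImage('XXX_sa.nii.gz'))     # (t, z, y, x)
lab = sitk.GetArrayFromImage(sitk.ReadImage('XXX_sa_gt.nii.gz'))

# only the ED and ES frames carry segmentation labels
labeled_frames = [t for t in range(lab.shape[0]) if lab[t].any()]

for i, t in enumerate(labeled_frames, start=1):
    # a real script would also copy spacing/origin/direction from the source image
    sitk.WriteImage(sitk.GetImageFromArray(img[t]), f'XXX_sa_{i}.nii.gz')
    sitk.WriteImage(sitk.GetImageFromArray(lab[t].astype(np.uint8)), f'XXX_sa_gt_{i}.nii.gz')
```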