HKU-MedAI / GEM-3D

Generative Enhancement for 3D Medical Images
GNU General Public License v3.0

Hi, is the code for the preprocessing of the raw dataset available? #2

Closed EvelynChengG closed 4 months ago

EvelynChengG commented 5 months ago

I saw the json file, but I don't know how to use it.

[Attached image: 微信图片_20240603002559 (WeChat screenshot)]

Advocate99 commented 5 months ago

Hi, sorry for the late reply. What you downloaded is the data I have already preprocessed (this link: https://connecthkuhk-my.sharepoint.com/personal/ltzhu99_connect_hku_hk/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fltzhu99%5Fconnect%5Fhku%5Fhk%2FDocuments%2Fdata%2Ezip&parent=%2Fpersonal%2Fltzhu99%5Fconnect%5Fhku%5Fhk%2FDocuments&ga=1). After downloading, you do not need to preprocess it any further; you can go straight to training or inference. You can check the dataloader code (https://github.com/HKU-MedAI/GEM-3D/blob/main/ldm/data/volume_dataset.py): it just loads the npz files in the directory.
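To get a feel for what the preprocessed files contain before touching the dataloader, something like the sketch below may help. It is not the repo's `volume_dataset.py`; the function name `load_npz_cases` is hypothetical, and because the array keys inside each archive are not stated in this thread, the sketch reads whatever keys it finds instead of assuming any.

```python
from pathlib import Path

import numpy as np


def load_npz_cases(data_dir):
    """Load every .npz file in data_dir.

    Returns {case name: {array key: array}}. The keys are read from each
    archive, not assumed, since the thread does not say what they are.
    """
    cases = {}
    for path in sorted(Path(data_dir).glob("*.npz")):
        # np.load on an .npz returns a lazy archive; copy the arrays out
        # before the file is closed.
        with np.load(path) as archive:
            cases[path.stem] = {key: archive[key] for key in archive.files}
    return cases
```

Printing the keys and `.shape` of each array for one case is usually enough to see how the volumes are laid out.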

For the original preprocessing of the Brain dataset, I extracted one modality of the dataset and did some other naive operations, including renaming the files. Then I used the preprocess command of nnunet v2, which can be found in their repo. For the abdomen dataset, I additionally excluded the blank zones of the CT volumes by thresholding the non-zero masks. After extracting the volumes, I also used the nnunet v2 command. Those operations are messy and naive, so I did not think they were worth organizing into clean scripts. For simple use and fair comparison, I put the preprocessed datasets in the drive.
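The blank-zone removal described for the abdomen CT volumes could look roughly like the sketch below. This is an assumption about what "thresholding the non-zero masks" means (cropping to the bounding box of voxels whose absolute value exceeds a threshold), not the author's actual script; `crop_to_nonzero` and its `threshold` parameter are hypothetical names.

```python
import numpy as np


def crop_to_nonzero(volume, threshold=0):
    """Crop a volume to the bounding box of its non-blank voxels.

    A voxel counts as non-blank when |value| > threshold (assumed rule).
    If the whole volume is blank, it is returned unchanged.
    """
    mask = np.abs(volume) > threshold
    if not mask.any():
        return volume
    coords = np.argwhere(mask)      # one row of indices per non-blank voxel
    lower = coords.min(axis=0)      # first non-blank index along each axis
    upper = coords.max(axis=0) + 1  # one past the last non-blank index
    slices = tuple(slice(lo, hi) for lo, hi in zip(lower, upper))
    return volume[slices]
```

Raw CT intensities in Hounsfield units are negative for air, so a zero threshold only makes sense if the blank regions are stored as exact zeros; otherwise the threshold would need adjusting.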

For new datasets, just prepare the data and use the nnunet v2 command to get a similar structure. Then you can modify the dataloader file and run the training in a similar way.
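As a rough illustration of "prepare the data" for a new dataset: nnU-Net v2 expects a raw dataset folder with a `dataset.json` next to the images before its preprocessing command (`nnUNetv2_plan_and_preprocess`) is run. The sketch below writes a minimal one. The helper name, the dataset folder name, and the channel and label values are placeholders, not what was used for GEM-3D; check the nnU-Net v2 dataset format documentation for the full requirements.

```python
import json
from pathlib import Path


def write_dataset_json(dataset_dir, channel_names, labels, num_training,
                       file_ending=".nii.gz"):
    """Write a minimal nnU-Net v2 style dataset.json and return its path.

    channel_names: list of modality names, e.g. ["CT"] (placeholder values).
    labels: mapping of label name to integer id, with "background" as 0.
    """
    dataset_dir = Path(dataset_dir)
    dataset_dir.mkdir(parents=True, exist_ok=True)
    meta = {
        # nnU-Net v2 keys channels by their index as a string.
        "channel_names": {str(i): name for i, name in enumerate(channel_names)},
        "labels": labels,
        "numTraining": num_training,
        "file_ending": file_ending,
    }
    out_path = dataset_dir / "dataset.json"
    out_path.write_text(json.dumps(meta, indent=2))
    return out_path
```

For example, `write_dataset_json("Dataset501_Example", ["CT"], {"background": 0, "organ": 1}, num_training=3)` creates the folder if needed and writes the file into it.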