Hi, is the code for the preprocessing of the raw dataset available?

Hi, sorry for the late reply. The data you download from this link is already preprocessed: https://connecthkuhk-my.sharepoint.com/personal/ltzhu99_connect_hku_hk/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fltzhu99%5Fconnect%5Fhku%5Fhk%2FDocuments%2Fdata%2Ezip&parent=%2Fpersonal%2Fltzhu99%5Fconnect%5Fhku%5Fhk%2FDocuments&ga=1 After downloading, you do not need to preprocess anything further; you can go straight to training or inference. You can check the dataloader code (https://github.com/HKU-MedAI/GEM-3D/blob/main/ldm/data/volume_dataset.py): it just loads the npz files in the directory.
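To see what the downloaded npz files contain before touching the dataloader, a minimal sketch like the one below may help. It assumes only that the extracted archive is a directory tree of `.npz` files; the helper name `summarize_npz_dir` is hypothetical, and the array names inside each file are read from the files rather than assumed.

```python
from pathlib import Path

import numpy as np


def summarize_npz_dir(root):
    """Return {relative_path: {array_name: (shape, dtype)}} for every .npz under root."""
    summary = {}
    for path in sorted(Path(root).rglob("*.npz")):
        # NpzFile is lazy: each array is only read when it is indexed.
        with np.load(path) as archive:
            entry = {}
            for name in archive.files:
                array = archive[name]
                entry[name] = (array.shape, str(array.dtype))
        summary[str(path.relative_to(root))] = entry
    return summary
```

Calling `summarize_npz_dir("data")` on the unzipped folder (the folder name here is an assumption) lists the stored arrays with their shapes and dtypes, which is usually enough to check that a new dataset matches the expected layout.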

For the original preprocessing of the Brain dataset, I extracted one modality from the dataset and did some other trivial operations, such as renaming the files. I then ran the preprocessing command of nnU-Net v2, which can be found in their repo. For the abdomen dataset, I additionally excluded the blank zones of the CT volumes by thresholding to get the non-zero masks. After extracting the volumes, I again ran the nnU-Net v2 command. These operations are messy and simple, so I did not organize them into a clean script. For ease of use and fair comparison, I put the preprocessed datasets in the drive.
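The blank-zone removal for the abdomen volumes could look roughly like the sketch below. This is not the original script: it assumes the volume is a NumPy array whose blank regions are exactly zero, and the function name `crop_to_nonzero` and the `threshold` parameter are hypothetical.

```python
import numpy as np


def crop_to_nonzero(volume, threshold=0.0):
    """Crop a volume to the bounding box of voxels with |value| > threshold.

    Returns the cropped volume and the slices used, so the same crop can be
    applied to a matching label volume.
    """
    mask = np.abs(volume) > threshold
    if not mask.any():
        # Nothing above the threshold: keep the volume unchanged.
        return volume, tuple(slice(0, size) for size in volume.shape)
    coords = np.argwhere(mask)
    lower = coords.min(axis=0)
    upper = coords.max(axis=0) + 1  # slice ends are exclusive
    slices = tuple(slice(int(lo), int(hi)) for lo, hi in zip(lower, upper))
    return volume[slices], slices
```

If the background in the raw CT files is stored as something other than zero (for example the HU value of air), the mask condition would need to change accordingly.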

For new datasets, just prepare the data and use the nnU-Net v2 command to get a similar structure. You can then modify the dataloader file and run training in a similar way.
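As a rough sketch of preparing a new dataset for nnU-Net v2: the thread does not say which exact command or options were used, so the code below assumes the standard nnU-Net v2 raw layout (a `DatasetXXX_Name` folder with `imagesTr` files ending in `_0000` plus the file ending, and a `dataset.json`) and the `nnUNetv2_plan_and_preprocess` entry point. The helper names, the dataset ID, and the channel and label names are placeholders.

```python
import json
from pathlib import Path


def write_dataset_json(dataset_dir, channel_names, labels, file_ending=".nii.gz"):
    """Write a minimal nnU-Net v2 dataset.json and return its content."""
    dataset_dir = Path(dataset_dir)
    # Every training case has exactly one channel-0 file, so count those.
    cases = list((dataset_dir / "imagesTr").glob(f"*_0000{file_ending}"))
    meta = {
        "channel_names": {str(i): name for i, name in enumerate(channel_names)},
        "labels": labels,
        "numTraining": len(cases),
        "file_ending": file_ending,
    }
    (dataset_dir / "dataset.json").write_text(json.dumps(meta, indent=2))
    return meta


def preprocess_command(dataset_id):
    """Build the argv list for the nnU-Net v2 preprocessing step (assumed options)."""
    return ["nnUNetv2_plan_and_preprocess", "-d", str(dataset_id), "--verify_dataset_integrity"]
```

The command can then be run with `subprocess.run(preprocess_command(999), check=True)` once the `nnUNet_raw`, `nnUNet_preprocessed`, and `nnUNet_results` environment variables are set; the ID `999` is only an example.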
