DIAL-RPI / PIPO-FAN

PIPO-FAN for multi-organ segmentation on partially labeled datasets, using PyTorch
MIT License

BTCV dataset #8

Closed Huiimin5 closed 1 year ago

Huiimin5 commented 2 years ago

Hi, in your paper when introducing BTCV, you mentioned "The BTCV segmentation challenge dataset contains 47 subjects with segmentation of all abdominal organs except duodenum". But on the official website, they release 50 subjects' CT scans. Could you please specify which 3 are ignored?

Also, could you specify which 9 subjects are used for validation to obtain the results in Table II?

xifang001 commented 2 years ago

Thanks for your question! We use the same data as shown in (1). As described, the dataset contains 47 subjects from the BTCV with segmentations of all organs except the duodenum. The images and labels are accessed at https://zenodo.org/record/1169361#.YSO98iYpDCI. In this work, we use 30 of them that have labels of both kidneys. We split them into two folds, 21 for training and 9 (id: 32-40) for validation.

(1) Gibson, Eli, et al. "Automatic multi-organ segmentation on abdominal CT with dense V-networks." IEEE transactions on medical imaging 37.8 (2018): 1822-1834.

Huiimin5 commented 2 years ago

Thank you so much for your response. For dataset preparation, suppose I assign class 1 to liver, 2 to kidney, and 3 to spleen. Then on LiTS I should convert labels 1 and 2 to 1; on KiTS, labels 1 and 2 to 2; on the spleen dataset, label 1 to 3; and on BTCV, labels 2 and 3 (left and right kidney) to 2, label 1 (spleen) to 3, label 6 (liver) to 1, and the remaining labels 4-14 to 0, right? In this way I can get labels that reproduce your results here (screenshot attached), right?
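The per-dataset remapping described above can be sketched as a table-driven conversion. This is my own minimal sketch, assuming labels are loaded as NumPy arrays; the dataset keys and label values simply restate the mapping in this comment, not code from the repo.

```python
import numpy as np

# Unified target classes: 0 background, 1 liver, 2 kidney, 3 spleen.
# Source values not listed in a map fall through to background (0).
LABEL_MAPS = {
    "lits":   {1: 1, 2: 1},              # liver + liver tumor -> liver
    "kits":   {1: 2, 2: 2},              # kidney + kidney tumor -> kidney
    "spleen": {1: 3},                    # spleen -> spleen
    "btcv":   {1: 3, 2: 2, 3: 2, 6: 1},  # spleen, R/L kidney, liver
}

def remap_labels(label: np.ndarray, dataset: str) -> np.ndarray:
    """Remap a dataset-specific label volume to the unified class ids.

    Writing into a fresh array (instead of remapping in place) avoids the
    pitfall of an already-remapped value being remapped a second time.
    """
    out = np.zeros_like(label)
    for src, dst in LABEL_MAPS[dataset].items():
        out[label == src] = dst
    return out
```

For example, a BTCV slice containing values `[0, 1, 2, 3, 6, 10]` would come out as `[0, 3, 2, 2, 1, 0]`.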

Another question: I notice that in train_sf_partial.py you use a checkpoint (screenshot attached). May I know where it comes from?

xifang001 commented 2 years ago

For the first question, yes, I did the same data processing as you said. For the checkpoint question, sometimes I use a pre-trained PIPO model and then fine-tune the PIPO-FAN; I mainly use fine-tuning for liver and kidney segmentation. You can ignore the three lines of code that load the pre-trained model and train PIPO-FAN from scratch instead. Thanks!
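The choice described above (warm-start from a PIPO checkpoint, or train from scratch) can be sketched as an optional checkpoint load. This is a hypothetical helper, not the repo's actual loading code; the checkpoint path and `strict=False` tolerance for extra FAN layers are my assumptions.

```python
import os
from typing import Optional

import torch
import torch.nn as nn

def build_model(model: nn.Module, checkpoint_path: Optional[str] = None) -> nn.Module:
    """Optionally warm-start from a pre-trained PIPO checkpoint.

    If no checkpoint is given (or the file is missing), the model simply
    trains from scratch, as suggested in the reply above.
    """
    if checkpoint_path and os.path.isfile(checkpoint_path):
        state = torch.load(checkpoint_path, map_location="cpu")
        # Checkpoints are often saved as {"state_dict": ...}; fall back to a raw dict.
        if isinstance(state, dict) and "state_dict" in state:
            state = state["state_dict"]
        # strict=False tolerates layers present in PIPO-FAN but absent in PIPO.
        model.load_state_dict(state, strict=False)
    return model
```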

Huiimin5 commented 2 years ago

Thank you so much for your response.

Could you please also release segment_sf.py? I only see segment_sf_partial.py in the current repository.

xifang001 commented 2 years ago

Please refer to #6; the two are the same. Only a few lines need to be modified depending on the datasets being segmented. Thanks!

Huiimin5 commented 2 years ago

Thank you so much for your help.

I followed all data organization and preprocessing steps, but when training with BTCV only and testing on BTCV, the performance I get is 0.94835879 / 0.78651779 / 0.89205532 for liver, kidney, and spleen, which is much lower than the result you reported in Table II (screenshot attached).

I will train the model multiple times and check the average, but I see some random operations in this training pipeline that could cause variation in the final results. Did you observe large variation between different runs?
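One way to narrow the run-to-run variation mentioned above is to pin every random seed before training. This is a common PyTorch recipe, not code from this repo:

```python
import random

import numpy as np
import torch

def set_seed(seed: int = 0) -> None:
    """Seed every RNG the training pipeline may touch.

    Note: this reduces, but does not eliminate, variation — some CUDA
    kernels remain nondeterministic unless deterministic algorithms
    are explicitly forced.
    """
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

Calling `set_seed` at the top of the training script makes data shuffling and weight initialization repeatable across runs.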

xifang001 commented 2 years ago

Have you done connected-component analysis on the three segmentations? For the liver and spleen, only the largest component is preserved; for the kidney, the largest two components are preserved.
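The postprocessing described above can be sketched with `scipy.ndimage.label`. This is my own sketch, not the code from segment_sf_partial.py:

```python
import numpy as np
from scipy import ndimage

def keep_largest_components(mask: np.ndarray, n_keep: int = 1) -> np.ndarray:
    """Keep only the n_keep largest connected components of a binary mask.

    Use n_keep=1 for liver/spleen and n_keep=2 for the kidney
    (left + right), as described above.
    """
    labeled, n_comp = ndimage.label(mask)
    if n_comp <= n_keep:
        return mask.astype(mask.dtype)
    # Component sizes, indexed by component label 1..n_comp.
    sizes = ndimage.sum(mask, labeled, range(1, n_comp + 1))
    keep_ids = np.argsort(sizes)[-n_keep:] + 1  # component labels start at 1
    return np.isin(labeled, keep_ids).astype(mask.dtype)
```

Applied per organ channel after argmax, this removes spurious islands that would otherwise drag the Dice score down.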

Huiimin5 commented 2 years ago

Yes, I have. Actually, I did not change your postprocessing in segment_sf_partial.py. Here is the performance before and after postprocessing (screenshot attached).

xifang001 commented 2 years ago

Okay, have you checked the Dice for each case and visualized them? The Dice for the kidney seems very low.

Huiimin5 commented 2 years ago

Did you observe high variation in your results? I trained the same model 3 times, and here are the results I got:
[0.96183087 0.89196719 0.90897074]
[0.94835879 0.78651779 0.89205532]
[0.95781451 0.86183708 0.84043249]

It seems the kidney and spleen vary more severely. Did you get a good result after a single run, or is only the average performance good in your environment?

xifang001 commented 2 years ago

I got most of the experimental results after one run. The performance on the single BTCV dataset may vary due to its small scale, but I obtained similar performance across multiple runs as well. Have you also tried training the network on the combination of partially labeled datasets, and do you observe a similar phenomenon?

Huiimin5 commented 2 years ago

Training with LiTS + KiTS + spleen, I get:
[0.94083299 0.81786115 0.88617778]
[0.94791603 0.79610985 0.92036824]
[0.95478062 0.89075209 0.91447859]

Training with BTCV + LiTS + KiTS + spleen, I get:
[0.63565062 0.71102741 0.81192771]
[0.9505721 0.86771989 0.93720144]
and an error (screenshot attached).

xifang001 commented 2 years ago

Thanks for letting me know. I suggest you also run the ResUNet. I am not sure whether this variance was caused by the training policy (I may have used a pre-trained model at the start of training). The ResUNet is also provided by us; you can simply import that network. When available, I will also look for some saved checkpoints to rerun the evaluation.

sharonlee12 commented 7 months ago


Hello, sorry to bother you. I tried to modify the training file like this:

```python
if data_type == '1':
    label_batch[label_batch == 1] = 1  # liver
    label_batch[label_batch == 2] = 1  # liver
if data_type == '2':
    label_batch[label_batch == 1] = 2  # kidney
    label_batch[label_batch == 2] = 2  # kidney
if data_type == '3':
    label_batch[label_batch == 1] = 3  # spleen
if data_type == '4':
    label_batch[label_batch == 1] = 3  # spleen
    label_batch[label_batch == 2] = 2  # rkid + lkid
    label_batch[label_batch == 3] = 2  # rkid + lkid
    label_batch[label_batch == 4] = 0
    label_batch[label_batch == 5] = 0
    label_batch[label_batch == 6] = 1  # liver
    label_batch[label_batch > 6] = 0
```

(screenshot attached)

But while training there is an error:

```
../aten/src/ATen/native/cuda/NLLLoss2d.cu:93: nll_loss2d_forward_kernel: block: [2,0,0], thread: [384,0,0] Assertion `t >= 0 && t < n_classes` failed.
../aten/src/ATen/native/cuda/NLLLoss2d.cu:93: nll_loss2d_forward_kernel: block: [2,0,0], thread: [385,0,0] Assertion `t >= 0 && t < n_classes` failed.
../aten/src/ATen/native/cuda/NLLLoss2d.cu:93: nll_loss2d_forward_kernel: block: [2,0,0], thread: [386,0,0] Assertion `t >= 0 && t < n_classes` failed.
```

I hope to know where the problem is with my settings.
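The `t >= 0 && t < n_classes` assertion fires when some target voxel holds a class id outside `[0, n_classes)`. A CPU-side check before the loss surfaces the offending values readably. Note also that remapping in place and in sequence, as in the snippet above, can cascade: for `data_type == '4'`, label 1 is first set to 3 (spleen), and the later `label_batch[label_batch == 3] = 2` then folds those spleen voxels into the kidney class. Remapping into a fresh tensor avoids this. Both helpers below are my own suggestion, assuming a 4-class setup (background, liver, kidney, spleen):

```python
import torch

N_CLASSES = 4  # assumed: 0 background, 1 liver, 2 kidney, 3 spleen

def check_targets(label_batch: torch.Tensor, n_classes: int = N_CLASSES) -> None:
    """Raise a readable error instead of the opaque CUDA-side assertion."""
    bad = (label_batch < 0) | (label_batch >= n_classes)
    if bad.any():
        raise ValueError(
            f"targets outside [0, {n_classes}): "
            f"{torch.unique(label_batch[bad]).tolist()}"
        )

def remap_btcv(label_batch: torch.Tensor) -> torch.Tensor:
    """Order-safe BTCV remapping: read from the original, write to a copy."""
    out = torch.zeros_like(label_batch)
    for src, dst in {1: 3, 2: 2, 3: 2, 6: 1}.items():
        out[label_batch == src] = dst
    return out
```

With this version, a BTCV batch containing values `[0, 1, 2, 3, 6, 13]` maps to `[0, 3, 2, 2, 1, 0]`, and `check_targets` passes; running `check_targets` on the raw labels instead would raise, pointing at the unmapped values 6 and 13 as the likely source of the CUDA assertion.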