Closed — Huiimin5 closed this issue 1 year ago.
Thanks for your question! We use the same data as shown in (1). As described there, the dataset contains 47 subjects from BTCV with segmentations of all organs except the duodenum. The images and labels can be accessed at https://zenodo.org/record/1169361#.YSO98iYpDCI. In this work, we use the 30 of them that have labels for both kidneys. We split them into two folds: 21 for training and 9 (ids 32-40) for validation.
(1) Gibson, Eli, et al. "Automatic multi-organ segmentation on abdominal CT with dense V-networks." IEEE transactions on medical imaging 37.8 (2018): 1822-1834.
Thank you so much for your response.
Basically, for dataset preparation, suppose I assign class 1 to liver, 2 to kidney, and 3 to spleen. On LiTS, I should convert both labels 1 and 2 to 1; on KiTS, labels 1 and 2 to 2; on Spleen, label 1 to 3; and on BTCV, I should convert labels 2 and 3 (left and right kidney) to 2, label 1 (spleen) to 3, label 6 (liver) to 1, and labels 4-14 to 0, right? In this way I can get labels that can reproduce your results here:
right?
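The remapping described above can be sketched as follows (a minimal numpy sketch; the mapping tables are my reading of the description in this thread, not the repository's code):

```python
import numpy as np

# Unified label space: 0 background, 1 liver, 2 kidney, 3 spleen.
# One mapping per dataset; labels not listed are set to 0.
MAPPINGS = {
    "lits":   {1: 1, 2: 1},              # liver + tumor -> liver
    "kits":   {1: 2, 2: 2},              # kidney + tumor -> kidney
    "spleen": {1: 3},                    # spleen -> spleen
    "btcv":   {1: 3, 2: 2, 3: 2, 6: 1},  # spleen, r/l kidney, liver
}

def harmonize(label, dataset):
    """Remap a label volume into the unified label space."""
    mapping = MAPPINGS[dataset]
    out = np.zeros_like(label)
    for src, dst in mapping.items():
        out[label == src] = dst
    return out
```

Building the output in a fresh zero array avoids the pitfall of chained in-place assignments, where a value written by one remapping step can be picked up and rewritten by a later one.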
Another question is that I notice in train_sf_partial.py you use a checkpoint:
May I know where it comes from?
For the first question: yes, I did the same data processing as you described. For the checkpoint question: sometimes I use a pre-trained PIPO model and then fine-tune PIPO-FAN. I mainly use fine-tuning for liver and kidney segmentation. You can ignore the three lines of code that load the pre-trained model and instead train PIPO-FAN from scratch. Thanks!
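The optional warm-start described above could look something like this (a minimal PyTorch sketch; the function name and checkpoint handling are illustrative, not the repository's code):

```python
import torch

def build_model(model, checkpoint_path=None):
    """Optionally warm-start from a pre-trained checkpoint; otherwise train from scratch."""
    if checkpoint_path is not None:
        state = torch.load(checkpoint_path, map_location="cpu")
        # strict=False tolerates layers that differ between the pre-trained
        # PIPO model and the PIPO-FAN being fine-tuned.
        model.load_state_dict(state, strict=False)
    return model
```

Passing `checkpoint_path=None` corresponds to training from scratch, as suggested above.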
Thank you so much for your response.
Could you please also release segment_sf.py? I only see segment_sf_partial.py in the current repository.
Please refer to #6; the two are the same. Only a few lines need to be modified depending on the segmented datasets. Thanks!
Thank you so much for your help.
I followed all the data organization and preprocessing steps, but when training with BTCV only and testing on BTCV, the performance I get is
0.94835879 0.78651779 0.89205532 for liver, kidney, and spleen,
which is much lower than the result you reported in Table II.
I will train the model multiple times and check the average, but I see some random operations in this training pipeline that could cause variation in the final results. Did you observe large variations between different runs?
Have you done connected component analysis for the three segmentations? For the liver and spleen, the largest component is preserved. For the kidney, the largest two components are preserved.
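The postprocessing described above (largest component for liver and spleen, two largest for kidney) can be sketched like this (a generic scipy sketch under the unified labels 1 liver / 2 kidney / 3 spleen; not the repository's code):

```python
import numpy as np
from scipy import ndimage

def keep_largest_components(mask, n_components):
    """Keep only the n largest connected components of a binary mask."""
    labeled, num = ndimage.label(mask)
    if num <= n_components:
        return mask
    sizes = ndimage.sum(mask, labeled, range(1, num + 1))
    keep = np.argsort(sizes)[-n_components:] + 1  # component ids to keep
    return np.isin(labeled, keep)

def postprocess(seg):
    """Liver (1) and spleen (3): largest component; kidney (2): two largest."""
    out = np.zeros_like(seg)
    for organ, n in ((1, 1), (2, 2), (3, 1)):
        out[keep_largest_components(seg == organ, n)] = organ
    return out
```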
Yes, I have. Actually, I did not change your postprocessing in segment_sf_partial.py.
Here is the performance before and after postprocessing:
Okay, have you checked the Dice for each case and visualized them? The Dice of the kidney seems very low.
Did you observe high variation in your results? I trained the same model 3 times and here are the results I get:
[0.96183087 0.89196719 0.90897074]
[0.94835879 0.78651779 0.89205532]
[0.95781451 0.86183708 0.84043249]
It seems the kidney and spleen vary more severely. Did you get a good result after one run, or is the average performance good in your environment?
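For reference, the spread across those three runs works out to (a quick numpy check on the numbers above):

```python
import numpy as np

# Dice per run; columns: liver, kidney, spleen (the three runs reported above).
runs = np.array([
    [0.96183087, 0.89196719, 0.90897074],
    [0.94835879, 0.78651779, 0.89205532],
    [0.95781451, 0.86183708, 0.84043249],
])
mean = runs.mean(axis=0)  # liver ~0.956, kidney ~0.847, spleen ~0.880
std = runs.std(axis=0)
print("mean:", np.round(mean, 4))
print("std: ", np.round(std, 4))
```

The standard deviation for kidney and spleen is nearly an order of magnitude larger than for liver, which matches the observation that those two organs vary more between runs.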
I got most of the experiments after one run. The performance on the single BTCV dataset may vary due to its small scale, but I obtained similar performance across multiple runs too. Have you also tried training the network on the combination of partially labeled datasets and observed a similar phenomenon?
Training with LiTS + KiTS + Spleen, I get:
0.94083299 0.81786115 0.88617778
0.94791603 0.79610985 0.92036824
0.95478062 0.89075209 0.91447859

Training with BTCV + LiTS + KiTS + Spleen, I get:
0.63565062 0.71102741 0.81192771
0.9505721 0.86771989 0.93720144
and an error:
Thanks for letting me know. I suggest you also run the ResUNet. I am not sure whether this variance was caused by the training policy (I may have used a pre-trained model at the start of training). The ResUNet is also provided by us; you can simply import that network. When available, I will also look for some saved checkpoints to rerun the evaluation.
Hello, sorry to bother you. I tried to modify the training file like this:
```python
if data_type == '1':
    label_batch[label_batch == 1] = 1  # liver
    label_batch[label_batch == 2] = 1  # liver
if data_type == '2':
    label_batch[label_batch == 1] = 2  # kidney
    label_batch[label_batch == 2] = 2  # kidney
if data_type == '3':
    label_batch[label_batch == 1] = 3  # spleen
if data_type == '4':
    label_batch[label_batch == 1] = 3  # spleen
    label_batch[label_batch == 2] = 2  # rkid + lkid
    label_batch[label_batch == 3] = 2  # rkid + lkid
    label_batch[label_batch == 4] = 0
    label_batch[label_batch == 5] = 0
    label_batch[label_batch == 6] = 1  # liver
    label_batch[label_batch > 6] = 0
```
But while training there is an error:

```
../aten/src/ATen/native/cuda/NLLLoss2d.cu:93: nll_loss2d_forward_kernel: block: [2,0,0], thread: [384,0,0] Assertion `t >= 0 && t < n_classes` failed.
../aten/src/ATen/native/cuda/NLLLoss2d.cu:93: nll_loss2d_forward_kernel: block: [2,0,0], thread: [385,0,0] Assertion `t >= 0 && t < n_classes` failed.
../aten/src/ATen/native/cuda/NLLLoss2d.cu:93: nll_loss2d_forward_kernel: block: [2,0,0], thread: [386,0,0] Assertion `t >= 0 && t < n_classes` failed.
```
I hope to find out where the problem in my settings is.
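That CUDA assertion fires when some target value fed to the loss is outside `[0, n_classes)`, so one generic way to localize it is to check the label range on the CPU right before the loss call, where the error message is readable (a debugging sketch, not the repository's code; a mismatch such as `data_type` not matching any branch would leave raw labels unmapped and trigger exactly this check):

```python
import torch

def check_labels(label_batch, n_classes):
    """CUDA NLLLoss asserts 0 <= t < n_classes; verify on CPU for a readable error."""
    lo, hi = int(label_batch.min()), int(label_batch.max())
    assert 0 <= lo and hi < n_classes, (
        f"labels out of range [{lo}, {hi}] for n_classes={n_classes}; "
        "some values may have escaped the remapping"
    )
```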
Hi, in your paper when introducing BTCV, you mentioned "The BTCV segmentation challenge dataset contains 47 subjects with segmentation of all abdominal organs except duodenum". But on the official website, they release 50 subjects' CT scans. Could you please specify which 3 are ignored?
Also, could you specify which 9 are used for validation to obtain the results in Table II?