lphilomena closed this issue 3 years ago
Hello,
Many thanks for the questions. If your results dropped significantly, I can think of two possible reasons: 1) The model has not been trained to convergence. The ViT-based model requires more training iterations. 2) The results reported in Table 1 of our paper were obtained with an image size of 224. I would suggest using the original image size. In our ablation, TransUNet gains a 6.8% DSC improvement when using the original image size of 512, and TransUNet-512 still outperforms AttnUNet-512 by 3% DSC.
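When comparing DSC numbers across image sizes or datasets, it helps to check that the metric is computed the same way on both sides. The snippet below is only a minimal sketch of a per-class Dice score, assuming integer label maps with 0 as background. It is not the evaluation script from the repository, and the function name `dice_per_class` is made up for illustration.

```python
import numpy as np

def dice_per_class(pred, target, num_classes):
    """Dice similarity coefficient for each foreground class.

    Assumes `pred` and `target` are integer label maps of the same
    shape, with label 0 reserved for background (an assumption of
    this sketch, not something taken from the repository).
    """
    scores = {}
    for c in range(1, num_classes):
        p = (pred == c)
        t = (target == c)
        denom = p.sum() + t.sum()
        if denom == 0:
            # Class absent from both maps: DSC is undefined here.
            scores[c] = float("nan")
            continue
        scores[c] = 2.0 * np.logical_and(p, t).sum() / denom
    return scores

# Toy example with two foreground classes.
pred = np.array([[0, 1, 1],
                 [2, 2, 0]])
target = np.array([[0, 1, 0],
                   [2, 2, 2]])
scores = dice_per_class(pred, target, num_classes=3)
print(scores)
```

If predictions are made at 224 and the labels are at 512, resizing the prediction back to the label resolution before scoring (with nearest-neighbor interpolation for label maps) keeps the comparison consistent.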
Of course, I would be happy to help you locate the problem. I can try running your data, time permitting.
Thanks for your work! I ran the code on our own data for multi-organ segmentation, and the results are much worse than those in your paper. May I send you our preprocessed data to run for a fair comparison?