Beckschen / 3D-TransUNet

This is the official repository for the paper "3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers"
Apache License 2.0

Inquiry Regarding Reproduction of 3D-TransUNet Results on Brats-MET Dataset #12

Open bravo-hq opened 10 months ago

bravo-hq commented 10 months ago

Dear Authors of the 3D-TransUNet Model,

I am reaching out to seek guidance on reproducing the results presented in your recent publication, specifically those related to the Brats-MET dataset. In my work, I have used your model in my computational pipeline. The configuration parameters for the model were set as follows:

Model Configuration: I have focused on reproducing results for the encoder-only version of your model. The model's argument structure and the default values utilized are illustrated in the attached images.

[Images: model's config, default values for the model]

Loss Function: The model employs a combined soft Dice and BCE loss, weighted equally at 0.5 each. This loss is computed individually for the three segmentation classes (WT, TC, ET) and then averaged: total_loss = (loss_wt + loss_tc + loss_et) / 3.0, where loss_{class_name} = 0.5 * Dice_loss + 0.5 * BCE_loss.
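For concreteness, here is a minimal sketch of how I implement this loss (assuming PyTorch, sigmoid outputs, a float target of the same shape, and channel order WT/TC/ET; the helper names are my own, not from the 3D-TransUNet code):

```python
import torch
import torch.nn.functional as F

def soft_dice_loss(logits, target, eps=1e-5):
    # Soft Dice on one binary channel; expects raw logits and a float target.
    probs = torch.sigmoid(logits)
    dims = (1, 2, 3)  # spatial dims of a (B, D, H, W) tensor
    intersection = (probs * target).sum(dims)
    denom = probs.sum(dims) + target.sum(dims)
    dice = (2.0 * intersection + eps) / (denom + eps)
    return 1.0 - dice.mean()

def region_loss(logits, target):
    # 0.5 * Dice + 0.5 * BCE for a single region channel.
    return 0.5 * soft_dice_loss(logits, target) + \
           0.5 * F.binary_cross_entropy_with_logits(logits, target)

def brats_loss(logits, target):
    # logits, target: (B, 3, D, H, W), channel order WT, TC, ET.
    loss_wt = region_loss(logits[:, 0], target[:, 0])
    loss_tc = region_loss(logits[:, 1], target[:, 1])
    loss_et = region_loss(logits[:, 2], target[:, 2])
    return (loss_wt + loss_tc + loss_et) / 3.0
```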

Dataset and Augmentation: The Brats-MET dataset has been partitioned into 70% for training, 10% for validation, and 20% for testing. I adhered to the crop size of [128,128,128] and applied the same augmentation techniques as mentioned in your study for the training dataset. Additionally, I employed MONAI's sliding window approach with a 0.5 overlap for validation and testing.
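For validation and testing, the sliding-window call looks roughly like this (model and image are placeholders; I am using monai.inferers.sliding_window_inference with the training crop size as the window):

```python
import torch
from monai.inferers import sliding_window_inference

# model: the trained network in eval mode
# image: one case as a (1, 4, D, H, W) tensor (four BraTS modalities)
with torch.no_grad():
    logits = sliding_window_inference(
        inputs=image,
        roi_size=(128, 128, 128),  # same crop size as during training
        sw_batch_size=1,
        predictor=model,
        overlap=0.5,               # the 0.5 overlap mentioned above
    )
    pred = (torch.sigmoid(logits) > 0.5).float()  # binary WT/TC/ET maps
```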

Optimizer and Training: The AdamW optimizer was used, initialized with a learning rate of 3e-4 and employing the following scheduler:

[Image: scheduler details]

The model was trained for 300 epochs, incorporating deep supervision with the weight calculation depicted here:

[Image: deep supervision weights]

Issue with Deep Supervision: I encountered a challenge wherein training the model without deep supervision resulted in an 'out of index' error for the seg_outputs variable in the forward method.
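My understanding of the deep supervision weighting follows the nnU-Net convention, i.e. halve the weight at each coarser resolution, drop the coarsest output, and normalize; this is an assumption on my side, so please correct me if 3D-TransUNet computes the weights differently:

```python
import numpy as np

def deep_supervision_weights(num_outputs: int) -> np.ndarray:
    # Halve the weight at each coarser output, drop the coarsest,
    # then normalize so the weights sum to 1.
    weights = np.array([1.0 / (2 ** i) for i in range(num_outputs)])
    weights[-1] = 0.0  # the lowest-resolution output is not supervised
    return weights / weights.sum()

# e.g. 4 outputs -> [0.571, 0.286, 0.143, 0.0]
# total loss = sum(w_i * loss(seg_outputs[i], target_downsampled_to_scale_i))
print(deep_supervision_weights(4))
```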

Metrics: For performance evaluation, I utilized the metrics provided in the following repository, as recommended by the Challenge organization:

BraTS 2023 Metrics Repository

Despite adhering closely to the methodology outlined in your paper, the lesion-based Dice metrics obtained were significantly lower than expected:

Dice for ET: 15.34
Dice for TC: 17.28
Dice for WT: 15.90

I would greatly appreciate your insights or suggestions on what might be causing this discrepancy. Is there a particular aspect of the model configuration or training process that I might be overlooking or implementing incorrectly? Your guidance in this matter would be invaluable to my research.
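For transparency, this is roughly how I understand the lesion-wise scoring to behave; it is a simplified sketch using scipy connected components, not the official implementation (which, as far as I understand, additionally dilates the ground-truth lesions and ignores very small components), and the numbers above come from the official code:

```python
import numpy as np
from scipy import ndimage

def lesionwise_dice(pred: np.ndarray, gt: np.ndarray) -> float:
    # Simplified lesion-wise Dice for one binary region (e.g. ET).
    # Each ground-truth lesion is scored against the predicted components
    # that touch it; predicted components touching no lesion count as
    # false positives with Dice 0.
    gt_lab, n_gt = ndimage.label(gt)
    pr_lab, n_pr = ndimage.label(pred)
    if n_gt == 0 and n_pr == 0:
        return 1.0

    scores, matched_pred = [], set()
    for i in range(1, n_gt + 1):
        lesion = gt_lab == i
        ids = set(np.unique(pr_lab[lesion])) - {0}  # overlapping predictions
        matched_pred |= ids
        pred_union = np.isin(pr_lab, list(ids)) if ids else np.zeros_like(lesion)
        inter = np.logical_and(lesion, pred_union).sum()
        scores.append(2.0 * inter / (lesion.sum() + pred_union.sum() + 1e-8))

    # unmatched predicted components are false-positive lesions
    scores.extend([0.0] * (n_pr - len(matched_pred)))
    return float(np.mean(scores))
```

Since every unmatched predicted component contributes a Dice of 0, a model that produces many small false-positive lesions is penalized far more heavily than under ordinary volume-wise Dice, which may partly explain scores in this range.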

Thank you for your time and assistance.

Best regards, Yousef Sadegheih

xhl-video commented 9 months ago

Hi Sadegheih,

Thanks for your interest, and my apologies for the delayed response. I'm curious whether your training loss and validation Dice look normal. I didn't see any obvious differences in the model configs, so I suspect the performance might be affected by data preprocessing. Could you provide details on how you handled the training and test data? That information would help me assess whether there are any disparities.

bravo-hq commented 9 months ago

Dear @xhl-video,

I'm sorry for not getting back to you sooner. Regarding data preprocessing, we did not employ any special techniques beyond partitioning the dataset (we normalized the data per channel and followed the nnU-Net augmentation). Specifically, we organized the data from the 238 patients so that the first 20% constitutes the test set, the next 10% the validation set, and the remaining 70% the training set. You can view the details of this split in the diagram at the following link: Dataset Separation Diagram
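Concretely, the normalization and split look roughly like this (a sketch with illustrative names; I am assuming simple per-channel z-score normalization here, and patients stands for the ordered list of 238 case IDs):

```python
import numpy as np

def zscore_per_channel(volume: np.ndarray) -> np.ndarray:
    # volume: (4, D, H, W) array with one channel per MRI modality.
    out = np.empty_like(volume, dtype=np.float32)
    for c in range(volume.shape[0]):
        channel = volume[c].astype(np.float32)
        out[c] = (channel - channel.mean()) / (channel.std() + 1e-8)
    return out

def sequential_split(patients):
    # patients: the 238 case IDs in their original order.
    # First 20% -> test, next 10% -> validation, remaining 70% -> training.
    n = len(patients)
    n_test, n_val = int(0.2 * n), int(0.1 * n)
    return (patients[:n_test],
            patients[n_test:n_test + n_val],
            patients[n_test + n_val:])
```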

Please do not hesitate to reach out if you need any more details or clarification. Also, I am open to forking and sharing my pipeline with you if you think it would be helpful.

Kind regards,

Yousef Sadegheih

Liyanglzjt commented 7 months ago

Dear blogger, I am a deep learning enthusiast. I haven't debugged the code released by the author, but I saw that you have. Could you share a copy of the complete code with me? My email is 3034964892@qq.com. Thank you so much!
