Dootmaan / MT-UNet

Official Code for *Mixed Transformer UNet for Medical Image Segmentation*
MIT License

Why not add an MLP? #20

Closed: Chenguang-Wang closed this issue 2 years ago

Chenguang-Wang commented 2 years ago

Why not add an MLP after computing the various attentions?

Chenguang-Wang commented 2 years ago

I have another question. Since my environment is different from yours, I would like to know how many GPUs you used and how long the training took. Thanks!

Dootmaan commented 2 years ago

Hi @Chenguang-Wang and thank you for your question.

The experiments were conducted on a single GTX 1080Ti (11 GB). Training takes about 3-5 hours on ACDC but much longer on Synapse (maybe 8-12 hours or more; I'm not so sure right now).

Chenguang-Wang commented 2 years ago

@Dootmaan Thank you for your answer. Why not add an MLP after computing the various attentions?

Dootmaan commented 2 years ago

If you are interested in applying an MLP after the original concatenation, maybe you could try doing it in the MLP-Mixer way. By performing token mixing followed by channel mixing, the local attention maps can be properly mixed at a lower computational cost. However, in our paper, since MTM is already a global-wise operation (just like an MLP), we thought it might not be necessary to additionally use an MLP layer.
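For illustration, here is a minimal sketch of such a Mixer-style block in PyTorch. It is not taken from the MT-UNet codebase; the names `MixerBlock`, `num_tokens`, `dim`, and the hidden widths are hypothetical:

```python
# Minimal MLP-Mixer-style block: a token-mixing MLP across the sequence
# dimension, then a channel-mixing MLP per token. Hypothetical sketch,
# not part of this repository.
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    def __init__(self, num_tokens, dim, token_hidden=256, channel_hidden=512):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        # Token mixing: MLP applied across the token (spatial) dimension.
        self.token_mlp = nn.Sequential(
            nn.Linear(num_tokens, token_hidden),
            nn.GELU(),
            nn.Linear(token_hidden, num_tokens),
        )
        self.norm2 = nn.LayerNorm(dim)
        # Channel mixing: MLP applied independently to each token.
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, channel_hidden),
            nn.GELU(),
            nn.Linear(channel_hidden, dim),
        )

    def forward(self, x):
        # x: (batch, num_tokens, dim), e.g. the concatenated attention output.
        y = self.norm1(x).transpose(1, 2)           # (batch, dim, num_tokens)
        x = x + self.token_mlp(y).transpose(1, 2)   # mix information across tokens
        x = x + self.channel_mlp(self.norm2(x))     # mix information across channels
        return x

# Example: a 14x14 feature map flattened to 196 tokens of width 64.
block = MixerBlock(num_tokens=196, dim=64)
out = block(torch.randn(2, 196, 64))  # -> (2, 196, 64)
```

The token-mixing MLP costs O(N x hidden) per channel rather than the O(N^2) of full self-attention, which is where the savings would come from.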

Chenguang-Wang commented 2 years ago

OK, thank you for your answer.

Chenguang-Wang commented 2 years ago

May I ask, is the ACDC data split the same as the one used by TransUnet and SwinUnet?

Dootmaan commented 2 years ago

> May I ask, is the ACDC data split the same as the one used by TransUnet and SwinUnet?

Yes, we used the same split for both TransUnet and Swin-Unet, which is why we had to rerun all the experiments on ACDC. As far as we know, Swin-Unet itself uses a different split on ACDC, since the authors of TransUnet didn't provide their preprocessed ACDC dataset.
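As a hypothetical illustration of keeping such splits comparable (the function name, patient IDs, and counts below are made up, not the actual preprocessing used for MT-UNet), a deterministic patient-level split could look like:

```python
# Hypothetical sketch: a fixed-seed, patient-level split so that every
# compared model trains and tests on exactly the same ACDC patients.
# Not the actual split script used in this repository.
import random

def patient_level_split(patient_ids, num_test, seed=0):
    rng = random.Random(seed)   # fixed seed -> the same split every run
    ids = sorted(patient_ids)   # sort first so input order does not matter
    rng.shuffle(ids)
    return ids[num_test:], ids[:num_test]  # (train_ids, test_ids)

# Example with 100 made-up patient IDs, holding out 20 for testing.
train_ids, test_ids = patient_level_split(
    [f"patient{i:03d}" for i in range(1, 101)], num_test=20
)
```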

Dootmaan commented 2 years ago

This issue is closed since there has been no further activity for a while.