MIC-DKFZ / nnUNet


Cross Modality Learning for MR segmentation #399

Closed · MengzhangLI closed 3 years ago

MengzhangLI commented 3 years ago

Guten Tag, Fabian. Long time no see. ;)

May I ask you some questions about MR segmentation with different modalities?

In the Task01 Brain Tumour / BraTS dataset, there are four modalities: T1, T2, FLAIR, and T1ce. T1ce is very important for tumour segmentation because its contrast medium makes enhancing/non-enhancing tumour much clearer. However, some challenges remain: (1) contrast media are expensive and can be harmful; (2) patients need to undergo another scan after T1/T2.

Right now I'm interested in MR segmentation with fewer modalities (T1 and T2 only), without T1ce. From the scans we can see that although the tumour region in T1 and T2 is not as clear as in T1ce, the shape and texture of the tumour can still be found if we look carefully.

Do you think this topic is worth investigating? To be honest, I don't like several of the existing attempts based on semi-supervised learning: for example, using a GAN to generate pseudo-T1ce images, or transfer learning for domain adaptation. It might be more important to learn the relations between T1/T2 and T1ce (or other modalities requiring contrast media) directly.

If a certain cross-modality representation learning method works, I think it could be very useful in MR segmentation/classification tasks because contrast media are so widely used. Unfortunately, so far I haven't found any related or useful paper on this topic. Have you read any papers addressing cross-modality tasks?

BTW, I just read your paper on robust MR semantic segmentation on arXiv (https://arxiv.org/pdf/2011.07592.pdf). It is great. ;)

Best,

Mengzhang LI

MengzhangLI commented 3 years ago

Dear Fabian,

I noticed your excellent work was officially accepted by Nature Methods yesterday! Your great framework totally deserves it.

Congrats!

Best,

Mengzhang Li

FabianIsensee commented 3 years ago

Dear Mengzhang,

apologies for the late reply. We are all super happy that the paper finally got accepted :-) What a ride!

Regarding your question about the modalities, I would say that the easiest way to get the answer you are looking for is to just try training the BraTS models without the T1ce input sequence. This should be very easy to set up, and you would know in about 2-3 days ;-) My recommendation would be to use Task082, not Task001. Task001 has pretty bad ground-truth quality.
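A possible way to set this up (a sketch, not nnU-Net's official tooling): copy the raw task folder while dropping the T1ce channel and re-indexing the remaining ones. The task names, the `_0002` suffix for T1ce, and the modality order are assumptions based on the MSD-style layout, so check your `dataset.json` first:

```python
# Sketch under assumptions: build a three-modality copy of a BraTS-style
# nnU-Net task by dropping the T1ce channel. Task names, the T1ce channel
# suffix, and the modality order are assumptions -- verify via dataset.json.
import json
import shutil
from pathlib import Path

SRC = Path("nnUNet_raw_data/Task082_BraTS2020")          # hypothetical source task
DST = Path("nnUNet_raw_data/Task182_BraTS2020_noT1ce")   # hypothetical new task
DROP = "_0002"                                           # assumed T1ce channel suffix
KEEP = {"_0000": "_0000", "_0001": "_0001", "_0003": "_0002"}  # re-index the rest

for sub in ("imagesTr", "imagesTs"):
    if not (SRC / sub).is_dir():
        continue
    (DST / sub).mkdir(parents=True, exist_ok=True)
    for f in sorted((SRC / sub).glob("*.nii.gz")):
        suffix = f.name[-12:-7]  # "_0002" out of "..._0002.nii.gz"
        if suffix == DROP:
            continue  # drop the T1ce channel entirely
        shutil.copy(f, DST / sub / (f.name[:-12] + KEEP[suffix] + ".nii.gz"))

shutil.copytree(SRC / "labelsTr", DST / "labelsTr", dirs_exist_ok=True)

# Rewrite dataset.json so nnU-Net expects only three input channels.
ds = json.loads((SRC / "dataset.json").read_text())
m = ds["modality"]
ds["modality"] = {"0": m["0"], "1": m["1"], "2": m["3"]}  # key "2" assumed to be T1ce
DST.mkdir(parents=True, exist_ok=True)
(DST / "dataset.json").write_text(json.dumps(ds, indent=2))
```

After that, the usual `nnUNet_plan_and_preprocess` / training commands should work on the new task as with any other dataset.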

I fully agree with you that GAN-based approaches to generating missing modalities are a bit sketchy: you never have any guarantees that the model will actually produce the desired output, and (in contrast to segmentation methods) you have no way of double-checking as a human. The problem is that this limitation also applies to your endeavour: if you omit the T1ce sequence, there is no way of knowing that your model will produce the desired (= correct) output at inference time. You would need to run an extremely rigorous evaluation to convince reviewers, and you are even less likely to convince clinicians. I am not saying that you should not try this, but I am rather sceptical about it.

Best,

Fabian

MengzhangLI commented 3 years ago

Hi, Fabian. Thanks for your reply.

The problem is interesting and deserves some research: over the past two weeks, I ran several experiments on Task01 Brain Tumour (validation Dice with nnU-Net 3d_fullres: T1+T2+FLAIR ~0.7 vs. T1+T2+FLAIR+T1ce ~0.8). At the very least, the gap between three and four modalities is obvious.
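For reference, the Dice score used above is the standard overlap metric, 2|P ∩ G| / (|P| + |G|); a minimal NumPy sketch for a single binary mask (the function name is mine, not nnU-Net's evaluator):

```python
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice = 2*|pred AND gt| / (|pred| + |gt|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    # Convention: two empty masks count as a perfect match.
    return 2.0 * np.logical_and(pred, gt).sum() / denom if denom > 0 else 1.0
```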

Right now, my idea is to give the neural network certain prior knowledge: i.e., if the network experiences T1ce together with the other modalities during training, it may extract correlations between the different modalities that improve its performance at inference time, when only T1/T2/FLAIR are provided. One way to realise this is sketched below.
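A minimal sketch of this idea as "modality dropout", assuming a PyTorch-style training loop outside nnU-Net's own pipeline; the channel index and drop probability are placeholders, not nnU-Net defaults:

```python
import torch

T1CE_CHANNEL = 2  # hypothetical position of T1ce in the (B, C, X, Y, Z) stack
P_DROP = 0.5      # hypothetical probability of hiding T1ce for a sample

def modality_dropout(batch: torch.Tensor) -> torch.Tensor:
    """Zero out the T1ce channel of randomly chosen samples in the batch."""
    drop = torch.rand(batch.shape[0], device=batch.device) < P_DROP
    batch = batch.clone()            # do not modify the input in place
    batch[drop, T1CE_CHANNEL] = 0.0  # hidden samples see zeros instead of T1ce
    return batch

# At inference without T1ce, feed zeros in that channel so the input matches
# what the network saw for the "hidden" training samples.
```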

If T1ce is available during the training procedure, would that remove part of your doubt about its clinical value and actual performance? (BTW, a rigorous evaluation is a must.)

Best,

Mengzhang LI

FabianIsensee commented 3 years ago

Dear @MengzhangLI, from my understanding it does not matter how you approach the problem. Whether you generate the missing modality or train a segmentation network to predict a class it cannot see: in both cases you are expecting the network to 'see' something that is not there. Both approaches carry the same risk and will probably never get close to the performance and reliability of a network that has access to the T1ce image.

> If T1ce is available during the training procedure, would that remove part of your doubt about its clinical value and actual performance? (BTW, a rigorous evaluation is a must.)

I am not sure how that would work, because you are still confronted with the above-mentioned problem at inference time.

Best,

Fabian

MengzhangLI commented 3 years ago

Dear Fabian,

Thanks for your reply. Right now I'm running some experiments, such as (1) randomly zeroing out one of the sequences (T1, T2, FLAIR, T1ce) and (2) MixUp/MixMatch across different sequences; a sketch of the MixUp variant is below. I will check carefully during the procedure.
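For reference, a minimal sketch of the MixUp experiment, assuming standard MixUp (Zhang et al.) on a batch of multi-modal inputs with one-hot labels; mixing across sequences of the same case would be a variant of this, and the alpha value is a placeholder:

```python
import torch

def mixup(x: torch.Tensor, y_onehot: torch.Tensor, alpha: float = 0.4):
    """x: (B, C, X, Y, Z) images, y_onehot: (B, K, X, Y, Z) labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.shape[0], device=x.device)
    # Mix images and labels with the same coefficient, as in the original MixUp.
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix
```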

Like you, I will not be surprised if nnU-Net cannot see something unseen at inference time, even if it was provided during training.