MrGiovanni / ModelsGenesis

[MICCAI 2019 Young Scientist Award] [MEDIA 2020 Best Paper Award] Models Genesis

Some problems of the paper #2

Closed · argman closed this issue 5 years ago

argman commented 5 years ago

Thanks for the great job. But from my side, I have several questions:

  1. The paper doesn't compare against a simple auto-encoder trained without your transformations, so maybe an auto-encoder can get comparable results?
  2. The proposed transformations could also be applied as data augmentation during training, so how does that perform compared with the pre-trained model?
MrGiovanni commented 5 years ago
  1. A simple auto-encoder does not offer comparable performance; it performs similarly to, or even worse than, random initialization in the target tasks. Since V-Net, an encoder-decoder architecture with skip connections in between, was adopted in this work, the skip connections may reduce the auto-encoder task to a trivial solution during image restoration (the first sketch after this list illustrates the effect). Therefore, although we had the results of auto-encoder pre-training, we did not report or discuss them in the paper.

  2. In practice, unlimited input-output pairs can be generated during pre-training because (1) the input cubes are randomly extracted from patient CT scans, with both random location and random size, and (2) the transformations involve large-scale randomness in their parameters (the second sketch after this list illustrates the idea). So we do not see the need for additional data augmentation to increase training-sample variety; instead, we focus more on designing more meaningful image deformations for this self-supervised learning framework.
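
A minimal sketch of the skip-connection point from item 1, assuming PyTorch; the module and layer sizes are illustrative toys, not the repo's actual V-Net:

```python
import torch
import torch.nn as nn

class TinySkipAE(nn.Module):
    """Toy encoder-decoder with one outermost skip connection (illustrative).
    The skip hands the raw input straight to the output head, so plain
    reconstruction (input == target) can be solved by copying through the
    skip, learning nothing useful. Deforming the input first, as in Models
    Genesis, removes that shortcut."""
    def __init__(self, ch=1):
        super().__init__()
        self.encode = nn.Conv3d(ch, 8, 3, padding=1)
        self.decode = nn.Conv3d(8, 8, 3, padding=1)
        # The output head sees the decoder features AND the untouched input.
        self.head = nn.Conv3d(8 + ch, ch, 1)

    def forward(self, x):
        h = self.decode(torch.relu(self.encode(x)))
        return self.head(torch.cat([h, x], dim=1))  # identity is trivially available
```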
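
And a minimal sketch of the on-the-fly pair generation from item 2, using NumPy only; the function names and the gamma-based intensity transform are illustrative stand-ins for the paper's transformations:

```python
import numpy as np

def random_cube(volume, min_size=32, max_size=64):
    """Crop a sub-cube of random size at a random location (illustrative)."""
    size = np.random.randint(min_size, max_size + 1)
    x, y, z = (np.random.randint(0, s - size + 1) for s in volume.shape)
    return volume[x:x + size, y:y + size, z:z + size]

def random_intensity_transform(cube):
    """A randomly parameterized monotonic intensity mapping (illustrative;
    the paper's non-linear transformation is Bezier-curve based)."""
    gamma = np.random.uniform(0.5, 2.0)  # fresh parameters every call
    return np.clip(cube, 0.0, 1.0) ** gamma

def make_pair(volume):
    """Each call yields a new (deformed input, restoration target) pair."""
    target = random_cube(volume)            # random location and size
    source = random_intensity_transform(target)
    return source, target

# Usage: an effectively unlimited stream of training pairs from one scan.
scan = np.random.rand(128, 128, 128).astype(np.float32)  # stand-in CT volume
x, y = make_pair(scan)
```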

Thank you for the questions.

Zongwei

argman commented 5 years ago

Regarding 2, I think I didn't make it clear: I mean the transformations could be added as data augmentation when training the model from scratch, not during pre-training.

MrGiovanni commented 5 years ago

Yes, for the various target tasks, any data augmentation can be considered. In our paper, we adopted the most common strategies (i.e., translation, rotation, scaling, etc.) as data augmentation for the target tasks, but we did not apply the four transformations proposed in the paper. I think you raised a good point: introducing the same image transformations as in the pre-training task may indeed increase target-task performance. Thank you!
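
A minimal sketch of what that could look like; everything here (the function name, the gamma-based transform, the commented loop) is illustrative, not the paper's protocol:

```python
import numpy as np

def augment_from_scratch(image, label, p=0.5):
    """Apply a Genesis-style deformation to the input only, with probability p.
    Illustrative: an intensity transform leaves a segmentation label valid;
    spatial deformations would have to be applied to the label as well."""
    if np.random.rand() < p:
        gamma = np.random.uniform(0.5, 2.0)         # random parameters per sample
        image = np.clip(image, 0.0, 1.0) ** gamma   # deform the input only
    return image, label

# In a hypothetical from-scratch training loop:
# for image, label in loader:
#     image, label = augment_from_scratch(image, label)
#     loss = criterion(model(image), label)
```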