A simple auto-encoder does not offer comparable performance; it is similar to, or even worse than, random initialization on the target tasks. Since V-Net, an encoder-decoder architecture with skip connections in between, was adopted in this work, the skip connections may make the auto-encoder task trivial during image restoration. Therefore, although we had the results of auto-encoder pre-training, we did not report or discuss them in the paper.
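To illustrate the point about skip connections, here is a toy sketch (not the actual V-Net used in the paper): when the raw input is carried over to the decoder by a skip connection, the output head can learn a near-identity shortcut, so plain autoencoding (input equals target) becomes trivial without learning useful features.

```python
import torch
import torch.nn as nn

class ToySkipAE(nn.Module):
    """Toy encoder-decoder with one skip connection (not V-Net itself).

    Because the raw input is concatenated onto the decoder features,
    the final 1x1x1 convolution can simply copy the input channel,
    so reconstructing the input requires no meaningful representation.
    """
    def __init__(self, ch=1, feat=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(ch, feat, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(feat, feat, 2, stride=2), nn.ReLU(),
        )
        # Head sees decoder features plus the raw input (the skip).
        self.head = nn.Conv3d(feat + ch, ch, kernel_size=1)

    def forward(self, x):
        z = self.encoder(x)
        d = self.decoder(z)
        return self.head(torch.cat([d, x], dim=1))  # skip connection

x = torch.randn(2, 1, 32, 32, 32)
print(ToySkipAE()(x).shape)  # torch.Size([2, 1, 32, 32, 32])
```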
In practice, unlimited input-output pairs can be generated during pre-training because (1) the input cubes are cropped from patient CT scans at random locations and with random sizes, and (2) the transformations involve a large degree of randomness in their parameters. So we do not see the need to add more data augmentation just for training-sample variety; instead, we focus more on how to design more meaningful image deformations for this self-supervised learning framework.
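A rough sketch of what this means in practice is below. The function names and the noise-based deformation are placeholders, not the actual transformations from the paper; the point is only that random crop location/size plus randomly sampled deformation parameters give an effectively unlimited stream of (deformed input, original target) pairs for restoration-style pre-training.

```python
import numpy as np

def random_cube(volume, min_size=32, max_size=64):
    """Crop a sub-volume at a random location with a random size."""
    size = np.random.randint(min_size, max_size + 1)
    x, y, z = [np.random.randint(0, s - size + 1) for s in volume.shape]
    return volume[x:x + size, y:y + size, z:z + size]

def random_deformation(cube, noise_std=0.1):
    """Placeholder deformation with randomly sampled parameters.

    Stands in for the paper's transformations; here it is just
    additive Gaussian noise whose strength is drawn at random.
    """
    std = np.random.uniform(0, noise_std)
    return cube + np.random.normal(0, std, size=cube.shape)

def make_pair(volume):
    """One (deformed input, original target) pair for pre-training."""
    target = random_cube(volume)
    return random_deformation(target), target

ct_scan = np.random.rand(128, 128, 128)  # stand-in for a real CT volume
x, y = make_pair(ct_scan)
```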
Thank you for the questions.
Zongwei
Regarding 2, I think I didn't make it clear: I mean the transformations could be added as data augmentation when training the model from scratch, not during pre-training.
Yes, for the various target tasks, any data augmentation can be considered. In our paper, we adopted the most common strategies (i.e., translation, rotation, scaling, etc.) as the data augmentation for the target tasks, yet we did not apply the four transformations proposed in the paper. I think you raised a good point: introducing the same image transformations as in the pre-training task may indeed improve target-task performance. Thank you!
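For completeness, a hedged sketch of how such an augmentation hook could look when training from scratch (the function name and the noise-based deformation are stand-ins for the paper's transformations; a segmentation-style label with the same spatial shape as the image is assumed):

```python
import numpy as np

def train_time_augment(image, label, p_deform=0.5):
    """Hypothetical augmentation hook for target-task training from scratch.

    Applies a common geometric augmentation, then optionally a
    pre-training-style appearance deformation (placeholder below).
    """
    # Common augmentation: random flip along one spatial axis,
    # applied to image and label together so they stay aligned.
    if np.random.rand() < 0.5:
        axis = np.random.randint(image.ndim)
        image = np.flip(image, axis=axis).copy()
        label = np.flip(label, axis=axis).copy()

    # Optional appearance deformation; only the image is changed,
    # since it does not move structures relative to the label.
    if np.random.rand() < p_deform:
        image = image + np.random.normal(0, 0.05, size=image.shape)

    return image, label
```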
Thanks for the great job. But from my side, I have several questions.