microsoft / Recursive-Cascaded-Networks

[ICCV 2019] Recursive Cascaded Networks for Unsupervised Medical Image Registration
https://arxiv.org/abs/1907.12353
MIT License

some questions about the paper #10

Closed zhang-qiang-github closed 4 years ago

zhang-qiang-github commented 4 years ago

After reading the paper, I have some questions. I would appreciate it if you could spend some time on them:

  1. What do you mean by "shared-weight cascading"? In my understanding, 'shared-weight cascading' would mean that the weights of all the subnetworks are the same and are trained jointly. However, the paper says: "The reason we do not use shared-weight cascading in training is that shared-weight cascades consume extra GPU memory ....", so the weights of the different subnetworks are different? This really confuses me.

  2. I am confused about the difference between VTN and this paper. They both use a cascading method.

zsyzzsoft commented 4 years ago
  1. In the paper, we said that "Those cascades may learn different network parameters on each, since one cascade is allowed to learn a part of measurements or perform some type of alignment specifically." Shared-weight cascading means that, during testing, we repeat each trained cascade, reusing its weights (see the sketch after this list).

  2. The main contribution of this paper is the general concept of the unsupervised learning of deep cascades and progressive alignments.

A closer look at the paper would be helpful.
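For concreteness, here is a minimal sketch of test-time shared-weight cascading under stated assumptions: the names (`cascades` as the list of trained subnetworks, a 2D `warp` helper) are hypothetical, and the actual repository is written in TensorFlow and operates on 3D volumes. Because no gradients are kept at test time, repeating cascades costs no extra GPU memory, unlike during training.

```python
import torch
import torch.nn.functional as F

def warp(image, flow):
    """Warp `image` (N, C, H, W) by a dense displacement field `flow` (N, 2, H, W)."""
    n, _, h, w = image.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=image.device),
        torch.arange(w, device=image.device),
        indexing="ij",
    )
    gx = 2 * (xs + flow[:, 0]) / (w - 1) - 1   # normalize x coords to [-1, 1]
    gy = 2 * (ys + flow[:, 1]) / (h - 1) - 1   # normalize y coords to [-1, 1]
    return F.grid_sample(image, torch.stack((gx, gy), dim=-1), align_corners=True)

def shared_weight_cascade(moving, fixed, cascades, repeats=2):
    """Apply each trained cascade `repeats` times, reusing its trained weights."""
    warped = moving
    with torch.no_grad():                      # test time only: no gradients,
        for cascade in cascades:               # hence no extra GPU memory
            for _ in range(repeats):
                flow = cascade(warped, fixed)  # predict a residual deformation
                warped = warp(warped, flow)    # progressively align to `fixed`
    return warped
```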

zhang-qiang-github commented 4 years ago

Could you please give more details about the difference between this paper and VTN? The VTN network also uses cascades for registration.

zsyzzsoft commented 4 years ago

"DLIR [19] and VTN [37] also stack their networks, though both limited to a small number of cascades. DLIR trains each cascade one by one, i.e., after fixing the weights of previous cascades. VTN jointly trains the cascades, while all successively warped images are measured by the similarity compared to the fixed image. Neither training method allows intermediate cascades to progressively register a pair of images. Those noncooperative cascades learn their own objectives regardless of the existence of others, and thus further improvement can hardly be achieved even if more cascades are conducted. They may realize that network cascading possibly solves this problem, but there is no effective way of training deep network cascades for progressive alignments." "The difference between our architecture and existing cascading methods is that each of our cascades commonly takes the current warped image and the fixed image as inputs (in contrast to [30, 45]) and the similarity is only measured on the final warped image (in contrast to [19, 37]), enabling all cascades to learn progressive alignments cooperatively." Refer to the paper please.