cwmok / C2FViT

This is the official PyTorch implementation of "Affine Medical Image Registration with Coarse-to-Fine Vision Transformer" (CVPR 2022), written by Tony C. W. Mok and Albert C. S. Chung.
MIT License

affine_para #20

Open havecats opened 5 months ago

havecats commented 5 months ago

Thank you for your excellent work! While I have noticed some previous discussions, I still have a few questions.

I observed that during the model evaluation phase, you use the affine transformation parameters from the final stage output (affine_para_list[-1]). However, the affine_para parameters from the three stages are not cumulatively multiplied to obtain the final parameters. Could this result in a discrepancy between affine_para_list[-1] and the corresponding transformation for the final warpped_x_list[-1]? Have you attempted to accumulate the affine_para parameters? My understanding might be incorrect.

For my personal skeletal dataset, I aim to perform affine transformation registration. Due to the abundance of tissue, I first remove the areas outside the skeleton based on labels before registration. In your view, is this approach reasonable in deep learning? Would this still be considered unsupervised learning? Additionally, would it be more reasonable to adopt a semi-supervised strategy (including the label's DSC in the loss function)?

Thank you again for your outstanding work and detailed response.

cwmok commented 5 months ago

Hi @havecats,

> Could this result in a discrepancy between affine_para_list[-1] and the corresponding transformation for the final warpped_x_list[-1]?

Good question. That is why we added skip connections between the stages. We believe the features passed through the skip connection can compensate for this discrepancy, because they are computed in the previous stage and therefore encode both the fixed image (F) and the moving image (M).

> Have you attempted to accumulate the affine_para parameters?

Yes, we did. We tried both summing the entries of affine_para_list (as in LapIRN) and composing them, but neither result was as good as using only the final-stage entry of affine_para_list.
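For readers unfamiliar with the difference between composing and summing, here is a minimal NumPy sketch. It assumes each stage's output can be written as a 3x4 affine matrix acting on points; the helper names (`to_homogeneous`, `compose_affines`) and the example matrices are hypothetical and are not taken from the C2FViT code.

```python
import numpy as np

def to_homogeneous(affine_3x4):
    """Append the row [0, 0, 0, 1] to turn a 3x4 affine into a 4x4 matrix."""
    a = np.asarray(affine_3x4, dtype=float)
    return np.vstack([a, [0.0, 0.0, 0.0, 1.0]])

def compose_affines(affines_3x4):
    """Compose affines that act on points, applied in list order."""
    total = np.eye(4)
    for a in affines_3x4:
        # A later stage is applied after the earlier ones, so it
        # multiplies on the left.
        total = to_homogeneous(a) @ total
    return total[:3, :]

# Hypothetical per-stage outputs: a scaling followed by a translation.
scale = np.hstack([2.0 * np.eye(3), np.zeros((3, 1))])
shift = np.hstack([np.eye(3), np.array([[1.0], [0.0], [0.0]])])

composed = compose_affines([scale, shift])
summed = scale + shift  # element-wise sum, shown only for comparison

point = np.array([1.0, 1.0, 1.0, 1.0])
mapped_composed = composed @ point
mapped_summed = summed @ point
```

In this toy case the composed and summed matrices map the same point to different locations, so the two accumulation strategies are not interchangeable. If the matrices describe sampling grids rather than point mappings (as with `torch.nn.functional.affine_grid`), the multiplication order may need to be reversed.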

> Due to the abundance of tissue, I first remove the areas outside the skeleton based on labels before registration.

The critical question is how this label was generated. If it was produced manually or by a supervised model, it may not be appropriate to call the approach unsupervised learning.

> In your view, is this approach reasonable in deep learning?

Yes, it is reasonable. In fact, masking out irrelevant content during registration is common practice in the registration literature. See this for example.
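As a rough illustration of masking, the sketch below zeroes out voxels outside a binary mask and evaluates a similarity term only inside it. The functions `apply_mask` and `masked_mse` are hypothetical helpers, and mean squared error is used only as a simple stand-in for whatever similarity measure is actually being optimised.

```python
import numpy as np

def apply_mask(volume, mask, background=0.0):
    """Keep voxels where mask is nonzero; set the rest to `background`."""
    volume = np.asarray(volume, dtype=float)
    keep = np.asarray(mask) > 0
    return np.where(keep, volume, background)

def masked_mse(fixed, warped, mask):
    """Mean squared error computed only over voxels inside the mask."""
    keep = np.asarray(mask) > 0
    if not keep.any():
        raise ValueError("mask is empty")
    diff = np.asarray(fixed, dtype=float) - np.asarray(warped, dtype=float)
    return float(np.mean(diff[keep] ** 2))

# Toy 2x2x2 volume; the mask keeps only the first slab.
fixed = np.arange(8, dtype=float).reshape(2, 2, 2)
mask = np.zeros((2, 2, 2))
mask[0] = 1

masked_fixed = apply_mask(fixed, mask)

# A "warped" volume that differs from `fixed` only outside the mask.
warped = fixed.copy()
warped[1] += 10.0
inside_error = masked_mse(fixed, warped, mask)
```

Because the two volumes differ only outside the mask, the masked error ignores the difference entirely, which is the intended effect when the surrounding tissue is irrelevant to the alignment.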

> Additionally, would it be more reasonable to adopt a semi-supervised strategy (including the label's DSC in the loss function)?

If registration accuracy is your priority and the target labels are available, I think using them is a good idea. This is one of the major advantages of learning-based methods over traditional image registration methods.
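One way such a semi-supervised objective might look is sketched below: a soft Dice score between the warped and fixed label maps, added to an existing similarity loss. The names `soft_dice` and `semi_supervised_loss` and the weighting parameter `dice_weight` are assumptions made for illustration, not part of the C2FViT code, and NumPy is used here only to keep the example short.

```python
import numpy as np

def soft_dice(pred, target, eps=1e-6):
    """Soft Dice score between two label maps with values in [0, 1]."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    intersection = np.sum(pred * target)
    return float((2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps))

def semi_supervised_loss(similarity_loss, warped_label, fixed_label, dice_weight=1.0):
    """Add a weighted Dice loss (1 - Dice) to an image similarity loss."""
    dice_loss = 1.0 - soft_dice(warped_label, fixed_label)
    return similarity_loss + dice_weight * dice_loss

# Toy 1-D label maps that overlap in one of their two foreground voxels.
warped_label = np.array([1.0, 1.0, 0.0, 0.0])
fixed_label = np.array([0.0, 1.0, 1.0, 0.0])

dice = soft_dice(warped_label, fixed_label)
loss = semi_supervised_loss(0.2, warped_label, fixed_label, dice_weight=0.5)
```

The weight on the Dice term is a tuning choice; a larger value pushes the network to align the labelled structures more strongly, at the possible expense of the intensity-based similarity term.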