yewzijian / RegTR

End-to-end Point Cloud Correspondences with Transformers
MIT License

About the influence of the weak data augmentation #4

Closed by qinzheng93 2 years ago

qinzheng93 commented 2 years ago

Thanks for the great work. I notice that RegTR adopts a much weaker data augmentation than that commonly used in [1, 2, 3]. How does this affect the convergence of RegTR? And does the weak augmentation affect robustness to large transformation perturbations? Thank you.

[1] Bai, X., Luo, Z., Zhou, L., Fu, H., Quan, L., & Tai, C. L. (2020). D3Feat: Joint learning of dense detection and description of 3D local features. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 6359-6367).

[2] Huang, S., Gojcic, Z., Usvyatsov, M., Wieser, A., & Schindler, K. (2021). PREDATOR: Registration of 3D point clouds with low overlap. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 4267-4276).

[3] Yu, H., Li, F., Saleh, M., Busam, B., & Ilic, S. (2021). CoFiNet: Reliable coarse-to-fine correspondences for robust point cloud registration. Advances in Neural Information Processing Systems, 34, 23872-23884.
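To make the question concrete, the difference between the "standard" and "weak" augmentation regimes essentially comes down to the magnitude of the random rotation applied to the training pairs. The sketch below contrasts the two; the specific angle limits are illustrative assumptions, not the exact values used by RegTR or by [1-3]:

```python
import numpy as np

def random_rotation(max_angle_rad, rng):
    """Sample a rotation about a random axis, with angle in [0, max_angle_rad],
    via Rodrigues' rotation formula."""
    axis = rng.standard_normal(3)
    axis /= np.linalg.norm(axis)
    angle = rng.uniform(0.0, max_angle_rad)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

rng = np.random.default_rng(0)
points = rng.random((1000, 3))

# "Standard" augmentation as in [1-3]: arbitrary rotations over the full range.
strong_aug = points @ random_rotation(2.0 * np.pi, rng).T

# Weaker augmentation: only small perturbations (45 deg chosen for illustration).
weak_aug = points @ random_rotation(np.deg2rad(45.0), rng).T
```

A model trained only under the weak regime never sees large relative rotations, which is exactly the robustness concern raised above.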

yewzijian commented 2 years ago

Hi @qinzheng93, your observation is correct. I used a smaller augmentation because the network considers the point positions (through the positional encoding in the attention layers). The network can use this information to infer certain properties that are not accessible to local-only descriptors, e.g. that a region might be empty because it is occluded by points nearer to the camera.
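For readers unfamiliar with how point positions enter the attention layers: a common approach is a sinusoidal encoding of the 3D coordinates, analogous to the 1D positional encoding in the original Transformer. The sketch below is a generic illustration of this idea, not RegTR's exact implementation; the dimension and frequency scale are assumptions:

```python
import numpy as np

def sinusoidal_pos_encoding(xyz, d_model=96, scale=1.0):
    """Embed 3D coordinates into a d_model-dim vector using sin/cos at a
    geometric ladder of frequencies. d_model must be divisible by 6
    (3 coordinates x {sin, cos} x n_freqs)."""
    assert d_model % 6 == 0
    n_freqs = d_model // 6
    freqs = scale * (2.0 ** np.arange(n_freqs))        # (n_freqs,)
    angles = xyz[:, :, None] * freqs[None, None, :]    # (N, 3, n_freqs)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(xyz.shape[0], -1)               # (N, d_model)
```

Because this embedding is a function of absolute position, the attention layers can reason about spatial layout (e.g. empty, occluded regions), which purely local descriptors cannot.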

I recently performed some tests to understand our trained model's behavior under larger transformation perturbations. On the 3DMatch test set, compared to e.g. PREDATOR, our model performs better on pairs with smaller transformation differences, but worse on pairs with large transformation differences. This may be a result of the weaker augmentation, but it may also be due to training data bias (there are more training pairs with small transformations).

qinzheng93 commented 2 years ago

@yewzijian Thanks for your patient reply. That's interesting. How does RegTR perform under the standard augmentation?

yewzijian commented 2 years ago

Hi @qinzheng93,

On ModelNet we do use the standard (larger) augmentations and obtain good results, but that is also because the train/test conditions are similar. For 3DMatch, I do not have a definite answer since I have not tried it, but I suspect it might perform worse.

qinzheng93 commented 2 years ago

@yewzijian Thanks again, and I look forward to your future work!