XuyangBai / D3Feat

[TensorFlow] Official implementation of CVPR'20 oral paper - D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features https://arxiv.org/abs/2003.03164
MIT License
259 stars · 38 forks

Training without GT transformation #32

Closed · kidpaul94 closed this 3 years ago

kidpaul94 commented 3 years ago

Hello again. Is it possible to train D3Feat without GT transformation data, the way PPF-FoldNet does?

XuyangBai commented 3 years ago

Hi, one possible way is to use a transformed point cloud as a pseudo point cloud pair (e.g. given a point cloud A, you can add augmentations such as rotation, translation, Gaussian noise, occlusion, etc. to get another point cloud B, and then use A & B to generate the point-to-point correspondences for training). In my experience, though, the results are heavily affected by the augmentation strategy and are generally worse than with real point cloud pairs.
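
Below is a minimal sketch of that pseudo-pair idea in NumPy/SciPy; it is not part of the D3Feat code, and the function name, augmentation ranges, and correspondence threshold are illustrative assumptions:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.transform import Rotation


def make_pseudo_pair(points, noise_std=0.005, occlusion_ratio=0.1, corr_dist=0.03):
    """Turn one point cloud (N, 3) into a pseudo pair with a known GT pose."""
    # Random rigid transformation (rotation + translation).
    R = Rotation.random().as_matrix()
    t = np.random.uniform(-0.5, 0.5, size=3)
    points_b = points @ R.T + t

    # Gaussian noise on the transformed copy.
    points_b += np.random.normal(scale=noise_std, size=points_b.shape)

    # Crude "occlusion": drop a random subset of points.
    keep = np.ones(len(points_b), dtype=bool)
    drop = np.random.choice(len(points_b), int(occlusion_ratio * len(points_b)), replace=False)
    keep[drop] = False
    points_b = points_b[keep]

    # Point-to-point correspondences: nearest neighbor of each transformed
    # source point in B, kept only if it lies within corr_dist.
    dist, idx = cKDTree(points_b).query(points @ R.T + t, k=1)
    mask = dist < corr_dist
    corr = np.stack([np.nonzero(mask)[0], idx[mask]], axis=1)
    return points_b, (R, t), corr
```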

kidpaul94 commented 3 years ago

Thank you for your response. If that's the case, I should probably write a small program so that I can manually find the transformation from another point cloud to the reference point cloud relatively easily. I presume that the dataset folder will then just contain pairs of pc.ply and pose.npy. Roughly how much training data does this network need for a single class? (I just know PointNet does not require that much, unlike Mask R-CNN or YOLO.) Excuse this newbie trying to collect ideas about this specific type of neural network. Again, thank you a lot!

XuyangBai commented 3 years ago

If that's the case, I should probably write a small program so that I can manually find the transformation from another point cloud to the reference point cloud relatively easily. I presume that the dataset folder will then just contain pairs of pc.ply and pose.npy.

Yes. You may also look at some direct registration papers (e.g. DCP and RPM-Net; they focus on registering small objects from ModelNet40, and if I remember correctly their training data is generated simply by augmenting the point clouds, so it is quite similar to your situation). You can change the dataset class to: 1. read a point cloud, 2. sample an augmentation and apply it to the point cloud to get the transformed copy, 3. compute the point-to-point correspondences. This way you do not need to save pose.npy, because the poses are generated on the fly, and you can also play with different augmentation strategies without regenerating the data every time.
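
A rough sketch of that on-the-fly scheme (steps 1-3 above) might look like the following. This is not the actual D3Feat dataset class; `make_pseudo_pair` is the helper sketched earlier in the thread, and Open3D is assumed only for reading the .ply files:

```python
import glob

import numpy as np
import open3d as o3d  # assumed here only for reading .ply files


class PseudoPairDataset:
    """Generates augmented training pairs on the fly (illustrative sketch)."""

    def __init__(self, root):
        self.files = sorted(glob.glob(f"{root}/*.ply"))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, i):
        # 1. Read a point cloud.
        pts = np.asarray(o3d.io.read_point_cloud(self.files[i]).points)
        # 2. Sample an augmentation and apply it to get the transformed copy,
        # 3. then compute point-to-point correspondences. The pose is created
        #    on the fly, so nothing needs to be stored in pose.npy.
        pts_b, (R, t), corr = make_pseudo_pair(pts)  # helper sketched above
        return pts, pts_b, np.hstack([R, t[:, None]]), corr
```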

Roughly how much training data does this network need for a single class (I just know PointNet does not require that much, unlike Mask R-CNN or YOLO)?

It is hard to say, but generally you will need more data if your training pairs are generated by augmentation instead of being real point cloud pairs (such "self-supervised" setups are usually data-hungry). Since you can generate unlimited data through augmentation, you could keep sampling augmentations to produce new training data until the network converges. I tried such experiments a long time ago and found that the network overfits the training data very easily, while the performance on real data is not satisfactory.

kidpaul94 commented 3 years ago

Thank you! I'll close this issue for now.

joker-lb7 commented 1 year ago

Great suggestion, but I found that these datasets contain overlap ratios. How do I compute the overlap ratio for a self-made dataset? Thanks!
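
A hedged sketch of the overlap-ratio convention commonly used for 3DMatch-style pairs (not from this repository): the fraction of points in one cloud that have a neighbor in the other cloud within a distance threshold after applying the ground-truth pose. The function name and threshold below are illustrative:

```python
import numpy as np
from scipy.spatial import cKDTree


def overlap_ratio(src, tgt, R, t, dist_thresh=0.03):
    """Fraction of src points with a tgt neighbor within dist_thresh after
    applying the ground-truth pose (R, t) that maps src into tgt's frame."""
    src_aligned = src @ R.T + t
    dist, _ = cKDTree(tgt).query(src_aligned, k=1)
    return float(np.mean(dist < dist_thresh))
```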