XuyangBai / D3Feat

[TensorFlow] Official implementation of CVPR'20 oral paper - D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features https://arxiv.org/abs/2003.03164
MIT License

Pretrained Model with voxel = 0.05 #31

Closed BiaoBiaoLi closed 2 years ago

BiaoBiaoLi commented 3 years ago

Thanks for your work! I changed the code as below, but I cannot get the same result as with voxel = 0.03. I want to know if something else should be changed. Thanks!

The released weights are pretrained on the 3DMatch dataset, where we use a voxel downsample of 0.03m.

```python
## If you want to test your own point cloud data which have different scale, you should change the scale
## of the kernel points so that the receptive field is also enlarged. For example, when testing the
## generalization on ETH, we use the following code to rescale the kernel points.
for v in my_vars:
    if 'kernel_points' in v.name:
        rescale_op = v.assign(tf.multiply(v, 0.05 / 0.03))
        print('kernel_points', rescale_op)
        self.sess.run(rescale_op)
```

XuyangBai commented 3 years ago

Hi @BiaoBiaoLi

Could you specify which dataset you are using? Basically, if you want to test the released model on another dataset, you need to 1) use the hard selection strategy by uncommenting here, and 2) change the voxel size and the scale of the kernel points according to the scale of your new dataset.
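As an illustration of step 1, the hard selection strategy (a local-max test over each point's neighborhood) can be sketched in plain NumPy; `features`, `neighbor_idx`, and `scores` are hypothetical arrays standing in for the network's tensors, not names from the repo:

```python
import numpy as np

def hard_selection(features, neighbor_idx, scores):
    """Zero the score of any point whose feature response is not a
    channel-wise local maximum among its neighbors (illustrative sketch)."""
    neighbor_features = features[neighbor_idx]             # (N, k, C)
    local_max = neighbor_features.max(axis=1)              # (N, C)
    is_local_max = features == local_max                   # (N, C)
    detected = is_local_max.any(axis=1).astype(features.dtype)  # (N,)
    return scores * detected
```

This mirrors the `tf.reduce_max` / `tf.equal` pattern in the TensorFlow code: a point survives only if it attains the maximum response in at least one feature channel within its neighborhood.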

BiaoBiaoLi commented 3 years ago

Thanks! I use the redwood dataset. According to your suggestion, I changed the code as below, but I cannot get the same result as with voxel = 0.03.

1) I use the hard selection strategy as below. It only changes the score of each point; whether I modify this or not, with voxel = 0.03 the model gets a good result.

```python
local_max = tf.reduce_max(neighbor_features, axis=1)
is_local_max = tf.equal(features, local_max)
is_local_max = tf.Print(is_local_max, [tf.reduce_sum(tf.cast(is_local_max, tf.int32))], message='num of local max')
detected = tf.reduce_max(tf.cast(is_local_max, tf.float32), axis=1, keepdims=True)
score = score * detected
```

2) I changed the voxel size and the scale of the kernel points according to the scale of the new dataset with the following code in the demo.

```python
# Initiate dataset configuration
dataset = MiniDataset(files=point_cloud_files, voxel_size=0.05)

for v in my_vars:
    if 'kernel_points' in v.name:
        rescale_op = v.assign(tf.multiply(v, 0.05 / 0.03))
        self.sess.run(rescale_op)
```
XuyangBai commented 3 years ago

Hi, according to your description D3Feat should work quite well, since redwood is also an indoor RGBD dataset similar to our training set. Could you please send some failure cases to me for debugging? BTW, I also think there is no need to change the voxel size and scale for redwood.

BiaoBiaoLi commented 3 years ago

Thanks! I am also confused about this. When voxel = 0.03, the results are good. I tested fragments 13 and 14 of the loft scene with voxel = 0.05; with voxel = 0.03, the result is good. http://redwood-data.org/indoor_lidar_rgbd/download.html I just want to test the results of the model at different resolutions.

XuyangBai commented 3 years ago

Hi, the voxel size is indeed one of the most critical hyper-parameters for our method, because it affects the receptive field of our descriptors and the sparsity of the input point cloud. For some fragments, 5cm is too big for the thin structures in the indoor scene. I have looked at fragments 13 and 14, which contain a large surface and relatively few geometrically distinguishable regions, so using 5cm might also result in too few useful keypoints and thus too few inlier correspondences for registration.

BiaoBiaoLi commented 3 years ago

Oh, thanks! I will test this model on all fragments of this dataset. In section 6.3 (Outdoor settings: ETH Dataset), did you train models with different voxel sizes for the experiments, or did you use the hard selection strategy and change the voxel size and the scale of the kernel points? And in section 6.1 (Indoor Settings: 3DMatch dataset), for FCGF, did you use the model trained with 5cm voxel size or 2.5cm voxel size? Thanks!

XuyangBai commented 3 years ago

In section 6.3, we evaluate the generalization ability of D3Feat, so we use the model trained on 3DMatch (with 3cm voxel size) and evaluate it under different voxel sizes and scales to get the numbers in Table 5. For FCGF, we use FCGF with 2.5cm voxel size.

BiaoBiaoLi commented 3 years ago

Thanks! In sections 6.1 and 6.2, I want to know how you choose the keypoints for FCGF and the other methods?

XuyangBai commented 3 years ago

Since the other methods do not have a keypoint detector, we select keypoints randomly for them.
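A minimal sketch of such random keypoint selection (the function name, the fixed seed, and the array shapes are assumptions for illustration, not code from the repo):

```python
import numpy as np

def random_keypoints(points, descriptors, num_keypoints, seed=0):
    # Uniform random sampling without replacement, used when a
    # descriptor-only method has no keypoint detector of its own.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(points), size=min(num_keypoints, len(points)),
                     replace=False)
    return points[idx], descriptors[idx]
```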


BiaoBiaoLi commented 3 years ago

Thanks! In your experiments, do you use RANSAC with 50,000 max iterations like in the demo code, or do you replace 4000000 with 50,000 (keeping the 500 the same) with distance_threshold = 0.05?

```python
result = open3d.registration_ransac_based_on_feature_matching(
    src_keypts, tgt_keypts, src_desc, tgt_desc, distance_threshold,
    open3d.TransformationEstimationPointToPoint(False), 4,
    [open3d.CorrespondenceCheckerBasedOnEdgeLength(0.9),
     open3d.CorrespondenceCheckerBasedOnDistance(distance_threshold)],
    open3d.RANSACConvergenceCriteria(4000000, 500))
```

XuyangBai commented 3 years ago

Yes, I use RANSAC with 50,000 max iterations for both 3DMatch and KITTI.

BiaoBiaoLi commented 3 years ago

Thanks! In general, how do you select the distance_threshold = 0.05?

XuyangBai commented 3 years ago

You mean why I chose 0.05 as the distance_threshold? This is a hyperparameter for RANSAC you may play with; you should consider the voxel size, the sensor noise, the scale of the scene, etc. I think there is no golden criterion for selecting this distance threshold.
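One possible heuristic (an assumption for illustration, not a rule from the paper) is to start from a small multiple of the voxel size and then tune for sensor noise and scene scale:

```python
def ransac_distance_threshold(voxel_size, multiplier=1.5):
    # Hypothetical starting point: 0.03m voxels with a 1.5x margin give
    # 0.045m, close to the 0.05m threshold used in the demo; adjust the
    # multiplier for noisier sensors or larger scenes.
    return voxel_size * multiplier
```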

BiaoBiaoLi commented 3 years ago

Thanks!!! I want to know how the ground truth is computed in line 288 of KITTI.py:

```python
M = (self.velo2cam @ positions[0].T @ np.linalg.inv(positions[1].T) @ np.linalg.inv(self.velo2cam)).T
```

Why not `np.linalg.inv(self.velo2cam @ positions[0]) @ self.velo2cam @ positions[1]`? Thanks!!!

XuyangBai commented 3 years ago

```
M = (self.velo2cam @ positions[0].T @ np.linalg.inv(positions[1].T) @ np.linalg.inv(self.velo2cam)).T
  <==> M = inv(velo2cam).T @ inv(pos[1].T).T @ pos[0] @ velo2cam.T
  <==> M = inv(pos[1] @ velo2cam.T) @ (pos[0] @ velo2cam.T)
```

If you change your computation to `np.linalg.inv(positions[0] @ self.velo2cam.T) @ positions[1] @ self.velo2cam.T`, then you will get the inverse of our result (i.e., the reversed transformation); your current computation gets nothing meaningful, I think. Note that there is a transpose operator when constructing the velo2cam matrix. I mainly borrowed the conversion from FCGF.
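The equivalence above can be checked numerically with random invertible 4x4 transforms (a verification sketch, not code from the repository):

```python
import numpy as np

rng = np.random.default_rng(42)

def random_pose():
    # Random 4x4 homogeneous transform: an orthogonal block from QR plus a
    # translation. Any invertible matrix works for this algebraic identity.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    T = np.eye(4)
    T[:3, :3] = q
    T[:3, 3] = rng.normal(size=3)
    return T

velo2cam = random_pose()
positions = [random_pose(), random_pose()]

# Expression from KITTI.py vs. the simplified form after transposing.
M = (velo2cam @ positions[0].T @ np.linalg.inv(positions[1].T)
     @ np.linalg.inv(velo2cam)).T
M_equiv = np.linalg.inv(positions[1] @ velo2cam.T) @ (positions[0] @ velo2cam.T)
assert np.allclose(M, M_equiv)
```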

BiaoBiaoLi commented 3 years ago

Thanks!!
I get it; there is a transpose operator when constructing the velo2cam matrix. Is self.velo2cam.T the coordinate-system transformation? And for M2 = M @ reg.transformation, why not reg.transformation @ M? Thanks!!