facebookresearch / silk

SiLK (Simple Learned Keypoint) is a self-supervised deep learning keypoint model.
GNU General Public License v3.0

Scannet evaluation questions #57

Closed AliYoussef97 closed 8 months ago

AliYoussef97 commented 9 months ago

Hello!

Sorry for asking too many questions lately.

I had a few questions regarding the ScanNet indoor pose and point-cloud registration evaluations.

Relative Pose Estimation (Table 5. in the paper):

Pairwise 3D point-cloud registration (Table. 6 in the paper):

Thank you!

Edit:

For the HPatches repeatability evaluation, I am not quite sure why the true_warped_points are warped with the true_homography once more here, since they were already warped with the true_homography using keep_true_points in the line before?
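For reference, warping keypoints with a homography is typically done in homogeneous coordinates. A minimal numpy sketch (the `warp_points` helper here is hypothetical, not SiLK's implementation) illustrates why applying the same true_homography a second time would move the points to the wrong location:

```python
import numpy as np

def warp_points(points, H):
    """Warp (N, 2) pixel coordinates with a 3x3 homography H."""
    pts_h = np.concatenate([points, np.ones((points.shape[0], 1))], axis=1)
    warped = pts_h @ H.T
    return warped[:, :2] / warped[:, 2:3]

# Applying H twice is equivalent to warping with H @ H, not H.
# Example: a pure translation by (5, -3).
H = np.array([[1.0, 0.0, 5.0],
              [0.0, 1.0, -3.0],
              [0.0, 0.0, 1.0]])
pts = np.array([[10.0, 20.0]])
once = warp_points(pts, H)    # -> [[15., 17.]]
twice = warp_points(once, H)  # -> [[20., 14.]], i.e. warped too far
```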

gleize commented 8 months ago

Hi @AliYoussef97.

Sorry for the late reply. Just came back from a long vacation.

I am not completely sure which Scannet test split is used, whether it is v1 test split or v2.

For this experiment, I ran the evaluation pipeline of LoFTR (which is the eval described in SuperGlue). In their paper, they mention using 1.5k test pairs. So it doesn't seem they use any of the official split.

It was not mentioned in the paper whether the images were resized during the evaluation; however, I assume they were resized?

The images are indeed resized.

Regarding this statement in the paper, "We report pose error AUC at thresholds (5°,10°,20°) using 20k keypoints and RANSAC threshold :5.", does that mean that the keypoints were not normalised using the intrinsic matrices, and RANSAC's threshold was not set relative to the mean of the intrinsic matrices?

Yes here and here.
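For readers unfamiliar with the normalisation being asked about: the SuperGlue-style evaluation usually maps pixel keypoints through K⁻¹ before essential-matrix estimation, and scales the pixel RANSAC threshold by (roughly) the focal length. A sketch of that mapping, with a hypothetical intrinsic matrix (this is the step SiLK's evaluation skips):

```python
import numpy as np

def normalize_keypoints(kpts, K):
    """Map (N, 2) pixel keypoints to normalized camera coordinates K^-1 x."""
    pts_h = np.concatenate([kpts, np.ones((kpts.shape[0], 1))], axis=1)
    norm = pts_h @ np.linalg.inv(K).T
    return norm[:, :2] / norm[:, 2:3]

# Hypothetical intrinsics: focal length 500, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
kpts = np.array([[320.0, 240.0],   # principal point -> origin
                 [820.0, 740.0]])  # 500 px offset -> 1.0 in each axis
norm_kpts = normalize_keypoints(kpts, K)
# A pixel RANSAC threshold t then corresponds to roughly t / 500
# in normalized coordinates.
```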

In Table 5., it states SiLK + MNN; does that mean the distances were not calculated using double_softmax or ratio_test, and the distance was (1 - dot(desc_0, desc_1.T)), along with cross-checking only?

Yes, exactly.
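A minimal sketch of that matcher, assuming row-wise L2-normalized descriptors (the `mnn_match` name is made up for illustration): the distance is 1 - dot(desc_0, desc_1.T), and the cross-check keeps only pairs that are each other's nearest neighbor.

```python
import numpy as np

def mnn_match(desc_0, desc_1):
    """Mutual-nearest-neighbor matching with distance 1 - dot(desc_0, desc_1.T).

    Assumes descriptors are L2-normalized row-wise. Returns (i, j) index
    pairs that are each other's nearest neighbor (cross-check).
    """
    dist = 1.0 - desc_0 @ desc_1.T
    nn_01 = dist.argmin(axis=1)  # best match in image 1 for each desc in 0
    nn_10 = dist.argmin(axis=0)  # best match in image 0 for each desc in 1
    idx_0 = np.arange(desc_0.shape[0])
    mutual = nn_10[nn_01] == idx_0
    return np.stack([idx_0[mutual], nn_01[mutual]], axis=1)

# Toy descriptors: row 0 of d0 matches row 1 of d1, and vice versa.
d0 = np.array([[1.0, 0.0], [0.0, 1.0]])
d1 = np.array([[0.0, 1.0], [1.0, 0.0]])
matches = mnn_match(d0, d1)  # -> [[0, 1], [1, 0]]
```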

URR uses the ScanNet v1 test split; I assume this is what SiLK evaluates on as well?

Yes.

The paper states that you use ratio_test MNN, and the image size during the evaluation is set to 146 based on the provided evaluation markdown in the repository; however, I am not sure why the dense descriptors were descaled here?

From memory, this was to normalize the descriptors to have a length of 1. It would have been better to set descriptor_scale_factor to 1 and not divide later, though. descriptor_scale_factor only matters when using the double-softmax formulation (since it acts as a temperature); since in this case we use the ratio test, the scale factor ends up being cancelled out when computing the ratio.
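The cancellation can be checked numerically. A small sketch, assuming the ratio test is computed on Euclidean distances between descriptors (the same cancellation holds for any distance that scales linearly with the descriptors): multiplying all descriptors by a constant `s` scales both the first- and second-nearest distances by `s`, leaving the ratio unchanged.

```python
import numpy as np

def ratio(query, db):
    """Lowe-style ratio: nearest / second-nearest Euclidean distance."""
    d = np.sort(np.linalg.norm(db - query, axis=1))
    return d[0] / d[1]

rng = np.random.default_rng(0)
query = rng.normal(size=128)          # one query descriptor
db = rng.normal(size=(10, 128))       # candidate descriptors

s = 7.3  # an arbitrary descriptor_scale_factor
# Scaling every descriptor by s does not change the ratio.
assert np.isclose(ratio(query, db), ratio(s * query, s * db))
```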

AliYoussef97 commented 8 months ago

Hello @gleize,

Thank you so much! Hope you had a great vacation!

Edit:

In their paper, they mention using 1.5k test pairs. So it doesn't seem they use any of the official split.

I have noticed that LoFTR and SuperGlue use ScanNet's v2 test split sampled every 15 frames; is that what you meant by them not using the official split?