nianticlabs / mickey

[CVPR 2024 - Oral] Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences
https://nianticlabs.github.io/mickey/

visualization #1

Closed wuqun-tju closed 2 months ago

wuqun-tju commented 6 months ago

Hello, could you provide the visualization code used to generate Fig. 6 in the paper?

wuqun-tju commented 6 months ago

Besides, I found that the results differ when running inference_demo.py multiple times with the same inputs. Is there some config to set, like a seed?

axelBarroso commented 6 months ago

Hello,

Thank you for your interest in our work!

> Could you provide the visualization code used to generate Fig. 6 in the paper?

Yes, we plan to add visualization code to the repo. That might take a bit, though; I will ping this issue when it is ready. In the meantime, I suggest looking at other visualization code already out there that could serve as a guide: ACE, or DUSt3R.

> I found that the results differ when running inference_demo.py multiple times with the same inputs

The output of the network is deterministic, but the pose solver is not. We had to rely on a probabilistic solver so that we could train our pipeline end-to-end. If you want more deterministic results, I would suggest replacing the probabilistic Procrustes with some other solver. In the Map-free repo you can find a few.
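Regarding the seed question above: neither the thread nor the demo documents a seed option, so the following is only a sketch of the usual workaround (`seed_everything` is an illustrative helper, not part of the repo). Seeding every RNG before running the demo makes the CPU-side sampling repeatable:

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 0) -> None:
    """Seed Python, NumPy and PyTorch RNGs so CPU-side sampling is repeatable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)           # seeds the CPU (and, in recent versions, CUDA) RNG
    torch.cuda.manual_seed_all(seed)  # explicit seed for all CUDA devices


seed_everything(42)
```

Note that even with fixed seeds, some CUDA kernels remain non-deterministic; `torch.use_deterministic_algorithms(True)` can enforce determinism at a performance cost.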

wuqun-tju commented 6 months ago

Thank you for your reply! The difference among multiple runs of the probabilistic solver is significant. If the output of the network is the same, I would expect the solver results to differ only slightly, but over 3 runs the deviation was more than 1 degree.

wuqun-tju commented 6 months ago

Besides, in the Map-free relocalization paper, it is mentioned that the Orthogonal Procrustes solver performs worse, right? Looking forward to your reply!

axelBarroso commented 6 months ago

Yes, you are right, the difference between runs for a single image pair can be more than 1 degree. We saw that, when running the network over a large test set (the Map-free test images), the results averaged out and were quite constant.
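For anyone wanting to quantify the per-pair deviation mentioned above, the standard geodesic rotation distance can be used; a small sketch (the function name is illustrative, not from the repo):

```python
import numpy as np


def rotation_error_deg(R1: np.ndarray, R2: np.ndarray) -> float:
    """Angle in degrees of the relative rotation R1^T @ R2 (geodesic distance on SO(3))."""
    R = R1.T @ R2
    # trace(R) = 1 + 2*cos(theta); clip guards against numerical drift outside [-1, 1]
    cos_angle = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))
```

Comparing the rotations estimated in two different runs with this function gives the deviation in degrees directly.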

However, if you are interested in having more deterministic results (which I agree is ideal for some applications), I would suggest changing the solver.

In Table 6 of the supplementary material, we show results for different solvers and see that they are comparable. For instance, the PnP solver (proposed in Map-free) reached a better pose AUC and a slightly worse VCRE AUC.

The main source of randomness in our solver comes from the fact that we sample from the whole matching matrix. We experimented with a more stable approach by following standard practices:

```python
def get_top_kpts(self, kpt, depth, scr, dsc, desired_kpts=-1):
    B, _, num_kpts = kpt.shape

    # Default: keep all keypoints
    if desired_kpts == -1:
        desired_kpts = num_kpts

    # Use scores to select the top-K keypoints per batch element
    scores_idx = torch.argsort(scr, dim=2, descending=True)[:, :, :desired_kpts].reshape(B * desired_kpts)
    batch_idx = torch.tile(torch.arange(B).view(B, 1), [1, desired_kpts]).reshape(B * desired_kpts)

    # Gather keypoints, depths, scores and descriptors at the selected indices
    kpt = kpt[batch_idx, :, scores_idx].view(B, desired_kpts, 2).transpose(2, 1)
    depth = depth[batch_idx, :, scores_idx].view(B, desired_kpts, 1).transpose(2, 1)
    scr = scr[batch_idx, :, scores_idx].view(B, desired_kpts, 1).transpose(2, 1)
    dsc = dsc[batch_idx, :, scores_idx].view(B, desired_kpts, self.dsc_dim).transpose(2, 1)

    return kpt, depth, scr, dsc
```

This limits the candidates that the network will sample to generate the poses much more, and produces more stable results.

As a note, this gave results very similar to the PnP solver reported in the supplementary material. See the AUC scores: 0.73 (VCRE) - 0.34 (Pose).

Hope this helps!
wuqun-tju commented 6 months ago

Thank you for your reply! Sorry, I have two follow-up questions:

  1. "We saw that when running the network over a large test set (Map-free test images), the results were averaged out and were quite constant" — (1) do you mean that the matches, keypoints, and scores of the network are quite constant every time for the same inputs? (2) Or do you mean that the solver results are constant because R and t were averaged over all test sets, so the average is constant, not that one image pair's R and t is constant? Which one is right?
  2. Thank you for the code posted. How do you use it: was this strategy applied at both train and test time, only at test time, or only at train time?
axelBarroso commented 6 months ago

Hello!

  1. Sorry for the confusion, I meant (2): the solver results are constant because R and t were averaged over all test image pairs, and hence the averaged results are constant, not that a single image pair's R and t is constant.

  2. This strategy is only applied during testing. That code is only meant to select the top-scoring keypoints before doing the matching. That, together with the Mutual Nearest-Neighbour check, should help to get more stable results (although, as stated above, results are similar to the probabilistic solver, i.e., slightly lower VCRE but higher Pose).
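For reference, a Mutual Nearest-Neighbour check can be sketched as below (a NumPy illustration of the general idea, not MicKey's actual implementation):

```python
import numpy as np


def mutual_nn_matches(desc0: np.ndarray, desc1: np.ndarray) -> np.ndarray:
    """Return (M, 2) index pairs (i, j) that are mutual nearest neighbours."""
    # Pairwise Euclidean distances between the two descriptor sets, shape (N0, N1)
    dists = np.linalg.norm(desc0[:, None, :] - desc1[None, :, :], axis=-1)
    nn01 = dists.argmin(axis=1)  # best match in image 1 for each keypoint in image 0
    nn10 = dists.argmin(axis=0)  # best match in image 0 for each keypoint in image 1
    idx0 = np.arange(desc0.shape[0])
    mutual = nn10[nn01] == idx0  # keep only pairs that agree in both directions
    return np.stack([idx0[mutual], nn01[mutual]], axis=1)
```

Keeping only mutual matches discards ambiguous candidates before the solver sees them, which is what makes the downstream pose estimate more stable.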
wuqun-tju commented 6 months ago

OK, thank you for your quick reply. About the top-50% strategy: do you mean it is applied to other solvers like PnP, rather than to the probabilistic solver?

axelBarroso commented 6 months ago

We applied the 50% strategy to both the PnP solver and the probabilistic solver. See the AUC results below:

In the main results of the paper, we did NOT use the 50% strategy but all the keypoints. Using fewer keypoints can help produce more stable results, but we did not see any indication that this was better than using our probabilistic solver on the Map-free benchmark.
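Concretely, the 50% strategy corresponds to calling the top-K selection with half the keypoints; a standalone sketch of the score-based selection step (names and shapes are illustrative, following the `(B, 1, N)` score layout of the snippet above):

```python
import torch


def top_fraction_indices(scores: torch.Tensor, fraction: float = 0.5) -> torch.Tensor:
    """Indices of the top-scoring keypoints, keeping `fraction` of them.

    scores: (B, 1, N) keypoint scores.
    Returns: (B, K) indices with K = int(N * fraction).
    """
    num_kpts = scores.shape[-1]
    k = max(1, int(num_kpts * fraction))
    # Sort scores in descending order and keep the first K indices per batch element
    return torch.argsort(scores, dim=2, descending=True)[:, 0, :k]
```

With `fraction=0.5` this reproduces the "top 50%" selection discussed above; `fraction=1.0` keeps all keypoints, as in the paper's main results.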

ttsesm commented 5 months ago

@axelBarroso any update on the visualization code? I am interested in the code snippet that visualizes the correspondences from the image to the 3D points.

[image: s00468_MicKey]

axelBarroso commented 5 months ago

Hello! We are trying to make the visualisation code easy to use - it might take a bit longer since we are also preparing all the CVPR materials first. Sorry for the delay, and thank you for the interest!

JayKarhade commented 2 months ago

Hi @axelBarroso were there any updates on the code for visualization of the 3d point correspondences? It'd be really awesome to check this!

axelBarroso commented 2 months ago

Hello,

We just pushed the visualization code (finally!). If still interested, give it a try. You can use it by setting the flag generate_3D_vis to True in the demo_inference.py script.

The code will generate an image like the one attached. Let me know if you find any issues.

Hope this is helpful!

[image: 3d_vis]