verlab / accelerated_features

Implementation of XFeat (CVPR 2024). Do you need robust and fast local feature extraction? You are in the right place!
https://www.verlab.dcc.ufmg.br/descriptors/xfeat_cvpr24
Apache License 2.0

XFeat versus SIFT - examples of when to use which? #63

Open AruniRC opened 1 month ago

AruniRC commented 1 month ago

Firstly, thank you for your work, @guipotje et al.

There are issues like https://github.com/verlab/accelerated_features/issues/59 that show SIFT performing better than vanilla XFeat for large geometric transformations, or when there is a large scale difference in objects being matched between images (https://github.com/verlab/accelerated_features/issues/58).

1) Given this, in which cases should one prefer XFeat over SIFT in practice? Can you share more examples and some prescriptive use cases where XFeat surpasses SIFT? For example, I am guessing XFeat may be better than SIFT under motion blur in videos or inconsistent lighting. Could you please share such an analysis, if you have done one? I mention SIFT in particular since it is a de facto standard for non-learned features.

2) Steerable-XFeat (https://github.com/verlab/accelerated_features/issues/32) - do you have any plans to integrate the Steerers work into the XFeat codebase with pretrained networks?

Thank you.

guipotje commented 2 weeks ago

Hello @AruniRC, sorry for the delayed reply, and thanks for your interest in our work!

  1. I don't have quantitative experiments beyond those reported in the paper, but from my experience with XFeat, I can offer some directions for further exploration. For large geometric transformations, you have the option to perform multi-scale feature extraction, similar to what is done in SIFT, ORB and other classic approaches. This should increase XFeat's invariance to strong scale changes. Compared to handcrafted features, XFeat has the following advantages:
     (a) Improved robustness to photometric changes (illumination, shadows, etc.) and to blur, as shown in the demo;
     (b) The ability to be easily fine-tuned for specific problems;
     (c) Neural processing hardware is now ubiquitous and is becoming cheaper and more accessible every day, and it provides highly optimized inference for CNNs and other common network architectures. Thus, you can deploy XFeat on hardware-constrained devices, probably obtaining much better computational performance than general-purpose implementations of SIFT, ORB, & friends on cheap hardware such as mobile processors (ARM, etc.).

  2. Yes, it would be very nice to provide an implementation of Steerers in this repo. Maybe someone could help by providing a tight integration between them.
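
For readers who want to try the multi-scale extraction mentioned in point 1, here is a minimal sketch of the general idea. It is not code from this repo. It assumes a hypothetical `extract(image)` callable that returns `(keypoints, descriptors, scores)` as NumPy arrays, with keypoints in `(x, y)` pixel coordinates; wrapping XFeat to match that interface is left to the reader. The helper names `resize_nearest` and `extract_multiscale`, the default scales, and the nearest-neighbour resize are all illustrative choices (a real pipeline would likely use a better interpolation).

```python
import numpy as np


def resize_nearest(image, scale):
    """Nearest-neighbour resize of an HxW or HxWxC array (illustrative only)."""
    h, w = image.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    # For each output pixel, pick the closest source row/column.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols]


def extract_multiscale(image, extract, scales=(0.5, 1.0, 2.0), top_k=4096):
    """Run `extract` on several rescaled copies and merge the results.

    `extract` is a hypothetical callable: image -> (kpts Nx2, desc NxD, scores N).
    Keypoints are mapped back to the coordinates of the original image.
    """
    h, w = image.shape[:2]
    all_kpts, all_desc, all_scores = [], [], []
    for s in scales:
        scaled = resize_nearest(image, s)
        kpts, desc, scores = extract(scaled)
        if len(kpts) == 0:
            continue
        # Use the actual size ratio, which can differ slightly from `s`
        # because the resized dimensions are rounded to integers.
        ratio = np.array([scaled.shape[1] / w, scaled.shape[0] / h],
                         dtype=np.float32)
        all_kpts.append(np.asarray(kpts, dtype=np.float32) / ratio)
        all_desc.append(np.asarray(desc))
        all_scores.append(np.asarray(scores, dtype=np.float32))

    if not all_kpts:
        return np.zeros((0, 2), np.float32), np.zeros((0, 0)), np.zeros((0,))

    kpts = np.concatenate(all_kpts)
    desc = np.concatenate(all_desc)
    scores = np.concatenate(all_scores)
    # Keep the top_k highest-scoring features across all scales.
    order = np.argsort(-scores)[:top_k]
    return kpts[order], desc[order], scores[order]
```

Note that scores coming from different scales may not be directly comparable, so the global top-k selection is a simplification; keeping a fixed budget per scale is another reasonable option.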