LXXIANG12 opened this issue 2 months ago (status: Open)
I am new to this field, and I would also like to ask: can I train with only the homography transformation dataset, without the MegaDepth data?
Hello @LXXIANG12, thank you for your interest in XFeat.
XFeat keypoints are distilled from ALIKE, but the keypoint network is exceptionally small.
If I understand correctly, you wish to extract descriptors from a desired input position. This is entirely possible, as XFeat's coarse feature map is dense, allowing interpolation of descriptors at any desired location. This can be achieved using the provided sparse interpolator.
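To make the interpolation idea concrete, here is a minimal NumPy sketch of bilinear descriptor sampling from a dense feature map. It is not the interpolator shipped with XFeat: the function name `sample_descriptor`, the `(C, H, W)` layout, and the convention that `(x, y)` is already expressed in feature-map pixels (image coordinates would first need to be scaled down to the coarse map resolution) are all assumptions made for illustration.

```python
import numpy as np

def sample_descriptor(feat, x, y):
    """Bilinearly interpolate one descriptor from a dense feature map.

    feat : array of shape (C, H, W) -- assumed layout, hypothetical helper
    x, y : sub-pixel location in feature-map coordinates (not image pixels)
    Returns an L2-normalized descriptor of shape (C,).
    """
    C, H, W = feat.shape
    # Clamp the query so the four neighbours stay inside the map.
    x = float(np.clip(x, 0, W - 1))
    y = float(np.clip(y, 0, H - 1))
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    # Blend along x on the two rows, then blend the rows along y.
    top = (1 - wx) * feat[:, y0, x0] + wx * feat[:, y0, x1]
    bottom = (1 - wx) * feat[:, y1, x0] + wx * feat[:, y1, x1]
    desc = (1 - wy) * top + wy * bottom
    # Normalize so that dot products behave like cosine similarity.
    return desc / (np.linalg.norm(desc) + 1e-8)
```

For a detection box, one would pass the box center (scaled to the feature-map resolution) as `(x, y)` and keep the returned vector as the query descriptor for the next frame.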
In the next frame, you can focus on the vicinity of the last coordinate from the previous frame. This can be done efficiently by cropping the feature map, for example, into a 5x5xdim patch centered at the coordinate, followed by a fast dot product to extract a heatmap.
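The local search described above could look roughly like the following NumPy sketch. The function name `local_match`, the `(C, H, W)` layout, and the integer feature-map coordinates are assumptions for illustration, not part of the XFeat code; `radius=2` gives the 5x5 window mentioned above.

```python
import numpy as np

def local_match(query_desc, next_feat, cx, cy, radius=2):
    """Find the best match for a descriptor in a small window of the next frame.

    query_desc : (C,) descriptor taken from the previous frame
    next_feat  : (C, H, W) dense feature map of the next frame (assumed layout)
    cx, cy     : last known integer location, in feature-map coordinates
    radius     : half window size; radius=2 yields a 5x5 search window
    Returns (x, y, heatmap) with x, y in feature-map coordinates.
    """
    C, H, W = next_feat.shape
    # Crop the window, clipping it at the borders of the feature map.
    x0, x1 = max(cx - radius, 0), min(cx + radius + 1, W)
    y0, y1 = max(cy - radius, 0), min(cy + radius + 1, H)
    patch = next_feat[:, y0:y1, x0:x1]  # (C, h, w)
    # One dot product per window cell gives the similarity heatmap.
    heatmap = np.einsum("c,chw->hw", query_desc, patch)
    iy, ix = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(x0 + ix), int(y0 + iy), heatmap
```

The heatmap peak is only accurate to one feature-map cell here; a sub-cell estimate (for example, a weighted average around the peak) would be a separate refinement step.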
For the second question: yes, we provide an example of training with fully synthetic data.
Best regards, Guilherme.
I have a task: I want to use the center point of each object-detection bounding box as the feature point input, and then match it in the next frame to achieve tracking. How are XFeat's feature points learned? Methods such as COTR do this, but they are inefficient.