verlab / accelerated_features

Implementation of XFeat (CVPR 2024). Do you need robust and fast local feature extraction? You are in the right place!
https://www.verlab.dcc.ufmg.br/descriptors/xfeat_cvpr24
Apache License 2.0

The code performs poorly #49

Open lyfadvance opened 3 months ago

lyfadvance commented 3 months ago

I found that without LightGlue, the results are very poor. [screenshot: 2024-08-20_21-30-47]

Besides, when I displayed the heatmap, the result was also very poor. [screenshot: 2024-08-20_21-32-55] The imshow code is:

    @torch.inference_mode()
    def detectAndCompute(self, x, top_k = None, detection_threshold = None):
        """
            Compute sparse keypoints & descriptors. Supports batched mode.

            input:
                x -> torch.Tensor(B, C, H, W): grayscale or rgb image
                top_k -> int: keep best k features
            return:
                List[Dict]: 
                    'keypoints'    ->   torch.Tensor(N, 2): keypoints (x,y)
                    'scores'       ->   torch.Tensor(N,): keypoint scores
                    'descriptors'  ->   torch.Tensor(N, 64): local features
        """
        if top_k is None: top_k = self.top_k
        if detection_threshold is None: detection_threshold = self.detection_threshold
        x, rh1, rw1 = self.preprocess_tensor(x)

        B, _, _H1, _W1 = x.shape

        M1, K1, H1 = self.net(x)
        mod = torch.jit.trace(self.net, x)
        torch.jit.save(mod,"feature.pt")
        M1 = F.normalize(M1, dim=1)

        #Convert logits to heatmap and extract kpts
        K1h = self.get_kpts_heatmap(K1)

        array1 = K1h.numpy()  # convert the tensor to a NumPy array
        maxValue = array1.max()
        array1 = array1 * 255  # normalize: scale the image data to [0, 255]
        mat = np.uint8(array1)  # float32 --> uint8
        print('mat_shape:', mat.shape)  # mat_shape: (3, 982, 814)
        mat = mat[0, :, :, :].transpose(1, 2, 0)  # mat_shape: (982, 814, 3)
        cv2.imshow("img", mat)

        array2 = H1.numpy()  # convert the tensor to a NumPy array
        maxValue = array2.max()
        print(maxValue)
        # array2 = array2 * 255 / maxValue  # normalize: scale the image data to [0, 255]
        array2 = array2 * 255  # normalize: scale the image data to [0, 255]
        mat = np.uint8(array2)  # float32 --> uint8
        print('mat_shape:', mat.shape)  # mat_shape: (3, 982, 814)
        mat = mat[0, :, :, :].transpose(1, 2, 0)  # mat_shape: (982, 814, 3)
        cv2.imshow("img2", mat)
        cv2.waitKey()

        mkpts = self.NMS(K1h, threshold=detection_threshold, kernel_size=5)

        #Compute reliability scores
        _nearest = InterpolateSparse2d('nearest')
        _bilinear = InterpolateSparse2d('bilinear')
        scores = (_nearest(K1h, mkpts, _H1, _W1) * _bilinear(H1, mkpts, _H1, _W1)).squeeze(-1)
        scores[torch.all(mkpts == 0, dim=-1)] = -1

        #Select top-k features
        idxs = torch.argsort(-scores)
        mkpts_x  = torch.gather(mkpts[...,0], -1, idxs)[:, :top_k]
        mkpts_y  = torch.gather(mkpts[...,1], -1, idxs)[:, :top_k]
        mkpts = torch.cat([mkpts_x[...,None], mkpts_y[...,None]], dim=-1)
        scores = torch.gather(scores, -1, idxs)[:, :top_k]

        #Interpolate descriptors at kpts positions
        feats = self.interpolator(M1, mkpts, H = _H1, W = _W1)

        #L2-Normalize
        feats = F.normalize(feats, dim=-1)

        #Correct kpt scale
        mkpts = mkpts * torch.tensor([rw1,rh1], device=mkpts.device).view(1, 1, -1)

        valid = scores > 0
        return [  
                   {'keypoints': mkpts[b][valid[b]],
                    'scores': scores[b][valid[b]],
                    'descriptors': feats[b][valid[b]]} for b in range(B) 
               ]

Can you take a look at what's going on? Thank you.
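
A note on the visualization code above: it multiplies the heatmap by 255 and casts straight to uint8. If the heatmap values are small (an assumption, not something the thread confirms), nearly every pixel truncates to a value close to 0 and the window looks almost black even when the network output is fine. A minimal sketch of min-max stretching before display, using only NumPy and synthetic data; `heatmap_to_uint8` is a hypothetical helper, not part of the XFeat code:

```python
import numpy as np


def heatmap_to_uint8(heatmap, eps=1e-8):
    """Min-max stretch a float heatmap to uint8 so that faint peaks become visible.

    Hypothetical helper for illustration; not part of the XFeat repository.
    """
    hm = np.asarray(heatmap, dtype=np.float32)
    hm = np.squeeze(hm)  # drop singleton dims, e.g. (1, 1, H, W) -> (H, W)
    lo, hi = float(hm.min()), float(hm.max())
    scaled = (hm - lo) / max(hi - lo, eps)  # now in [0, 1]
    return np.round(scaled * 255.0).astype(np.uint8)


# Synthetic stand-in for a keypoint heatmap with one weak peak.
demo = np.zeros((1, 1, 4, 4), dtype=np.float32)
demo[0, 0, 1, 2] = 0.02

naive = np.uint8(demo * 255)        # what the snippet above does
stretched = heatmap_to_uint8(demo)  # min-max stretched version

print("naive max:", naive.max(), "stretched max:", stretched.max())
```

The result of `heatmap_to_uint8` is a 2-D uint8 array when the batch size is 1, so it can be handed to `cv2.imshow` without the extra transpose. This only changes how the map is displayed; it does not affect the keypoints that are extracted.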

guipotje commented 3 months ago

Hi @lyfadvance, this result is strange, since we have an example without LightGlue that works fine. Please check this notebook.
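
For readers wondering what matching without LightGlue involves: a common baseline is mutual nearest neighbor matching on descriptor cosine similarity. The sketch below is a NumPy restatement of that idea on synthetic 64-dimensional descriptors. It is not the repository's code, and the function name and `min_cossim` parameter are assumptions made for illustration.

```python
import numpy as np


def mutual_nearest_neighbors(desc0, desc1, min_cossim=-1.0):
    """Return index arrays (i, j) where desc0[i] and desc1[j] pick each other.

    Hypothetical helper; assumes every descriptor has non-zero norm.
    """
    d0 = desc0 / np.linalg.norm(desc0, axis=1, keepdims=True)
    d1 = desc1 / np.linalg.norm(desc1, axis=1, keepdims=True)
    sim = d0 @ d1.T              # cosine similarity, shape (N0, N1)
    nn01 = sim.argmax(axis=1)    # best partner in desc1 for each desc0 row
    nn10 = sim.argmax(axis=0)    # best partner in desc0 for each desc1 row
    idx0 = np.arange(len(d0))
    mutual = nn10[nn01] == idx0  # keep pairs that agree in both directions
    strong = sim[idx0, nn01] > min_cossim
    keep = mutual & strong
    return idx0[keep], nn01[keep]


# Synthetic check: desc_b is a shuffled, slightly noisy copy of desc_a.
rng = np.random.default_rng(0)
desc_a = rng.normal(size=(5, 64))
perm = np.array([3, 0, 4, 1, 2])
desc_b = desc_a[perm] + 0.01 * rng.normal(size=(5, 64))

i, j = mutual_nearest_neighbors(desc_a, desc_b)
print(list(zip(i.tolist(), j.tolist())))
```

With real images, the returned index pairs would select rows of the `'keypoints'` tensors from the two `detectAndCompute` outputs. Raising `min_cossim` trades fewer matches for fewer outliers.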