zju3dv / LoFTR

Code for "LoFTR: Detector-Free Local Feature Matching with Transformers", CVPR 2021, T-PAMI 2022
https://zju3dv.github.io/loftr/
Apache License 2.0

Use of masks in feature detection #256

Open sandeepnmenon opened 1 year ago

sandeepnmenon commented 1 year ago

Is the purpose of masks to provide padding at the border of the image? https://github.com/zju3dv/LoFTR/blob/b4ee7eb0359d0062e794c99f73e27639d7c7ac9f/src/loftr/loftr.py#L35

I have a scenario where I want to detect feature matches everywhere in an image except in certain areas that I can specify with a mask. For example, there are moving objects in the scene, and I want LoFTR to ignore them to avoid spurious matches. Is there a way to provide a mask on the images telling the model not to find feature matches in those areas?

PS: I know that I can simply discard the feature matches that fall inside the mask after matching. Here I wanted to understand the purpose of these masks in the LoFTR model.
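For reference, the post-processing mentioned above is straightforward. Here is a minimal sketch (not part of the LoFTR codebase) that filters matched keypoints, such as the `mkpts0_f`/`mkpts1_f` arrays LoFTR returns, against boolean ignore masks; `filter_matches_by_mask` is a hypothetical helper name:

```python
import numpy as np

def filter_matches_by_mask(mkpts0, mkpts1, mask0, mask1=None):
    """Keep only matches whose keypoints land in the valid (True) region.

    mkpts0, mkpts1: (N, 2) arrays of (x, y) pixel coordinates.
    mask0, mask1:   boolean arrays of shape (H, W); True = keep, False = ignore.
    """
    xs0 = mkpts0[:, 0].astype(int)
    ys0 = mkpts0[:, 1].astype(int)
    keep = mask0[ys0, xs0]
    if mask1 is not None:
        xs1 = mkpts1[:, 0].astype(int)
        ys1 = mkpts1[:, 1].astype(int)
        keep &= mask1[ys1, xs1]  # both endpoints must be in valid regions
    return mkpts0[keep], mkpts1[keep]
```

This runs after inference, so the model still spends computation on the masked regions, but it reliably removes matches on moving objects.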

nesi73 commented 1 year ago

I have the same question. Did you find a solution, or did you just post-process the matches?

JS901 commented 11 months ago

Hi @zehongs, do you have any thoughts on filtering keypoints before matching? It looks like the current `mask0`/`mask1` inputs to `loftr.forward()` are only for padding the edge of the image to remove a black border.

zehongs commented 11 months ago

I think the mask should not be applied to the local CNN parts, because that would break the CNN's receptive fields. The transformer, in principle, could handle an arbitrary padding mask. But we use a linear-attention variant, so I'm not sure how well that works; you would have to check for yourself.
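To illustrate how a padding mask can interact with linear attention, here is a simplified, single-head NumPy sketch in the spirit of LoFTR's linear attention (elu(x)+1 feature map); the masking strategy shown, zeroing masked keys and values so they contribute nothing to the aggregated key-value statistics, is one plausible approach, not necessarily what the released code does for arbitrary masks:

```python
import numpy as np

def elu_feature_map(x):
    # elu(x) + 1: a strictly positive feature map, as used in linear attention
    return np.where(x > 0, x + 1.0, np.exp(x))

def masked_linear_attention(q, k, v, kv_mask=None, eps=1e-6):
    """Single-head linear attention with an optional key/value padding mask.

    q: (L, D) queries; k: (S, D) keys; v: (S, D) values.
    kv_mask: (S,) boolean, True = valid position.
    """
    q = elu_feature_map(q)
    k = elu_feature_map(k)
    if kv_mask is not None:
        k = k * kv_mask[:, None]  # masked keys/values drop out of the sums
        v = v * kv_mask[:, None]
    kv = k.T @ v                          # (D, D) aggregated key-value outer products
    z = 1.0 / (q @ k.sum(axis=0) + eps)   # (L,) normalization per query
    return (q @ kv) * z[:, None]
```

Because linear attention only ever sees the keys through the sums `k.T @ v` and `k.sum(axis=0)`, zeroing the masked rows is equivalent to removing them entirely, which is why padding masks are cheap to support here.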