Parskatt / DKM

[CVPR 2023] DKM: Dense Kernelized Feature Matching for Geometry Estimation
https://parskatt.github.io/DKM/

about image resolution #64

Open KN-Zhang opened 2 months ago

KN-Zhang commented 2 months ago

Hi! In Table 8 of the original paper, do you keep the test image resolution the same as the training image resolution? For example, when training on 384x512 image pairs, do you also resize all the test image pairs to 384x512 for testing? I am actually following RoMa, but I found that this dense pipeline is a bit sensitive to the resolution setting, so I want to find a way to make the method generalize well to different resolutions. :)

Parskatt commented 2 months ago

That's great, generalizing across resolutions is definitely something I would want.

As to your question, we set the resolution for the global/coarse matching to be a bit over the train resolution (perhaps 660x880 or so, compared with 540x720). It is indeed sensitive to this.

We then run the refinement (upsample re) at a much higher resolution (typically around 1000px).
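To make the two-resolution setup concrete, here is a minimal sketch of how the coarse and refinement resolutions could be derived from the train resolution. The helper names `scale_resolution` and `fit_long_side` are hypothetical and are not part of the DKM or RoMa code; the numbers are the approximate ones quoted above, and the `multiple` argument is an assumption for backbones that need sides divisible by some stride.

```python
def scale_resolution(hw, factor, multiple=1):
    """Scale an (H, W) pair by `factor`, snapping each side to a multiple of `multiple`."""
    def snap(x):
        # Round half up to the nearest multiple, never going below one multiple.
        return max(multiple, int(x / multiple + 0.5) * multiple)

    h, w = hw
    return snap(h * factor), snap(w * factor)


def fit_long_side(hw, long_side, multiple=1):
    """Resize (H, W) so the longer side is about `long_side`, keeping the aspect ratio."""
    h, w = hw
    return scale_resolution(hw, long_side / max(h, w), multiple)


train_hw = (540, 720)                               # train resolution quoted in the thread
coarse_hw = scale_resolution(train_hw, 660 / 540)   # coarse matching: a bit over train res
upsample_hw = fit_long_side(train_hw, 1000)         # refinement: roughly 1000px long side
print("coarse:", coarse_hw, "refine:", upsample_hw)
```

With these inputs the coarse resolution comes out as 660x880 and the refinement resolution as 750x1000. How these values are actually passed to a model depends on the code base, so check the DKM or RoMa source before relying on this.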

In general I've found that the refiners are very robust to different resolutions (there are some caveats, look in the RoMa code for something like scale_factor).

If I were to fix the resolution problem I would probably look at replacing the GP, as it scales poorly with high resolution. Perhaps flash attention instead? Let me know how it goes!
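A rough back-of-the-envelope sketch of why the GP scales poorly is given below. It assumes an exact GP with a dense N x N kernel matrix over N coarse feature locations, and a coarse feature stride of 32; both are assumptions made for illustration, not values confirmed in this thread. The function name `gp_cost` is hypothetical.

```python
def gp_cost(h, w, stride=32):
    """Estimate exact-GP cost for an h x w image with coarse features at `stride`.

    Assumes one GP input per coarse feature location and a dense kernel matrix.
    """
    n = (h // stride) * (w // stride)   # number of coarse feature locations
    return {
        "tokens": n,
        "kernel_entries": n * n,        # memory for the dense kernel matrix, O(N^2)
        "solve_cost": n ** 3,           # dense factorization or inverse, O(N^3)
    }


base = gp_cost(384, 512)
big = gp_cost(768, 1024)                # both sides doubled

for key in base:
    print(key, base[key], "->", big[key], f"({big[key] // base[key]}x)")
```

Doubling each image side gives 4x the tokens, 16x the kernel-matrix entries, and 64x the dense solve cost under these assumptions. Flash attention computes exact attention without materializing the N x N score matrix, so its memory grows linearly in N while compute stays quadratic, which is one reason it may be a reasonable candidate replacement.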