Parskatt / DKM

[CVPR 2023] DKM: Dense Kernelized Feature Matching for Geometry Estimation
https://parskatt.github.io/DKM/

question about loading the training images #48

Closed Master-cai closed 9 months ago

Master-cai commented 9 months ago

Thanks for your excellent work.

I noticed a confusing detail. When you load images from the MegaDepth dataset, you resize each image to a fixed ht and wt, which changes the original aspect ratio (some images were taken in portrait orientation). I'm wondering why you don't maintain the aspect ratio and pad the image to the specified size, which is common practice in other computer vision tasks. Is it because the padded areas significantly interfere with the estimation of the warp? If so, would masking out the warp generated by the padded areas be a good solution?
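For concreteness, the pad-and-mask alternative described above could look roughly like the sketch below. It is not DKM's dataloader: the function name `resize_keep_aspect_and_pad`, the nearest-neighbour resampling, and the bottom/right zero padding are all assumptions made for illustration.

```python
import numpy as np

def resize_keep_aspect_and_pad(image, ht, wt):
    """Hypothetical helper: fit `image` (H, W[, C]) inside (ht, wt) without
    changing its aspect ratio, zero-pad the remainder, and return a mask
    marking which output pixels contain real image content."""
    h, w = image.shape[:2]
    # One isotropic scale factor, chosen so that both sides fit.
    scale = min(ht / h, wt / w)
    new_h = max(1, min(ht, int(round(h * scale))))
    new_w = max(1, min(wt, int(round(w * scale))))

    # Nearest-neighbour resampling via index lookup (assumption: a real
    # pipeline would more likely use bilinear or bicubic interpolation).
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = image[rows[:, None], cols[None, :]]

    # Zero-pad at the bottom and right up to the fixed target size.
    padded = np.zeros((ht, wt) + image.shape[2:], dtype=image.dtype)
    padded[:new_h, :new_w] = resized

    # True where the pixel comes from the image, False in the padding.
    mask = np.zeros((ht, wt), dtype=bool)
    mask[:new_h, :new_w] = True
    return padded, mask

# A portrait image (800 high, 600 wide) placed into a 384 x 512 canvas.
portrait = np.ones((800, 600, 3), dtype=np.uint8)
padded, mask = resize_keep_aspect_and_pad(portrait, ht=384, wt=512)
```

The returned mask is what one would use to exclude padded pixels from the warp loss, which is the masking idea raised in the question.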

Thanks again!

Parskatt commented 9 months ago

Hi, good question.

The padding approach is what LoFTR does, if I recall correctly (or perhaps they crop). We chose not to do so. In some sense, resizing to a fixed size acts as augmentation.

It's possible that changing the dataloader could improve performance. Let me know what you find :)
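To make the augmentation point concrete, here is a small sketch of how a fixed-size resize stretches the two axes by different amounts. The function name `resize_scale_factors` and the example sizes are hypothetical and are not taken from the DKM code.

```python
def resize_scale_factors(h, w, ht, wt):
    """Hypothetical helper: per-axis scale factors applied when an (h, w)
    image is resized directly to (ht, wt), plus their ratio. A ratio of 1.0
    means the aspect ratio is preserved; anything else is a stretch."""
    sy = ht / h  # vertical scale
    sx = wt / w  # horizontal scale
    return sx, sy, sx / sy

# Landscape and portrait versions of the same image size, both resized
# to an assumed fixed target of ht=384, wt=512.
landscape = resize_scale_factors(h=800, w=1200, ht=384, wt=512)
portrait = resize_scale_factors(h=1200, w=800, ht=384, wt=512)
```

With these assumed sizes, the landscape image is squeezed slightly in width while the portrait image is stretched to twice its relative width, so the network sees a range of anisotropic scalings across the dataset.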

Master-cai commented 9 months ago

Thanks for your quick and insightful response! Treating it as augmentation is indeed a new perspective on the problem. I'm working on something related, and I will report back if I find anything impressive.