question about loading the training images

Master-cai commented 9 months ago

Thanks for your excellent work.

I noticed a confusing detail. When you load images from the MegaDepth dataset, you resize each image to a fixed ht and wt, which changes the original aspect ratio (some images may have been taken vertically). I'm wondering why you do not maintain the aspect ratio and pad the image to the specified size, which is common practice in other computer vision tasks. Is it because the padded areas significantly interfere with the estimation of the warp? If so, would masking out the warp generated by the padded areas be a good solution?
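To make the proposal concrete, here is a minimal sketch of the resize-and-pad geometry plus the validity mask that could be used to ignore the warp in padded regions. It is not DKM's dataloader: the names `letterbox_geometry` and `valid_mask` are hypothetical, padding is assumed to go on the bottom and right, and the 384x512 target size is only an illustrative value for ht and wt.

```python
from typing import List, NamedTuple


class Letterbox(NamedTuple):
    new_h: int       # resized height (aspect ratio preserved)
    new_w: int       # resized width (aspect ratio preserved)
    pad_bottom: int  # rows of padding added below the image
    pad_right: int   # columns of padding added to the right


def letterbox_geometry(h: int, w: int, ht: int, wt: int) -> Letterbox:
    """Size after an aspect-preserving resize into (ht, wt), plus padding."""
    # One uniform scale for both axes, chosen so the image fits the target.
    scale = min(ht / h, wt / w)
    new_h = min(ht, max(1, round(h * scale)))
    new_w = min(wt, max(1, round(w * scale)))
    return Letterbox(new_h, new_w, ht - new_h, wt - new_w)


def valid_mask(geom: Letterbox, ht: int, wt: int) -> List[List[bool]]:
    """True where a pixel holds image content, False in the padded area."""
    return [
        [row < geom.new_h and col < geom.new_w for col in range(wt)]
        for row in range(ht)
    ]


# A portrait image (600 high, 400 wide) placed into a 384x512 target.
geom = letterbox_geometry(600, 400, ht=384, wt=512)
mask = valid_mask(geom, ht=384, wt=512)
```

A loss on the warp could then be multiplied by this mask so that padded pixels contribute nothing; whether that helps in practice is exactly the open question raised here.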

Thanks again!

Parskatt commented 9 months ago

Hi, good question.

The padding approach is what LoFTR does, if I recall correctly (or perhaps they crop). We chose not to do so. In some sense, the change in aspect ratio acts as augmentation.

It's possible that changing the dataloader could improve performance. Let me know what you find :)
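As a rough illustration of the augmentation remark, resizing to a fixed ht and wt scales the two axes by different factors, so each image is stretched by an amount that depends on its original shape. The sketch below only computes those factors; `stretch_factors` is a hypothetical helper, and the 384x512 target is an assumed example size, not a value taken from the repository.

```python
from typing import Tuple


def stretch_factors(h: int, w: int, ht: int, wt: int) -> Tuple[float, float, float]:
    """Per-axis scales of a fixed-size resize and the resulting distortion.

    Returns (sx, sy, sx / sy). A ratio of 1.0 means the aspect ratio is
    preserved; anything else means the image is stretched along one axis.
    """
    sy = ht / h  # vertical scale
    sx = wt / w  # horizontal scale
    return sx, sy, sx / sy


# Portrait image (600 high, 400 wide): widened relative to its height.
portrait = stretch_factors(600, 400, ht=384, wt=512)

# Landscape image (400 high, 600 wide): only mildly distorted.
landscape = stretch_factors(400, 600, ht=384, wt=512)
```

Any pixel coordinates tied to the image (for example, ground-truth correspondences) would need to be scaled by the same per-axis factors `sx` and `sy` to stay consistent with the resized image.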

Master-cai commented 9 months ago

Thanks for your quick and insightful response! Augmentation is indeed a new perspective on the problem. I'm working on something related and will come back if I find anything impressive.

Parskatt / DKM

question about loading the training images #48