Open vicsyl opened 1 year ago
Some visual explanation of the problem:
@vpisarev I would like to contribute to this issue. I have already built OpenCV on my local system. Could you please share some resources regarding this issue so that I can work on it?
Also, which branch should I use as the base branch for this issue?
@RohanHBTU this has already been addressed in https://github.com/opencv/opencv/pull/23124
Addressed by https://github.com/opencv/opencv/pull/23124
The code in double_image in test_descriptors_regression.impl.hpp is copied to mirror the logic in sift.dispatch.cpp (function createInitialImage), to show that upscaling the image this way and then downscaling it again by nearest results in the same input image. As it wasn't straightforward where to export the function from within the production code, the logic was copied like that for now.
System Information
System agnostic
Detailed description
The default (though configurable) option is to first upscale the image to double its original size at the start of the scale pyramid. This is currently done by INTER_LINEAR_EXACT/INTER_LINEAR, which sample the image roughly equidistantly, treating pixels as squares that cover an area. In any case, this means that the interpolated image is the same along an axis and along the flipped axis (imagine rotating or reflecting the image). However, when the image is downsampled again by nearest, the expected keypoint location is shifted by 0.5 between downsampling on the original axis (taking even pixels only) and on the flipped axis (even pixels there = odd pixels on the original axis). The image downscaled back to the original size is therefore not the same on the flipped axis (i.e. this up/downscaling is not rotation/reflection equivariant) and has a bias of 1/4 pixels for rotations and (1-s)/4 pixels for downscaling (where s is the scale).

The solution is to use a different interpolation scheme for the upscaling in the scale pyramid, one that simply maps even pixel indices at 2x to pixels at x in the original image and interpolates between them for the odd indices. For the index 2d-1 (d is the original size of the image along a given axis), let's replicate (it would map to 2d-2 in the upscaled image, i.e. to d-1 in the original image). This way, upscaling followed by downscaling by 'nearest' would result in exactly the original input image, and the operation would thus become rotation/reflection equivariant. See the minimal example:
original axis: (0, 1, 2, 3) would become (0, 0.5, 1, 1.5, 2, 2.5, 3, 3) downscaled again by 'nearest': (0, 1, 2, 3)
flipped axis: (3, 2, 1, 0) would become (3, 2.5, 2, 1.5, 1, 0.5, 0, 0) downscaled again by 'nearest': (3, 2, 1, 0)
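The minimal example above can be reproduced with a small 1-D sketch in pure Python. This is only an illustration, not the OpenCV code: the function names (upscale_half_pixel, upscale_proposed, downscale_nearest) are hypothetical, upscale_half_pixel is a simplified model that assumes pixel-centre-aligned linear sampling with a replicated border, and 'nearest' downscaling is modelled as taking the even indices only.

```python
def upscale_half_pixel(row):
    """Simplified 1-D model of 2x linear upscaling with pixel-centre alignment.

    Assumption: output sample i reads the source at (i + 0.5) / 2 - 0.5,
    clamped to the valid range (replicated border).
    """
    d = len(row)
    out = []
    for i in range(2 * d):
        x = (i + 0.5) / 2 - 0.5
        x = min(max(x, 0.0), d - 1.0)
        lo = int(x)
        hi = min(lo + 1, d - 1)
        t = x - lo
        out.append((1 - t) * row[lo] + t * row[hi])
    return out


def upscale_proposed(row):
    """Proposed scheme: even index 2x copies row[x], odd indices average the
    two neighbours, and the last index 2d-1 replicates row[d-1]."""
    d = len(row)
    out = []
    for i in range(2 * d):
        left = row[i // 2]
        if i % 2 == 0:
            out.append(float(left))
        else:
            right = row[min(i // 2 + 1, d - 1)]
            out.append((left + right) / 2)
    return out


def downscale_nearest(row):
    """Model of 2x 'nearest' downscaling: keep the even indices only."""
    return row[::2]


axis = [0, 1, 2, 3]
flipped = axis[::-1]

# Proposed scheme: the round trip returns the input on both axes.
print(upscale_proposed(axis))     # [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.0]
print(upscale_proposed(flipped))  # [3.0, 2.5, 2.0, 1.5, 1.0, 0.5, 0.0, 0.0]

# Half-pixel model: the round trip differs between the axis and its flip.
print(downscale_nearest(upscale_half_pixel(axis)))           # [0.0, 0.75, 1.75, 2.75]
print(downscale_nearest(upscale_half_pixel(flipped))[::-1])  # [0.25, 1.25, 2.25, 3.0]
```

In this toy model the interior samples of the two half-pixel round trips differ by 0.5, which matches the shift described above, while the proposed scheme gives back (0, 1, 2, 3) and (3, 2, 1, 0) exactly.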
See this repo for notebooks showing the issue for OpenCV and Kornia: https://github.com/vicsyl/dog_precision
Most importantly https://github.com/vicsyl/dog_precision/blob/master/Accuracy%20of%20homography%20estimation.ipynb
Steps to reproduce
See https://github.com/vicsyl/dog_precision