ducha-aiki / affnet

Code and weights for the local feature affine shape estimation paper "Repeatability Is Not Enough: Learning Affine Regions via Discriminability"
MIT License

How to extract patches in Oxford Dataset? #6

Closed SikaStar closed 6 years ago

SikaStar commented 6 years ago

Hi, I cannot understand the details of extracting patches and I have some questions:

  1. How many patches did you extract from each image in the Oxford5k retrieval experiment?
  2. How are the patches extracted when you use the function extract_patches_from_pyr? Does it have something to do with "Spatial Transformer Networks"? I ask because you use the torch.nn.functional.affine_grid and torch.nn.functional.grid_sample functions from PyTorch in LAF.py.
  3. What are the parameters scale_pyramid and pyr_inv_idxs in extract_patches_from_pyramid_with_inv_index?
ducha-aiki commented 6 years ago

Hi,

  1. I ran https://github.com/perdoch/hesaff as the original detector -- it prints the number of detections to the console, e.g.:

Detected 3774 keypoints and 2178 affine shapes in 0.57952 sec.

Then I took the first number (3774) and used it as the desired number of features.
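For reference, that count can be scraped from the console line with a tiny parser (a hypothetical helper for illustration, not part of the repo):

```python
import re

def parse_hesaff_count(line):
    """Extract the keypoint count from a hesaff console line.

    Hypothetical helper: assumes the 'Detected N keypoints ...' format
    shown above; returns None if the line does not match.
    """
    m = re.search(r"Detected (\d+) keypoints", line)
    return int(m.group(1)) if m else None
```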

  2. You just use it as shown here: https://github.com/ducha-aiki/affnet/blob/master/examples/hesaffnet/WBS%20demo.ipynb

    LAFs, resp = det(img)
    patches = detector.extract_patches_from_pyr(LAFs, PS = 32)

"Spatial transformer networks" are just differentiable bilinear image sampling. I used it, because it is GPU-accelerated, not because of gradients. So it is just a function to extract the patches.

  3. If you extract a huge patch from the image and then downsample it to 32x32, you will have a HUGE aliasing problem. To avoid this, one needs to first filter out the high frequencies with a Gaussian blur as a low-pass filter. https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem

One can do this per patch, but that would be extremely slow, especially on GPU. We compared CPU timings in this paper: ftp://cmp.felk.cvut.cz/pub/cmp/articles/matas/lenc-2014-features-cvww.pdf Instead, one can create a Gaussian pyramid: the image blurred by progressively larger amounts and downsampled. Then, depending on the scale of the feature (i.e., the size of the patch in the original image), one selects the proper pyramid level and extracts the patch directly from there. https://en.wikipedia.org/wiki/Pyramid_(image_processing)
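A minimal sketch of such a blur-then-subsample pyramid and of picking a level from the patch size. All function names and the level-selection rule here are illustrative assumptions, not the repo's implementation:

```python
import numpy as np

def gaussian_blur(img, sigma):
    # separable Gaussian low-pass filter via 1-D convolutions
    r = int(3 * sigma + 0.5)
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode='same'), 0, img)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode='same'), 1, out)

def build_pyramid(img, n_levels=4, sigma=1.6):
    # each level: blur the previous one, then subsample by 2
    levels = [img]
    for _ in range(n_levels - 1):
        levels.append(gaussian_blur(levels[-1], sigma)[::2, ::2])
    return levels

def level_for_scale(patch_size_in_image, PS=32):
    # pick the coarsest level whose downsampling factor still leaves
    # at least PS pixels across the measurement region (assumed rule)
    ratio = patch_size_in_image / PS
    return max(0, int(np.floor(np.log2(max(ratio, 1.0)))))
```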

I also suggest you read the original SIFT paper, where the typical pipeline for local feature extraction is explained: https://www.robots.ox.ac.uk/~vgg/research/affine/det_eval_files/lowe_ijcv2004.pdf

ducha-aiki commented 6 years ago

@SikaStar code added. Comment out this line https://github.com/ducha-aiki/affnet/blob/master/examples/hesaffnet/hesaffnet.py#L26 to turn on the threshold version instead of the "num_features" one.

ducha-aiki commented 6 years ago

@SikaStar you might be interested in this: https://github.com/ducha-aiki/mods-light-zmq It integrates the PyTorch AffNet and HardNet with state-of-the-art C++ matching code. It also has a feature-extraction option, the one used in the AffNet paper.