QVPR / Patch-NetVLAD

Code for the CVPR2021 paper "Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition"
MIT License

Question about the function get_integral_feature(feat_in) #81

Closed euncheolChoi closed 1 year ago

euncheolChoi commented 1 year ago

First of all, thank you for sharing your great work.

I'd like to know why the get_integral_feature(feat_in) function of the Patch-NetVLAD model computes a cumulative sum. For example, when a tensor of size [5, 8192, 30, 40] comes in, the function takes the cumulative sum along dim=-1 and then dim=-2, and I'd like to understand why.

Referring to the paper, it is explained that these expressions help the subsequent operation of extracting patch descriptors at multiple scales.

  1. I wonder how the integral feature representation helps with computation.
  2. I'm also wondering whether the integral feature itself is meaningful as a feature in its own right.
  3. Lastly, to recover the patch feature at an arbitrary scale, the code uses the kernel [[1, -1], [-1, 1]] in a convolution layer (in the function get_square_regions_from_integral(feat_integral, patch_size, patch_stride)). I'd like to know how a patch feature at any scale can be obtained with this kernel.

Thanks in advance.

Tobias-Fischer commented 1 year ago

@StephenHausler - one for you, please

StephenHausler commented 1 year ago

Hi @euncheolChoi

Sure, I can help with this. I've provided answers to your questions below.

1) The integral feature helps by aggregating patch features on the GPU, using standard PyTorch functions like conv2d. The main computational benefit appears when multiple patch sizes are used at the same time, since a single integral feature can be reused for all patch sizes; without it, we would need to perform separate indexing per patch size, which would be slower.
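For reference, here is a minimal sketch of the idea (not the exact repository code): the integral feature is just a double cumulative sum with a zero top-left border, and once computed it can be reused for every patch size.

```python
import torch
import torch.nn.functional as F

def get_integral_feature(feat_in):
    # Running sum along width, then height, gives an integral image per channel:
    # I[y, x] = sum of feat_in over all rows <= y and cols <= x.
    feat_out = torch.cumsum(feat_in, dim=-1)
    feat_out = torch.cumsum(feat_out, dim=-2)
    # Zero-pad one row and one column at the top-left so patches touching the
    # border can also be expressed as a four-corner difference later.
    return F.pad(feat_out, (1, 0, 1, 0), mode='constant', value=0)

# Reduced channel count for illustration; the question's tensor is [5, 8192, 30, 40].
feat = torch.randn(1, 4, 30, 40)
integral = get_integral_feature(feat)   # computed once ...
# ... then reused for every patch size (2, 5, 8, ...) afterwards, instead of
# re-indexing the raw feature map once per scale.
```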

2) I don't think it does, since there are no learnt parameters in the integral feature computation.

3) Hmm, I'm not sure changing the kernel is the right approach. In our code, a patch feature at any scale can be recovered by changing the dilation parameter on line 68 of patchnetvlad.py; I think changing this would achieve what you need. Based on the mathematics of an integral feature, changing the kernel would not provide the desired effect (or at least, a larger kernel would not be a neat mapping from kernel size to patch size), but changing the dilation parameter will.
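To make the dilation point concrete, here is a hedged sketch (mirroring, but not copied verbatim from, get_square_regions_from_integral): the fixed [[1, -1], [-1, 1]] kernel always takes a four-corner difference of the integral feature, and the dilation parameter sets how far apart those corners are, which is exactly the patch size.

```python
import torch
import torch.nn.functional as F

def get_integral_feature(feat_in):
    # Double cumulative sum with a zero top-left border.
    feat_out = torch.cumsum(torch.cumsum(feat_in, dim=-1), dim=-2)
    return F.pad(feat_out, (1, 0, 1, 0), mode='constant', value=0)

def get_square_regions_from_integral(feat_integral, patch_size, patch_stride):
    # Four-corner difference on the integral image:
    #   +I(top-left) - I(top-right) - I(bottom-left) + I(bottom-right)
    # The 2x2 kernel itself never changes; dilation moves the corners apart.
    n, d, _, _ = feat_integral.shape
    kernel = torch.tensor([[[[1.0, -1.0], [-1.0, 1.0]]]],
                          device=feat_integral.device)
    kernel = kernel.repeat(d, 1, 1, 1)  # one identical 2x2 kernel per channel
    # With dilation = patch_size, each output element equals the sum of a
    # patch_size x patch_size window of the original feature map.
    return F.conv2d(feat_integral, kernel, stride=patch_stride,
                    dilation=patch_size, groups=d)

feat = torch.arange(16.0).reshape(1, 1, 4, 4)
integral = get_integral_feature(feat)
patches = get_square_regions_from_integral(integral, patch_size=2, patch_stride=1)
```

Changing patch_size here changes only the dilation argument, so one integral feature serves every scale.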

Hope this helps, let me know if you have any other questions.