naver / cgd

Combination of Multiple Global Descriptors for Image Retrieval
https://arxiv.org/abs/1903.10663
Apache License 2.0
147 stars 14 forks

What is the meaning of 'non-conventional usage' of backbone at Table 7 in the Paper? #4

Open knaffe opened 4 years ago

knaffe commented 4 years ago

Thank you for your great work on this repo and paper! I notice that the ResNet-50 result with non-conventional usage has the best performance. I want to know how to implement this 'non-conventional usage'. Does it mean 'discarding the down-sampling operation between stage3 and stage4' in section 3.1 of the paper? Thanks a lot.

kobiso commented 4 years ago

Thanks for your interest in our paper. To your question: yes, it means 'discarding the down-sampling operation between stage3 and stage4' in section 3.1 of the paper. So you are right :)

knaffe commented 4 years ago

Thank you for your response! So, 'discarding the down-sampling operation between stage3 and stage4' means changing the stride of the last ResNet stage (layer4) from 2 to 1? That is, last stride = 1?

I have been reimplementing your work in PyTorch. When evaluating on CUB-200-2011, I only get 62% Recall@1, so I believe I am missing some important details. A few questions:

  1. Batch sampling: do you shuffle all samples and take 128 per batch, or use the P-K sampling format (P classes, K samples per class)?
  2. I find that removing the L2 norm and FC after the GD (as mentioned in another issue) gives higher performance, but it still does not reach your reported results on CUB-200. Do you know the reason for this?
  3. Could you share some training tricks? Looking forward to your guidance. Thank you so much!
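For question 1, this is my current understanding of the P-K format (a hypothetical sketch; `pk_batch` is my own name, not from the repo):

```python
import random
from collections import defaultdict

def pk_batch(labels, P=32, K=4, rng=None):
    """Draw one batch of P classes with K samples each
    (P * K = 128 would match a 128-sample batch).
    Hypothetical sketch, not the authors' sampler."""
    rng = rng or random.Random(0)
    by_class = defaultdict(list)
    for idx, c in enumerate(labels):
        by_class[c].append(idx)
    chosen = rng.sample(sorted(by_class), P)  # P distinct classes
    batch = []
    for c in chosen:
        pool = by_class[c]
        # K distinct samples per class; fall back to sampling with
        # replacement when a class has fewer than K images.
        picks = rng.sample(pool, K) if len(pool) >= K else rng.choices(pool, k=K)
        batch.extend(picks)
    return batch

labels = [i // 10 for i in range(200)]  # toy dataset: 20 classes x 10 images
batch = pk_batch(labels, P=8, K=4)
print(len(batch))  # 32 indices: 8 classes x 4 samples each
```

This guarantees every batch contains positive pairs for the ranking loss, which plain shuffling does not.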