Hi, the results for ImageNet labels come from a training with Sobel filtering; see Section 5.3 of the DeepCluster paper:

> Table 5 reports the performance of a VGG-16 trained with different approaches obtained with Sobel filtering, except for Doersch et al. [25] and Wang et al. [46]. This preprocessing improves by 5.5 points the mAP of a supervised VGG-16 on the Oxford dataset, but not on Paris.
I assume you haven't trained your ImageNet-labels baseline with Sobel, which would explain why you are not reproducing these numbers (especially the large performance gap on Oxford).
Hi, I've been investigating the difference in performance for the DeeperCluster models. When loading DeeperCluster models in the `eval_retrieval.py` code, you should adjust the padding used for the Sobel layers carefully.
Indeed, the VGG-16 models with Sobel filtering from this repo have a padding of 1 (see `sobel.1`):
```
(sobel): Sequential(
  (0): Conv2d(3, 1, kernel_size=(1, 1), stride=(1, 1))
  (1): Conv2d(1, 2, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
```
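For reference, here is a minimal sketch of how such a frozen Sobel front-end is typically built in PyTorch: a fixed 1x1 conv averaging RGB to grayscale, followed by a fixed 3x3 conv holding the two Sobel kernels. The weight values below are the standard Sobel kernels and an illustrative grayscale average; check the repo's source for the exact values it uses.

```python
import torch
import torch.nn as nn

# Fixed (non-learned) Sobel front-end matching the printed layout:
# a 1x1 conv for grayscale conversion, then a 3x3 conv with the two Sobel kernels.
sobel = nn.Sequential(
    nn.Conv2d(3, 1, kernel_size=1, stride=1),
    nn.Conv2d(1, 2, kernel_size=3, stride=1, padding=1),
)
sobel[0].weight.data.fill_(1.0 / 3.0)  # average the RGB channels
sobel[0].bias.data.zero_()
sobel[1].weight.data = torch.tensor(
    [[[[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]]],  # vertical edges
     [[[1.0, 2.0, 1.0], [0.0, 0.0, 0.0], [-1.0, -2.0, -1.0]]]]  # horizontal edges
)
sobel[1].bias.data.zero_()
for p in sobel.parameters():
    p.requires_grad = False  # the Sobel layers stay frozen during training
```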
However, the VGG-16 models with Sobel filtering from the DeeperCluster repo have a padding of 2 (see the `padding` layer):
```
(padding): ConstantPad2d(padding=(2, 2, 2, 2), value=0.0)
(sobel): Sequential(
  (0): Conv2d(3, 1, kernel_size=(1, 1), stride=(1, 1))
  (1): Conv2d(1, 2, kernel_size=(3, 3), stride=(1, 1))
)
```
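The effect of this difference is easy to see on a dummy input: with padding 1 the Sobel block preserves the spatial size, whereas DeeperCluster's padding of 2 enlarges it by one pixel per side, so every downstream feature map changes size. A quick check reproducing the two layouts printed above:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 224, 224)

# This repo: padding baked into the 3x3 Sobel conv.
sobel_deepcluster = nn.Sequential(
    nn.Conv2d(3, 1, kernel_size=1),
    nn.Conv2d(1, 2, kernel_size=3, padding=1),
)

# DeeperCluster: explicit padding layer, 3x3 conv without padding.
sobel_deepercluster = nn.Sequential(
    nn.ConstantPad2d(2, 0.0),
    nn.Conv2d(3, 1, kernel_size=1),
    nn.Conv2d(1, 2, kernel_size=3),
)

print(sobel_deepcluster(x).shape)    # torch.Size([1, 2, 224, 224])
print(sobel_deepercluster(x).shape)  # torch.Size([1, 2, 226, 226])
```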
Hence, when using DeeperCluster VGG-16 models with Sobel filtering in this repo, you should increase the effective padding to 2 instead of 1. For example, you can do so by adding this line:

```python
vc = torch.nn.ConstantPad2d(1, 0)(vc)
```
just before this line and this line.
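To convince yourself that this one-liner exactly reproduces DeeperCluster's behaviour, you can check that an extra zero-padding of 1 in front of a padding-1 conv matches a zero-padding of 2 in front of the same conv without padding (a standalone sanity check, not code from either repo):

```python
import torch
import torch.nn as nn

conv_pad1 = nn.Conv2d(1, 2, kernel_size=3, padding=1, bias=False)  # this repo's sobel.1
conv_pad0 = nn.Conv2d(1, 2, kernel_size=3, bias=False)             # DeeperCluster's sobel.1
conv_pad0.weight.data = conv_pad1.weight.data.clone()              # same kernels

x = torch.randn(1, 1, 32, 32)
out_fix = conv_pad1(nn.ConstantPad2d(1, 0)(x))    # the suggested fix: extra pad of 1
out_ref = conv_pad0(nn.ConstantPad2d(2, 0.0)(x))  # DeeperCluster's layout: pad of 2
print(torch.allclose(out_fix, out_ref))           # True
```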
This way, you should be able to reproduce the numbers from the DeeperCluster paper.
Hope that helps
Hello,
Thank you for addressing the issue.
I saw the caption of the corresponding table, but found the code misleading: when `MODEL='pretrained'` here, the code loads a pre-trained model from the torchvision repo here. Instead, I expected you to share the pre-trained ImageNet model.
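That is, with `MODEL='pretrained'` the baseline is presumably just the standard torchvision checkpoint, which is trained on raw RGB inputs with no Sobel layer at all (illustrative call; the exact loading code is in the repo):

```python
import torchvision.models as models

# Supervised ImageNet VGG-16 from torchvision: trained on raw RGB,
# without Sobel preprocessing, so it differs from the Sobel-trained baseline.
vgg16 = models.vgg16(pretrained=True)
```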
I haven't re-run the evaluation code with the modifications you suggested, but I find it quite interesting that this level of fine detail in the Sobel filtering (the padding, for instance) affects the results significantly.
Thanks
Dear @mathildecaron31
I am trying to reproduce the retrieval scores and am having the following small issue. I downloaded the datasets and compiled the evaluation code according to your instructions in `eval_retrieval.sh`. When I run the code for the "ImageNet pre-trained" and "DeeperCluster" models, I get the following results: …

The default setting of `eval_retrieval.sh` evaluates models on Paris, so to evaluate the models on Oxford I set … Do you see any mistake here? Why do you think I get significantly lower results for "ImageNet labels" and slightly lower results for "DeeperCluster" on Oxford?
Many thanks.