Hi @Jack-xiaoxin,
Thank you for your interest in my research. residual_det_0.0_eccv_000 is a model that does not use NetVLAD; instead, it flattens the last fully convolutional VGG layer into a global feature. If you use it in combination with a NetVLAD layer, that layer will have newly initialized weights, and I would expect it to perform badly. To skip the NetVLAD layer, you need to pass the --vlad_cores 0 flag when calling "002_any_inference.py". Have you also tried some of my models that were trained with NetVLAD? Another thing I have observed is that many models are sensitive to the size of the input image.
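For example, combining --vlad_cores 0 with the paths from your command below:

```
python 002_any_inference.py --set tokyo_query --vlad_cores 0 --checkpoint ../residual_det_0.0_eccv_000/epoch-checkpoint-2 --out_root ../residual_det_0.0_eccv_000/tokyo --log_dir ../residual_det_0.0_eccv_000/tokyo
```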
In my image matching pipeline and evaluations, I use euclidean distance, not cosine similarity. All hyperparameter optimization was also done with euclidean-distance-based matching. Perhaps you could try euclidean distance as well?
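For illustration, a minimal numpy sketch of euclidean-distance retrieval (not my exact pipeline code; the function name, array names, and shapes are just for this example):

```python
import numpy as np

def euclidean_topk(query_feats, ref_feats, k=10):
    # query_feats: (num_queries, dim), ref_feats: (num_refs, dim)
    # Squared euclidean distance via ||q - r||^2 = ||q||^2 - 2 q.r + ||r||^2
    q_sq = np.sum(query_feats ** 2, axis=1, keepdims=True)   # (num_queries, 1)
    r_sq = np.sum(ref_feats ** 2, axis=1)                    # (num_refs,)
    dists = q_sq - 2.0 * query_feats @ ref_feats.T + r_sq    # (num_queries, num_refs)
    # Indices of the k nearest references per query, nearest first
    return np.argsort(dists, axis=1)[:, :k]
```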
Does this solve your problem?
Hi @janinethoma,
Thanks for your code and models! They will be quite useful for my further research.
However, I could not achieve the desired results in evaluation; the results on the Tokyo 24/7 dataset are not ideal. My experiment steps are as follows, and I would be very grateful if you could help me deploy your models in an appropriate way.
1. Download the model "residual_det_0.0_eccv_000" and run "000_build_eval_lists.py" to get "tokyo_query.csv" and "tokyo_ref.csv". (I have prepared the file "tokyo.mat".)
2、run "002_any_inference.py" and the command is
python 002_any_inference.py --set tokyo_query --checkpoint ../residual_det_0.0_eccv_000/epoch-checkpoint-2 --out_root ../residual_det_0.0_eccv_000/tokyo --log_dir ../residual_det_0.0_eccv_000/tokyo
Then I got "tokyo_query_None.pickle" and in the same way I got "tokyo_ref_None.pickle". The file "tokyo_query_None.pickle" saved a 315 × 32768 matrix and the file "tokyo_ref_None.pickle" saved a 75984 × 32768 matrix, which are query features and reference features.3、Reducing the dimension of the query and reference features to 4096 by PCA, then I got "tokyo_query.npy" and "tokyo_ref.npy" which saved a 315 × 4096 and a 75984 × 4096 matrix respectively.
4. Using cosine similarity to compare every query feature with the reference features, I find the nearest [1, 5, 10] reference images for each query image. According to the ground truth, I get the recall@[1, 5, 10] results below (a sketch of this retrieval and recall computation is also after this list):
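For reference, the PCA step (step 3) was roughly the following. I fit the PCA on the reference features; the pickles in my case held plain arrays, so please correct me if they are structured differently:

```python
import pickle
import numpy as np
from sklearn.decomposition import PCA

with open('tokyo_query_None.pickle', 'rb') as f:
    query = pickle.load(f)                 # (315, 32768) query features
with open('tokyo_ref_None.pickle', 'rb') as f:
    ref = pickle.load(f)                   # (75984, 32768) reference features

pca = PCA(n_components=4096)
pca.fit(ref)                               # fit on the reference set only

np.save('tokyo_query.npy', pca.transform(query))
np.save('tokyo_ref.npy', pca.transform(ref))
```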
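And the retrieval and recall@N computation (step 4) was roughly this; positives[i] holds the reference indices counted as correct for query i, built from the Tokyo 24/7 ground truth:

```python
import numpy as np

q = np.load('tokyo_query.npy')             # (315, 4096)
r = np.load('tokyo_ref.npy')               # (75984, 4096)

# Cosine similarity = dot product of L2-normalised features
q = q / np.linalg.norm(q, axis=1, keepdims=True)
r = r / np.linalg.norm(r, axis=1, keepdims=True)
sim = q @ r.T                              # (315, 75984), higher = more similar

ranked = np.argsort(-sim, axis=1)          # best match first

def recall_at_n(ranked, positives, ns=(1, 5, 10)):
    # A query counts as correct at N if any of its top-N matches is a positive
    num_queries = ranked.shape[0]
    return {
        n: sum(len(np.intersect1d(ranked[i, :n], positives[i])) > 0
               for i in range(num_queries)) / num_queries
        for n in ns
    }
```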
Here is an explanation of the evaluation metric recall@[1, 5, 10], quoted from the paper "NetVLAD: CNN Architecture for Weakly Supervised Place Recognition":
As you can see in the figure above, my test results on Tokyo 24/7 are low, so I think something must be wrong. Maybe I made a mistake or missed something? I would really appreciate any advice you could give.