andrefaraujo / videosearch

Large-scale video retrieval using image queries.

Training trained_parameters but getting a very low mAP #13

Closed NinaJina closed 5 years ago

NinaJina commented 6 years ago

Hi @andrefaraujo, I am trying to train trained_parameters, following #10, but I get a very low mAP and P@1 on the stanford600k dataset (about 0.1 mAP and 0.26 P@1). I want to describe some details, and hope you can give me some advice on how to improve the mAP. (A rough sketch of steps 1 and 2 is at the end of this comment.)

  1. I first sampled about 20000 frames from the whole dataset, gathered all the local descriptors of those 20000 frames, and shuffled them. I then sampled about 150*20000 local descriptors from this pool, and used the pca function in the yael lib to reduce the 128-dim descriptors to 32 dims.
  2. Next, I used the gmm function in yael to train the GMM. I set niter to 60 and the number of centroids to 512. Training the GMM takes about 10 hours.
  3. For the corr_weights, I still use the old file (sift.pre_alpha.0.50.pca.32.gmm.512.pre_alpha.0.50.corr_weights). I am actually not sure whether this must also be regenerated, but since I am using the same SIFT descriptors, I don't think it is the reason the mAP is so low.

Is there anything wrong in this process? Can you give me some advice? Thanks!
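For reference, here is a rough sketch of how I do steps 1 and 2 (descriptor subsampling and PCA projection). The function names and the eigenvector layout are my own simplifications for illustration, not code taken from this repository or from yael:

```cpp
// Sketch only: assumes eigenvectors are stored row-major, one principal
// direction per row, sorted by decreasing eigenvalue. Please check yael's
// actual PCA output layout before reusing this.
#include <algorithm>
#include <random>
#include <vector>

// Randomly keep `num_keep` descriptors out of `num_desc` (each of dimension
// `dim`), returned as one flat row-major buffer, as gmm_learn expects.
std::vector<float> SubsampleDescriptors(const std::vector<float>& desc,
                                        int num_desc, int dim, int num_keep,
                                        unsigned seed = 0) {
  std::vector<int> idx(num_desc);
  for (int i = 0; i < num_desc; ++i) idx[i] = i;
  std::mt19937 rng(seed);
  std::shuffle(idx.begin(), idx.end(), rng);
  num_keep = std::min(num_keep, num_desc);
  std::vector<float> out(static_cast<size_t>(num_keep) * dim);
  for (int i = 0; i < num_keep; ++i)
    std::copy(desc.begin() + static_cast<size_t>(idx[i]) * dim,
              desc.begin() + static_cast<size_t>(idx[i] + 1) * dim,
              out.begin() + static_cast<size_t>(i) * dim);
  return out;
}

// Center each descriptor with the PCA mean and project it onto the top
// `out_dim` eigenvectors: y = E * (x - mean), with E stored row-major
// (out_dim x in_dim). For my setup, in_dim = 128 and out_dim = 32.
std::vector<float> PcaProject(const std::vector<float>& desc, int num_desc,
                              int in_dim, int out_dim,
                              const std::vector<float>& mean,
                              const std::vector<float>& eigvecs) {
  std::vector<float> out(static_cast<size_t>(num_desc) * out_dim, 0.0f);
  for (int i = 0; i < num_desc; ++i)
    for (int j = 0; j < out_dim; ++j) {
      float acc = 0.0f;
      for (int k = 0; k < in_dim; ++k)
        acc += eigvecs[static_cast<size_t>(j) * in_dim + k] *
               (desc[static_cast<size_t>(i) * in_dim + k] - mean[k]);
      out[static_cast<size_t>(i) * out_dim + j] = acc;
    }
  return out;
}
```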

andrefaraujo commented 6 years ago

Hi @NinaJina, that sounds correct. You are right that the corr_weights should not be the reason for such low performance.

Some issues I can think of:

NinaJina commented 6 years ago

Hi @andrefaraujo, thanks for your reply. The local descriptors I'm using are exactly the SIFT descriptors extracted by this repository. I also tested the parameters on a small part of the data (the 201302xx portion of stanford600k and the 201302 dir in the queries; I selected the relevant ground truth from all the GTs), which gives 0.23 mAP and 0.26 P@1. Moreover, if I use only the 201302xx data to train the parameters (6000 sampled frames, so 150*6000 local descriptors in the end, and 40 GMM iterations), I get 0.25 mAP and 0.26 P@1. [I don't know whether these two pieces of information help, but the result is higher than random, which suggests the GMM parameters work to some extent.]

BTW, I want to confirm several things (a small sketch of how I fill the descriptor buffer for point 2 is at the end of this comment):

  1. Do sift.pre_alpha.0.50.desc_eigenvectors and sift.pre_alpha.0.50.desc_covariance correspond to pca->eigvec and pca->cov?
  2. In `float* pDesc = new float[nNumFeatures*nDescLength]; gmm_t* g = gmm_learn(d, n, k, niter, pDesc, 1, 1, redo, 0);`, does `*(pDesc + i*d + j)` mean the j-th element of the i-th SIFT descriptor? (The description of the matrix layout in the yael lib seems a little confusing to me.)
  3. Did I omit any step which may cause the low mAP?

Thanks!
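For point 2, this is a minimal sketch of how I fill pDesc and call gmm_learn, assuming row-major layout. The include path and the meaning of the trailing arguments (threads / seed / redo / flags) are just my reading of the yael header, so please correct me if I got that wrong:

```cpp
extern "C" {
#include <yael/gmm.h>  // assumed include path; adjust to your yael install
}
#include <vector>

// Flatten descriptors into a row-major buffer and train the GMM.
// Assumed layout: element j of descriptor i lives at pDesc[i*d + j].
gmm_t* TrainGmm(const std::vector<std::vector<float>>& descriptors,
                int d, int k, int niter, int redo) {
  const int n = static_cast<int>(descriptors.size());
  std::vector<float> pDesc(static_cast<size_t>(n) * d);
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < d; ++j)
      pDesc[static_cast<size_t>(i) * d + j] = descriptors[i][j];
  // 1 thread, seed 1, `redo` restarts, no extra flags (matching the snippet above).
  return gmm_learn(d, n, k, niter, pDesc.data(), 1, 1, redo, 0);
}
```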

andrefaraujo commented 6 years ago
  1. Yes, but sift.pre_alpha.0.50.desc_covariance also contains the descriptor mean (you can see that the descriptor mean is read here using this file). A rough sketch of the file layout I have in mind is after this list.
  2. The code snippet you pasted here is from the yael library, right? I think you are correct, but I could be wrong (it's been some time since I've used yael). But, again, I think you are right.
  3. So, what settings are you using here? First, I guess you are using the DoG detector, right? (i.e., not the Hessian-Affine one, see this link). Usually, the Hessian-Affine detector tends to work better, but that may not explain the low performance you are currently getting. Second, which type of retrieval are you doing (frame-based, shot-based, scene-based)? The latter will give lower results, but should be a lot faster, and may be combined with the former ones to improve mAP. Third, which asym_scoring_mode (link) are you using? The default (QAGS) should be reasonable, but SGS is also worth trying.
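Roughly, the layout I mean in point 1 looks like the sketch below. This is written from memory rather than copied from the actual code, and it assumes the file is plain binary floats with the d-dimensional mean first and the d x d covariance after it; please check the reading code linked above for the exact on-disk format:

```cpp
#include <cstdio>
#include <vector>

// Sketch: read the descriptor mean, then the covariance matrix, from the
// desc_covariance file. The raw-float layout is an assumption.
bool ReadMeanAndCovariance(const char* path, int d,
                           std::vector<float>* mean,
                           std::vector<float>* covariance) {
  FILE* f = std::fopen(path, "rb");
  if (!f) return false;
  mean->resize(d);
  covariance->resize(static_cast<size_t>(d) * d);
  bool ok =
      std::fread(mean->data(), sizeof(float), d, f) == static_cast<size_t>(d) &&
      std::fread(covariance->data(), sizeof(float), covariance->size(), f) ==
          covariance->size();
  std::fclose(f);
  return ok;
}
```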

These are some ideas, let me know how it goes :)

NinaJina commented 6 years ago

Hi @andrefaraujo, sorry for the late reply. I have been very busy these days, so I have had little time to continue the experiments.

It has been a long time since my last reply, so let me first briefly restate the issue: I am trying to generate the parameters in trained_parameters, but I get 0.1 mAP on the stanford600k dataset.

Regarding the points you raised last time:

  1. I am using the default settings: the DoG detector, frame-based retrieval, and the default QAGS.

  2. Previously I only saved pca->cov in sift.pre_alpha.0.50.desc_covariance; now I first save pca->mean in that file and then pca->cov, as you suggested. But this only brings a very slight improvement. Since pca->mean was missing from the file, I think the code previously used the first row of pca->cov as pca->mean by mistake. As far as I can tell, the code here just subtracts a mean vector from each feature vector, so even if I was using the wrong value before, it should not greatly hurt mAP, because it is only a constant offset. [I'm not sure whether this reasoning is correct.] (A rough sketch of how I write this file now is after this list.)

  3. Since the code here only loads pca->mean, and nowhere else in the code uses sift.pre_alpha.0.50.desc_covariance again, I think pca->cov is not actually used by the code. Is that correct?

  4. I tried increasing the training data, and that seems to help a lot. Previously I sampled 150*20000 SIFT descriptors from stanford600k; now I sample 150*80000, and that improves the mAP from 0.1 to 0.2. Is this increase in mAP reasonable for 4 times more training data? And how much data do you think is reasonable for training the GMM for stanford600k?

  5. I am now sampling SIFT descriptors from stanford600k to train the GMM, but your paper mentions that you sampled SIFT descriptors from a Flickr dataset. Do you think Flickr is better?
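For point 2, this is roughly how I now write sift.pre_alpha.0.50.desc_covariance: pca->mean first, then pca->cov, as plain binary floats. This mirrors my understanding of how the retrieval code reads the file back, but the exact expected format may differ, so please let me know if this looks wrong:

```cpp
#include <cstdio>

// Sketch: write the d-dim mean followed by the d x d covariance as raw floats.
bool WriteMeanAndCovariance(const char* path, int d,
                            const float* mean, const float* covariance) {
  FILE* f = std::fopen(path, "wb");
  if (!f) return false;
  const size_t cov_size = static_cast<size_t>(d) * d;
  bool ok =
      std::fwrite(mean, sizeof(float), d, f) == static_cast<size_t>(d) &&
      std::fwrite(covariance, sizeof(float), cov_size, f) == cov_size;
  std::fclose(f);
  return ok;
}
```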

Thanks again for your reply! I will continue the experiments, and if there is any progress, I'll report back here!

andrefaraujo commented 6 years ago
  2. Possibly, the mean and the first row of the covariance matrix are small values, close to zero. That could be a reason for not seeing much of a difference in performance.
  3. Correct.
  4. Great! Yes, that seems reasonable.
  5. In the paper, we prefer to train parameters on datasets that are different from the datasets we use for retrieval experiments. In terms of performance, I believe you should get higher performance if training directly on Stanford600k.
andrefaraujo commented 5 years ago

Closing due to lack of activity, please feel free to re-open if necessary.