VisualComputingInstitute / triplet-reid

Code for reproducing the results of our "In Defense of the Triplet Loss for Person Re-Identification" paper.
https://arxiv.org/abs/1703.07737
MIT License

Performance not as good on MARS [soln: combine by avg] #9

Open YurongYou opened 7 years ago

YurongYou commented 7 years ago

Hi, we have been doing some experiments to reproduce your results on the Market1501 and MARS datasets. Using exactly the same hyperparameters and training strategy as in your paper, we successfully reproduced the results on the Market1501 dataset. However, we could not reproduce the result on MARS under the same settings; the rank-1 CMC is only 75. Do you have any ideas on this? Thanks!

Pandoro commented 7 years ago

Hi!

Are you using the code we uploaded recently, or are you using your own implementation? Indeed, we did not rerun our MARS experiments with the latest code push, but it should be straightforward to do so. Apart from the change to TensorFlow, everything is very similar to the original code used for the paper experiments. There we used the exact same setup for both MARS and Market-1501.

In order to run the code with MARS, you will need to create the .csv files for the MARS train and test sets, which should be sufficient to train. For evaluation you will need to create a new "matcher" in our code, since the evaluation is a little different. Alternatively, you could use the original MATLAB code from the MARS dataset to evaluate the performance. However, there you also need to make some changes to how the files are preprocessed. The embeddings are optimized for a Euclidean distance metric and should not be renormalized or preprocessed by PCA, both of which are done by default in the original evaluation script.
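To make the evaluation caveat concrete, here is a minimal sketch (not from the repository; the function name and shapes are assumptions) of ranking query embeddings against a gallery with a plain Euclidean distance, using the embeddings raw, i.e. with neither L2 renormalization nor PCA applied:

```python
import numpy as np

def rank_gallery(query_emb, gallery_emb):
    """Return, for each query, gallery indices sorted by Euclidean distance.

    query_emb:   (num_queries, dim) array of raw embeddings.
    gallery_emb: (num_gallery, dim) array of raw embeddings.
    """
    # Squared distances via the expansion ||q - g||^2 = ||q||^2 - 2 q.g + ||g||^2;
    # squaring preserves the ranking, so no sqrt is needed.
    d2 = (
        np.sum(query_emb ** 2, axis=1, keepdims=True)
        - 2.0 * query_emb @ gallery_emb.T
        + np.sum(gallery_emb ** 2, axis=1)
    )
    return np.argsort(d2, axis=1)
```

The point is only that the distances are computed on the embeddings as the network produces them; any renormalization step would have to be removed from the MATLAB script first.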

We don't have time for it right now, but we might find time in a few weeks to make the code changes needed to directly run MARS experiments with our new code.

Pandoro commented 7 years ago

In fact, another addition to my previous answer: MARS is evaluated on a tracklet level, not on an image level. This means that writing a new matcher alone will not be sufficient, given the batched evaluation style we used here.

An additional important thing to consider in that case is that the original evaluation code pools the embeddings of images in a tracklet by using the maximum value. In our case the mean value of all the embeddings worked better.
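The pooling step described above can be sketched as follows (a hypothetical helper, not code from the repository): all image embeddings of one tracklet are collapsed into a single vector, either by the element-wise maximum as in the original MARS evaluation, or by the mean, which worked better in this setting:

```python
import numpy as np

def pool_tracklet(image_embeddings, mode="mean"):
    """Pool per-image embeddings of one tracklet into a single embedding.

    image_embeddings: (num_images, embedding_dim) array.
    mode: "mean" (reported to work better here) or "max" (original MARS script).
    """
    if mode == "mean":
        return image_embeddings.mean(axis=0)
    if mode == "max":
        return image_embeddings.max(axis=0)
    raise ValueError(f"unknown pooling mode: {mode}")
```

The pooled vectors are then compared with the Euclidean distance as usual, so the rest of the evaluation pipeline is unchanged.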

YurongYou commented 7 years ago

Hi! Thanks for your advice! We switched from max pooling to mean pooling and obtained similar results:

Euclidean distance: mAP = 0.684650, r1 precision = 0.804545
Re-ranked Euclidean distance: mAP = 0.765708, r1 precision = 0.816667