bnu-wangxun / Deep_Metric

Deep Metric Learning
Apache License 2.0

Pretrained models #4

Open Zhongdao opened 6 years ago

Zhongdao commented 6 years ago

Hi, thanks for your great work! But it seems that the pre-trained models were removed from the cloud drive. Could you please kindly upload them again?

bnu-wangxun commented 6 years ago

Thx. I am sorry the model has been deleted from the cloud drive.

You can download by yourself from here: http://data.lip6.fr/cadene/pretrainedmodels/bn_inception-239d2248.pth

And I will make the project clearer next week.

Zhongdao commented 6 years ago

Thanks a lot! I found this pre-trained model at https://github.com/Cadene/pretrained-models.pytorch#inception; is this exactly the one you use?

bnu-wangxun commented 6 years ago

Yes! BN-inception

URL: http://data.lip6.fr/cadene/pretrainedmodels/bn_inception-239d2248.pth
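
For reference, here is a minimal sketch of two ways to obtain these weights; it assumes the `pretrainedmodels` package from the repository linked above is installed, while the raw-URL route only needs PyTorch.

```python
import torch.utils.model_zoo as model_zoo

# Option 1: download the raw state dict from the URL in this thread.
state_dict = model_zoo.load_url(
    'http://data.lip6.fr/cadene/pretrainedmodels/bn_inception-239d2248.pth')

# Option 2: let the pretrainedmodels package build BN-Inception and load the
# same ImageNet weights for you.
import pretrainedmodels
model = pretrainedmodels.__dict__['bninception'](num_classes=1000, pretrained='imagenet')
```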

Zhongdao commented 6 years ago

Hi @bnulihaixia, I have a small question about Deep Metric Learning; it would be great if you could give me some explanation. I just began reading Deep Metric Learning papers a few days ago, and I found it very confusing that almost no one in these DML papers uses a classification loss (i.e. softmax cross-entropy) to fine-tune on the CUB200/Cars196 datasets and serves it as a baseline. In your latest experimental results, knnSoftmax gets a 60+ recall@1 on CUB, which I think is reasonable. However, in my experiment (based on your code) with a simple softmax loss, I can also get a 60+ result. I am not sure whether this result is reasonable, since it is rather high compared with other DML methods. So I am wondering whether you have tried the softmax loss, and if so, what its approximate performance is.

P.S. I found a paper that uses the softmax loss as a baseline. Indeed, they argue that the softmax loss outperforms many DML methods, but they only report around 51% recall@1 on CUB.
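
For concreteness, here is a rough sketch of the kind of softmax baseline being discussed: replace the ImageNet classifier of BN-Inception with a 200-way layer for CUB-200 and fine-tune with plain cross-entropy. The hyper-parameters and the `loader` are illustrative assumptions, not the exact settings used in this repo or in my experiments.

```python
import torch
import torch.nn as nn
import pretrainedmodels

# BN-Inception pretrained on ImageNet, with a 200-way classifier for CUB-200.
model = pretrainedmodels.__dict__['bninception'](num_classes=1000, pretrained='imagenet')
model.last_linear = nn.Linear(model.last_linear.in_features, 200)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9, weight_decay=5e-4)

def train_one_epoch(model, loader):
    # loader is assumed to yield (images, labels) batches from the CUB-200 train split
    model.train()
    for images, labels in loader:
        loss = criterion(model(images), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```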

Zhongdao commented 6 years ago

The paper that uses the softmax loss as a baseline: https://arxiv.org/pdf/1712.10151v1.pdf

bnu-wangxun commented 6 years ago

Thanks for your question.

  1. I think the introduction of the lifted structured loss paper has already given the answer.

  2. Before reading the paper, I thought the softmax loss might give good results on CUB and Cars, but would not perform very well on Online Products.

I took a quick look at the paper, and things are just as I expected. This is my opinion; it may not be right.

bnu-wangxun commented 6 years ago
[screenshot attached]

Zhongdao commented 6 years ago

@bnulihaixia I agree with that. Metric losses do better when training data per class becomes very scarce, and their complexity doesn't increase w.r.t. the number of classes. These are the advantages of metric losses.

However, even though we know softmax has its drawbacks, I still think the softmax loss should be a baseline, since some metric losses have a similar form to softmax [1,2]. Researchers can argue that their metric losses perform better than softmax in some scenarios, but I don't think it's a good idea to just ignore softmax.

Moreover, the linear complexity of softmax can also be addressed; see [3]. It seems that SenseTime trains a very large-scale softmax loss (10M+ classes) with this method and achieves pretty good performance on face recognition.

Thank you for your time! Zhongdao

[1] Movshovitz-Attias et al., No Fuss Distance Metric Learning Using Proxies.
[2] Meyer et al., Nearest Neighbour Radial Basis Function Solvers for Deep Neural Networks.
[3] Zhang et al., Accelerated Training for Massive Classification via Dynamic Class Selection.
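
A toy sketch of the idea behind [3] follows; it is not the hashing-forest class selection the paper actually uses, only an illustration of the general principle of computing the softmax over the classes present in the batch plus a random sample of negatives, so the per-step cost no longer grows with the full class count.

```python
import torch
import torch.nn.functional as F

def sampled_softmax_loss(features, labels, weight, num_sampled=1024):
    """features: (B, D) embeddings; labels: (B,) class ids; weight: (N, D) classifier."""
    num_classes = weight.size(0)
    pos = labels.unique()
    # sample extra negative classes; classes already in the batch are always kept
    neg = torch.randperm(num_classes, device=weight.device)[:num_sampled]
    active = torch.cat([pos, neg]).unique()          # classes kept for this step
    # remap the original class ids to indices inside the active subset
    remap = torch.full((num_classes,), -1, dtype=torch.long, device=weight.device)
    remap[active] = torch.arange(active.numel(), device=weight.device)
    logits = features @ weight[active].t()           # (B, |active|) instead of (B, N)
    return F.cross_entropy(logits, remap[labels])
```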

bnu-wangxun commented 6 years ago

I agree with you that the softmax loss should be a baseline. [2] is just the same loss as ONCA in [1], and I have pointed this out in my README. The loss can be explained as a classification loss in some way, and can also be interpreted as a standard metric loss, namely a weighted version of the triplet loss.

Thanks for providing the paper [3]. I will read it in the next few days.

[1] Salakhutdinov, R., Hinton, G.: Learning a Nonlinear Embedding by Preserving Class Neighbourhood Structure. In: International Conference on Artificial Intelligence and Statistics (2007).
[2] Meyer et al., Nearest Neighbour Radial Basis Function Solvers for Deep Neural Networks.
[3] Zhang et al., Accelerated Training for Massive Classification via Dynamic Class Selection.
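
To make the "same form as softmax" point concrete, here is a minimal sketch of an NCA-style loss in the spirit of [1] and [2] above, in its proxy-based simplification (one learnable proxy per class, as in the Proxy-NCA paper from the earlier comment). This simplified variant keeps the positive proxy in the denominator, which is exactly why it reduces to a softmax classification loss; the squared-Euclidean distance and L2 normalisation are assumptions of the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProxyNCALoss(nn.Module):
    """One learnable proxy per class; cross-entropy over negative distances."""
    def __init__(self, num_classes, embed_dim):
        super().__init__()
        self.proxies = nn.Parameter(torch.randn(num_classes, embed_dim))

    def forward(self, embeddings, labels):
        x = F.normalize(embeddings, dim=1)
        p = F.normalize(self.proxies, dim=1)
        dists = torch.cdist(x, p) ** 2      # (batch, num_classes) squared distances
        # Softmax over -distances: a "pull" toward the own-class proxy and a
        # weighted "push" from all other proxies, i.e. a weighted triplet loss.
        return F.cross_entropy(-dists, labels)
```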

bnu-wangxun commented 6 years ago

@Zhongdao Regarding the paper that uses the softmax loss as a baseline (https://arxiv.org/pdf/1712.10151v1.pdf): I have read it carefully over the last few days. You reach 60+ recall@1 on CUB, while the paper only reaches 51. I think the reason may be that you used an embedding dimension of 512 while the paper used an embedding dimension of 64. If you use the same embedding dimension, I think you would get similar performance!
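
For readers comparing these numbers, here is a small sketch of how recall@K (recall@1 in the quotes above) is typically computed on CUB/Cars: for each test embedding, check whether any of its K nearest neighbours, excluding itself, shares its label.

```python
import torch

def recall_at_k(embeddings, labels, k=1):
    """embeddings: (N, D) tensor; labels: (N,) tensor of class ids."""
    dists = torch.cdist(embeddings, embeddings)      # pairwise distances
    dists.fill_diagonal_(float('inf'))               # never match a sample to itself
    knn = dists.topk(k, largest=False).indices       # (N, k) nearest neighbours
    hit = (labels[knn] == labels.unsqueeze(1)).any(dim=1)
    return hit.float().mean().item()
```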

Zhongdao commented 6 years ago

@bnulihaixia You're right, the embedding size matters. Actually, I did not add an embedding layer and directly fine-tuned on top of the pool5 layer; from my observation, this brings some performance gain.
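
A small sketch of the two choices being compared here: (a) use the global pool5 output of BN-Inception directly as the embedding, or (b) add a linear embedding layer (e.g. 64-d as in the cited paper, or 512-d). The `features(...)` call and the 1024-d pool5 width follow the pretrainedmodels BN-Inception; treat them as assumptions if you use a different backbone implementation.

```python
import torch.nn as nn
import pretrainedmodels

class EmbeddingNet(nn.Module):
    def __init__(self, embed_dim=None):
        super().__init__()
        self.backbone = pretrainedmodels.__dict__['bninception'](
            num_classes=1000, pretrained='imagenet')
        self.pool = nn.AdaptiveAvgPool2d(1)
        # embed_dim=None  -> option (a): raw 1024-d pool5 feature as the embedding
        # embed_dim=64/512 -> option (b): learned embedding layer on top of pool5
        self.embed = nn.Linear(1024, embed_dim) if embed_dim else nn.Identity()

    def forward(self, x):
        feat = self.pool(self.backbone.features(x)).flatten(1)  # global pool5 feature
        return self.embed(feat)
```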

bnu-wangxun commented 6 years ago

@Zhongdao I have some questions about the details of your softmax training process. Could I have your telephone number or WeChat?

XinshaoAmosWang commented 5 years ago

@bnulihaixia

Hi haixia, I am very interested in your great work. I am also working on deep metric learning and ReID; is there any chance I could have your WeChat for easier communication?

Thanks so much.

bnu-wangxun commented 5 years ago

Yes, please give me your WeChat account.

kumamonatseu commented 5 years ago

Hi, can I also have your WeChat? I am a research intern from Face++, and I am also doing research on metric learning!

bnu-wangxun commented 5 years ago

Please send me your WeChat account, and I will add you.