Hi @ZhangYuef, thanks for your attention.
As you can see, the pretraining is in fact simply a softmax classification loss, with every identity being a unique class. I didn't pay much attention to it or tune it, so I don't remember the exact values of the hyperparameters. But they should be somewhere around:

- epochs: 40
- batch size: 64 (not sure)
- lr: 1e-2
- wd: 1e-2
You may tune these a bit to obtain reasonable results.
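A minimal sketch of that pretraining setup, assuming a standard ImageNet-initialized ResNet-50; `num_ids`, `train_loader`, and the momentum value are placeholders, not the repo's actual code:

```python
import torch
import torch.nn as nn
import torchvision.models as models

num_ids = 751  # placeholder: number of training identities

# ImageNet-initialized ResNet-50 with the classifier replaced so that
# every identity is a unique class
model = models.resnet50(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, num_ids)
model = model.cuda()

criterion = nn.CrossEntropyLoss()  # plain softmax classification loss
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2,
                            momentum=0.9, weight_decay=1e-2)

for epoch in range(40):                # ~40 epochs, per the values above
    for images, pids in train_loader:  # train_loader is assumed
        logits = model(images.cuda())
        loss = criterion(logits, pids.cuda())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```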
Hi @KovenYu, thanks for sharing. Following your settings, I reproduced the results of your pretrained model, but I found that I can't get the results in the paper when I use this pretrained model in the second-stage training. I think the parameter distribution of the pretrained model matters a lot for the parameter settings of the second-stage training. Can you share the pretraining code? Thank you very much.
Hi @moodom, thank you for your attention. Did you try using the provided pretrained model, and is that working?
Hi @KovenYu. I used the provided pretrained model and got a good result. But I also trained a pretrained model myself with the LAL loss as described in the paper, with the unit-norm constraint removed; when I used that model in the second stage of training, rank-1 only reached about 56. I tried adjusting the LR and WD, but the results were the same. I also measured the average parameter value of the provided pretrained model's FC layer and the Euclidean distances between the FC-layer column vectors. The results are as follows:

- average of FC-layer parameters: -0.00755771
- mean of column-vector Euclidean distances: 413379.0
- standard deviation of column-vector Euclidean distances: 1.8415e+08

I think that is a very good result: the parameters are very small, but the distances are very large. The pretrained model I trained did not reach that level. Do you use any other training tricks?
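For reference, a minimal sketch of how such statistics can be computed from a checkpoint; the file name and the `fc.weight` key are assumptions about the state-dict layout, and each row of the PyTorch FC weight is taken as one class vector:

```python
import torch

# 'pretrained.pth' and the 'fc.weight' key are placeholders; adjust
# them to the actual checkpoint layout.
state = torch.load('pretrained.pth', map_location='cpu')
W = state['fc.weight']   # PyTorch layout: (num_classes, feat_dim)

print('average of FC parameters:', W.mean().item())

# pairwise Euclidean distances between the per-class weight vectors
d = torch.cdist(W, W)    # (num_classes, num_classes)
off_diag = d[~torch.eye(len(W), dtype=torch.bool)]
print('mean pairwise distance:', off_diag.mean().item())
print('std of pairwise distances:', off_diag.std().item())
```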
@moodom thank you for your detailed description! I looked at the pretraining code and found two notable points: $$a$$ and $$f(x)$$ are not normalized, and the scale factor 30 is also not used.
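A sketch of the difference between the two formulations, assuming the agents $$a$$ are stored as the rows of the FC weight matrix `W` (that layout is an assumption, not the repo's code):

```python
import torch
import torch.nn.functional as F

def logits_pretrain(f, W):
    # pretraining code per the point above: plain inner products,
    # no normalization of f(x) or a, no scale factor
    return f @ W.t()

def logits_paper(f, W, scale=30.0):
    # paper-style formulation: L2-normalize both the features f(x)
    # and the agents a (rows of W), then multiply by the scale 30
    return scale * F.normalize(f, dim=1) @ F.normalize(W, dim=1).t()
```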
Thank you for sharing the code. I set the corresponding parameters according to your description and wanted to redo the pre-training with loss_al. However, with the resulting pre-trained weights, the second stage of training produces a large number of NaNs. The following is my pre-training code: https://github.com/pzhren/Papers/blob/master/%E7%9B%AE%E6%A0%87%E6%A3%80%E6%B5%8B%E4%B8%8Ere-id%E4%BB%BB%E5%8A%A1/MAR-master/src/pretrain.py#L6
The following are my environment and settings during pre-training:

python version: 3.5.4 |Continuum Analytics, Inc.| (default, Aug 14 2017, 13:26:58) [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
torch version: 1.1.0
```
do not use pre-trained model. train from scratch. loaded pre-trained model from ../data/resnet50-19c8e357.pth
==>>[2020-03-20 18:12:12] [Epoch=000/060] Stage 1, [Need: 00:00:00] Iter: [000/969] Freq 37.5 loss_total 8.316 loss_source 8.316
```
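Not a fix from the repo, just a generic pattern I can use to localize and guard NaNs of this kind, assuming the loss takes a log over softmax probabilities:

```python
import torch

# flag the first backward op that produces NaN/Inf (slow; debug only)
torch.autograd.set_detect_anomaly(True)

def safe_log(p, eps=1e-8):
    # clamp before the log so that p == 0 cannot produce -inf and
    # then NaN gradients
    return torch.log(p.clamp(min=eps))
```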
Thanks for sharing. I find that the paper, in Section 4.2, mentions that the model is first pretrained using only $$L_{AL}$$. But I don't see how to pretrain the model from the current code; I need some more detailed instructions, e.g., how many epochs I should pretrain the model for.
Thanks >.<