jixunbo / LightMBN

MIT License

I run your code but the performance is low #11

Closed mumuner closed 3 years ago

mumuner commented 3 years ago

Hi, I ran your code but the performance is low. The test results are as follows:

mAP: 0.2936 rank1: 0.5246 rank3: 0.6838 rank5: 0.7524 rank10: 0.8284 (Best: 0.2936 @epoch 130.0) Time used: 416 m 0 s

The loss is also still high:

[INFO] Epoch: 130 Learning rate: 6.00e-07 Time used: 409 m 54 s
[INFO] [130/130] 234/234 [CrossEntropy: 8.623139][MSLoss: 1.460011][Total: 10.083151] Time used: 412 m 52 s

How can I solve this problem?

Ostyk commented 3 years ago

Hi, that's odd. Are you training on Market-1501? I'm around halfway through my training and I'm getting a much lower loss than you.

mumuner commented 3 years ago

Yes, I'm training on Market-1501. I have tried twice and got almost the same result both times.

jixunbo commented 3 years ago

Is the problem solved? If not, could you show me your config file? And have you successfully downloaded the pretrained osnet weights? You can also try resnet.

Ostyk commented 3 years ago

Ok, my training finished.

[INFO] mAP: 0.8938 rank1: 0.9596 rank3: 0.9825 rank5: 0.9887 rank10: 0.9920 (Best: 0.8952 @epoch 120.0) Time used: 322 m 25 s

I used the default parameters with resnet.
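For readers comparing the numbers in this thread: rank-k is the fraction of queries with a correct gallery match within the top k results, and mAP is the mean over all queries of the average precision. The following is a minimal sketch for a single query, assuming the gallery has already been sorted by distance and reduced to a list of booleans. The function names are illustrative and not taken from LightMBN, and the Market-1501 convention of discarding same-camera matches is left out.

```python
from typing import List


def rank_k_hit(matches: List[bool], k: int) -> bool:
    """True if any of the top-k ranked gallery items is a correct match."""
    return any(matches[:k])


def average_precision(matches: List[bool]) -> float:
    """Mean of the precision values measured at each correct match position."""
    hits = 0
    precisions = []
    for position, is_match in enumerate(matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / position)
    # A query with no correct gallery item scores 0 here (an assumption;
    # evaluation code often skips such queries instead).
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical ranking for one query: correct matches at positions 2 and 4.
ranking = [False, True, False, True, False]
print(rank_k_hit(ranking, 1))      # False
print(rank_k_hit(ranking, 3))      # True
print(average_precision(ranking))  # 0.5
```

Averaging `rank_k_hit` over all queries gives the rank-k score, and averaging `average_precision` gives mAP.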

mumuner commented 3 years ago

Hi, I downloaded the pretrained file, and the log shows:

Successfully loaded imagenet pretrained weights from "/home/zhouzh/ShuJia_KeYan/LightMBN-master/pth/osnet_x1_0_imagenet.pth"

The email attachment is lmbn_config.yaml; I only changed the dataset path. But I still can't get a good result on market-1501 with osnet, which is strange.

And I trained on dukemtmc with osnet:

What's more, I also trained on market-1501 with resnet-50:

mAP: 0.8435 rank1: 0.9350 rank3: 0.9700 rank5: 0.9798 rank10: 0.9884 (Best: 0.8453 @epoch 110.0) Time used: 172 m 52 s

Ostyk commented 3 years ago

Hi, yeah, I just checked the performance by running the test after training, and the results are low:

**************** Summary ****************
  train            : ['market1501']
  # train datasets : 1
  # train ids      : 751
  # train images   : 12936
  # train cameras  : 6
  test             : ['market1501']
  # query images   : 3368
  # gallery images : 15913
  *****************************************

[INFO] Building LMBN_n model... Time used: 0 m 1 s
Successfully loaded imagenet pretrained weights from "/home/mostyk/.cache/torch/checkpoints/osnet_x1_0_imagenet.pth"
/home/mostyk/DTIQ/PoC/DunkinDonuts/env_DD/lib/python3.6/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
[INFO] Model parameters: 3584 flops: 2430837088 Time used: 0 m 4 s
[INFO] GPU: NVIDIA GeForce RTX 3090 Time used: 0 m 4 s
[INFO] Starting from epoch 1 Time used: 0 m 4 s

[INFO] Test: Time used: 0 m 4 s
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[INFO] mAP: 0.0510 rank1: 0.1755 rank3: 0.2616 rank5: 0.3049 rank10: 0.3705 (Best: 0.0510 @epoch 0.0) Time used: 4 m 51 s
(env_DD) (base) mostyk@linux-spc-ai-01:~/DTIQ/LightMBN$

jixunbo commented 3 years ago

I’ll check it this week. I saw that some warnings occurred. Which PyTorch version are you using? I was using 1.7.

mumuner commented 3 years ago

Hi, my torch version is 1.8.1.

jixunbo commented 3 years ago

Hi, I have tested the LMBN_n model on Google Colab, and the performance after the 20th epoch is:

[INFO] mAP: 0.8419 rank1: 0.9323 rank3: 0.9647 rank5: 0.9754 rank10: 0.9857 (Best: 0.8419 @epoch 20.0)

Some warnings still appear, though. Maybe you could try it on Colab and check again?

Ostyk commented 3 years ago

Hi, thanks for the update. I've actually managed to fix it. I had to specify the model path a bit differently, because the code would read the checkpoint but then reset the model to its initial weights (not the trained ones).

jixunbo commented 3 years ago

Yes, that could be the reason, but why would it reset? At which step or line?

mumuner commented 3 years ago

Hi, I found that the error happened here:

/home/zhouzh/ShuJia_KeYan/LightMBN-master/utils/utility.py:279: UserWarning: The pretrained weights "/home/zhouzh/ShuJia_KeYan/LightMBN-master/experiment/2021-10-08-21:21:58/model-latest.pth" cannot be loaded, please check the key names manually ( ignored and continue )

Replacing the code on lines 256 to 289 with model.load_state_dict(state_dict) solves the problem.
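The warning above says the checkpoint's key names did not line up with the model's, so the weights were skipped and the run continued without them. By contrast, `model.load_state_dict(state_dict)` is strict by default and raises an error on missing or unexpected keys, which makes the mismatch visible. To see what differs before loading, the two key sets can be compared directly. The sketch below is a hypothetical helper, not part of LightMBN. It works on plain key names and assumes, as one common cause, that the checkpoint keys carry a wrapper prefix such as `module.`. The thread does not confirm that this was the cause here.

```python
from typing import Dict, Iterable, List, Tuple


def strip_prefix(state: Dict[str, object], prefix: str) -> Dict[str, object]:
    """Return a copy of `state` with `prefix` removed from keys that start with it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state.items()
    }


def compare_keys(model_keys: Iterable[str],
                 checkpoint_keys: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) key names, each sorted.

    missing:    keys the model expects but the checkpoint lacks.
    unexpected: keys in the checkpoint that the model does not have.
    """
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    return sorted(model_set - ckpt_set), sorted(ckpt_set - model_set)


# Hypothetical key names and placeholder values, for illustration only.
model_keys = ["backbone.conv1.weight", "classifier.weight"]
checkpoint = {"module.backbone.conv1.weight": 0, "module.classifier.weight": 0}

missing, unexpected = compare_keys(model_keys, checkpoint)
print(len(missing), len(unexpected))  # 2 2

fixed = strip_prefix(checkpoint, "module.")
print(compare_keys(model_keys, fixed))  # ([], [])
```

With a real model, the inputs would be `model.state_dict().keys()` and the dictionary returned by `torch.load(path, map_location="cpu")`. If that file stores the weights under a nested entry, the nested dictionary is the one to compare.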