JDAI-CV / fast-reid

SOTA Re-identification Methods and Toolbox
Apache License 2.0

SBS R50/S50 metrics very low #641

Closed mikeseven closed 2 years ago

mikeseven commented 2 years ago

When I run sbs_R50 or sbs_S50, the metrics are very low, close to 0. Is this a bug in the model or in how the metrics are reported?

L1aoXingyu commented 2 years ago

Please provide your training configuration.

RuoyuFeng commented 2 years ago

I have the same problem, and I am using the original config unchanged: configs/Market1501/sbs_R50.yml

L1aoXingyu commented 2 years ago

Did you train the model with a single GPU or multiple GPUs?

Please provide more information if you need others' help.

RuoyuFeng commented 2 years ago

I tried both single and multiple GPUs; in both cases the loss decreases very slowly.

The log of the first 7 epochs is:

[03/13 14:47:22] fastreid.utils.checkpoint INFO: No checkpoint found. Training model from scratch
[03/13 14:47:22] fastreid.engine.train_loop INFO: Starting training from epoch 0
[03/13 14:47:22] fastreid.engine.hooks INFO: Freeze layer group "backbone" training for 1000 iterations
[03/13 14:48:18] fastreid.utils.events INFO: eta: 0:52:50 epoch/iter: 0/199 total_loss: 50.39 loss_cls: 50.39 loss_triplet: 0 time: 0.2666 data_time: 0.0025 lr: 6.59e-05 max_mem: 3369M
[03/13 14:48:19] fastreid.utils.events INFO: eta: 0:52:49 epoch/iter: 0/201 total_loss: 50.39 loss_cls: 50.39 loss_triplet: 0 time: 0.2666 data_time: 0.0027 lr: 6.62e-05 max_mem: 3369M
[03/13 14:49:11] fastreid.utils.events INFO: eta: 0:51:41 epoch/iter: 1/399 total_loss: 50.31 loss_cls: 50.31 loss_triplet: 0 time: 0.2643 data_time: 0.0015 lr: 9.74e-05 max_mem: 3369M
[03/13 14:49:12] fastreid.utils.events INFO: eta: 0:51:40 epoch/iter: 1/403 total_loss: 50.3 loss_cls: 50.3 loss_triplet: 0 time: 0.2642 data_time: 0.0016 lr: 9.80e-05 max_mem: 3369M
[03/13 14:50:05] fastreid.utils.events INFO: eta: 0:50:56 epoch/iter: 2/599 total_loss: 50.17 loss_cls: 50.17 loss_triplet: 0 time: 0.2658 data_time: 0.0014 lr: 1.29e-04 max_mem: 3369M
[03/13 14:50:07] fastreid.utils.events INFO: eta: 0:50:53 epoch/iter: 2/605 total_loss: 50.17 loss_cls: 50.17 loss_triplet: 0 time: 0.2658 data_time: 0.0015 lr: 1.30e-04 max_mem: 3369M
[03/13 14:50:58] fastreid.utils.events INFO: eta: 0:49:53 epoch/iter: 3/799 total_loss: 50.08 loss_cls: 50.08 loss_triplet: 0 time: 0.2650 data_time: 0.0014 lr: 1.60e-04 max_mem: 3369M
[03/13 14:51:00] fastreid.utils.events INFO: eta: 0:49:50 epoch/iter: 3/807 total_loss: 50.08 loss_cls: 50.08 loss_triplet: 0 time: 0.2650 data_time: 0.0013 lr: 1.62e-04 max_mem: 3369M
[03/13 14:51:51] fastreid.utils.events INFO: eta: 0:49:02 epoch/iter: 4/999 total_loss: 50.01 loss_cls: 50.01 loss_triplet: 0 time: 0.2648 data_time: 0.0012 lr: 1.92e-04 max_mem: 3369M
[03/13 14:51:51] fastreid.engine.hooks INFO: Open layer group "backbone" training
[03/13 14:51:53] fastreid.utils.events INFO: eta: 0:48:58 epoch/iter: 4/1009 total_loss: 50.02 loss_cls: 50.02 loss_triplet: 0 time: 0.2645 data_time: 0.0013 lr: 1.93e-04 max_mem: 3369M
[03/13 14:52:40] fastreid.utils.events INFO: eta: 0:47:19 epoch/iter: 5/1199 total_loss: 50.17 loss_cls: 50.17 loss_triplet: 0 time: 0.2608 data_time: 0.0013 lr: 2.23e-04 max_mem: 3369M
[03/13 14:52:43] fastreid.utils.events INFO: eta: 0:47:11 epoch/iter: 5/1211 total_loss: 50.17 loss_cls: 50.17 loss_triplet: 0 time: 0.2605 data_time: 0.0013 lr: 2.25e-04 max_mem: 3369M
[03/13 14:53:31] fastreid.utils.events INFO: eta: 0:45:48 epoch/iter: 6/1399 total_loss: 50.22 loss_cls: 50.22 loss_triplet: 0 time: 0.2582 data_time: 0.0017 lr: 2.55e-04 max_mem: 3369M
[03/13 14:53:34] fastreid.utils.events INFO: eta: 0:45:43 epoch/iter: 6/1413 total_loss: 50.2 loss_cls: 50.2 loss_triplet: 0 time: 0.2581 data_time: 0.0016 lr: 2.57e-04 max_mem: 3369M
[03/13 14:54:22] fastreid.utils.events INFO: eta: 0:44:13 epoch/iter: 7/1599 total_loss: 50.2 loss_cls: 50.2 loss_triplet: 0 time: 0.2568 data_time: 0.0019 lr: 2.86e-04 max_mem: 3369M
[03/13 14:54:26] fastreid.utils.events INFO: eta: 0:44:06 epoch/iter: 7/1615 total_loss: 50.27 loss_cls: 50.27 loss_triplet: 0 time: 0.2566 data_time: 0.0017 lr: 2.88e-04 max_mem: 3369M
[03/13 14:54:59] fastreid.engine.hooks INFO: Overall training speed: 1743 iterations in 0:07:26 (0.2559 s / it)
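To put numbers on the trend in a log like this, the per-iteration metrics can be pulled out with a small parser. The following is a minimal sketch, not part of fast-reid: the regular expression and the `parse_metrics` helper are hypothetical names written against the line layout shown above, and the three sample lines are copied from that log (the first entry, the last entry before the backbone is unfrozen, and the last entry).

```python
import re

# Three entries copied verbatim from the log above.
LOG = """\
[03/13 14:48:18] fastreid.utils.events INFO: eta: 0:52:50 epoch/iter: 0/199 total_loss: 50.39 loss_cls: 50.39 loss_triplet: 0 time: 0.2666 data_time: 0.0025 lr: 6.59e-05 max_mem: 3369M
[03/13 14:51:51] fastreid.utils.events INFO: eta: 0:49:02 epoch/iter: 4/999 total_loss: 50.01 loss_cls: 50.01 loss_triplet: 0 time: 0.2648 data_time: 0.0012 lr: 1.92e-04 max_mem: 3369M
[03/13 14:54:26] fastreid.utils.events INFO: eta: 0:44:06 epoch/iter: 7/1615 total_loss: 50.27 loss_cls: 50.27 loss_triplet: 0 time: 0.2566 data_time: 0.0017 lr: 2.88e-04 max_mem: 3369M
"""

# Hypothetical pattern matching the metric fields as they appear in this log.
PATTERN = re.compile(
    r"epoch/iter: (?P<epoch>\d+)/(?P<iter>\d+)\s+"
    r"total_loss: (?P<total>[\d.]+)\s+"
    r"loss_cls: (?P<cls>[\d.]+)\s+"
    r"loss_triplet: (?P<tri>[\d.]+)"
)


def parse_metrics(text):
    """Return one dict per metrics entry found in the log text."""
    rows = []
    for m in PATTERN.finditer(text):
        rows.append({
            "iter": int(m["iter"]),
            "total_loss": float(m["total"]),
            "loss_cls": float(m["cls"]),
            "loss_triplet": float(m["tri"]),
        })
    return rows


rows = parse_metrics(LOG)
# Change in total loss between the first and last parsed entries.
drop = rows[0]["total_loss"] - rows[-1]["total_loss"]
# True when every parsed entry reports a triplet loss of exactly 0.
triplet_always_zero = all(r["loss_triplet"] == 0.0 for r in rows)

print(f"iters {rows[0]['iter']}..{rows[-1]['iter']}: total_loss fell by {drop:.2f}")
print(f"loss_triplet is 0 in every entry: {triplet_always_zero}")
```

On these samples the total loss barely moves and `loss_triplet` is 0 in every entry, so `total_loss` equals `loss_cls` throughout. The sketch only surfaces that pattern; it does not say what causes it.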

The config is the same as the default:

`_BASE_: Base-bagtricks.yml

MODEL:
  FREEZE_LAYERS: [ backbone ]

  BACKBONE:
    WITH_NL: True

  HEADS:
    NECK_FEAT: after
    POOL_LAYER: GeneralizedMeanPoolingP
    CLS_LAYER: CircleSoftmax
    SCALE: 64
    MARGIN: 0.35

  LOSSES:
    NAME: ("CrossEntropyLoss", "TripletLoss",)
    CE:
      EPSILON: 0.1
      SCALE: 1.0

    TRI:
      MARGIN: 0.0
      HARD_MINING: True
      NORM_FEAT: False
      SCALE: 1.0

INPUT:
  SIZE_TRAIN: [ 384, 128 ]
  SIZE_TEST: [ 384, 128 ]

  AUTOAUG:
    ENABLED: True
    PROB: 0.1

DATALOADER:
  NUM_INSTANCE: 16

SOLVER:
  AMP:
    ENABLED: True
  OPT: Adam
  MAX_EPOCH: 60
  BASE_LR: 0.00035
  WEIGHT_DECAY: 0.0005
  IMS_PER_BATCH: 64

  SCHED: CosineAnnealingLR
  DELAY_EPOCHS: 30
  ETA_MIN_LR: 0.0000007

  WARMUP_FACTOR: 0.1
  WARMUP_ITERS: 2000

  FREEZE_ITERS: 1000

  CHECKPOINT_PERIOD: 20

TEST:
  EVAL_PERIOD: 10
  IMS_PER_BATCH: 128

CUDNN_BENCHMARK: True `
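One quick sanity check is whether the learning rates in the log match the warmup settings in this config. The sketch below assumes the warmup follows the common linear form, where the scale ramps from `WARMUP_FACTOR` to 1.0 over `WARMUP_ITERS` iterations. That form is an assumption and has not been checked against the fast-reid source, and `linear_warmup_lr` is a hypothetical helper. The logged values are copied from the log above.

```python
# Values taken from the SOLVER section of the config above.
BASE_LR = 0.00035
WARMUP_FACTOR = 0.1
WARMUP_ITERS = 2000


def linear_warmup_lr(iteration):
    """Assumed linear warmup: the LR scale ramps from WARMUP_FACTOR to 1.0."""
    if iteration >= WARMUP_ITERS:
        # After warmup this sketch just returns BASE_LR; the cosine phase is not modeled.
        return BASE_LR
    alpha = iteration / WARMUP_ITERS
    return BASE_LR * (WARMUP_FACTOR * (1 - alpha) + alpha)


# (iteration, lr) pairs copied from the training log above.
logged = {199: 6.59e-05, 999: 1.92e-04, 1599: 2.86e-04}

for iteration, lr in logged.items():
    predicted = linear_warmup_lr(iteration)
    print(f"iter {iteration}: logged {lr:.3e}, predicted {predicted:.3e}")
```

Under that assumption the predicted values land within about 1e-6 of the logged ones, which suggests the learning rate is ramping as configured and is still in warmup at iteration 1615.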

L1aoXingyu commented 2 years ago

Which dataset do you use?

RuoyuFeng commented 2 years ago

market1501

github-actions[bot] commented 2 years ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 2 years ago

This issue was closed because it has been inactive for 14 days since being marked as stale.

giaanthunder commented 1 year ago

I'm facing the same issue. Did you solve it? If so, please share your solution.