Closed mikeseven closed 2 years ago
Please provide your training configuration.
I have the same problem, and I'm just using the original config: configs/Market1501/sbs_R50.yml
Did you train the model with a single GPU or multiple GPUs?
Please provide more information if you need others' help.
I tried both single and multiple GPUs; in both cases the loss decreases very slowly.
The log of the first 7 epochs is:
[03/13 14:47:22] fastreid.utils.checkpoint INFO: No checkpoint found. Training model from scratch
[03/13 14:47:22] fastreid.engine.train_loop INFO: Starting training from epoch 0
[03/13 14:47:22] fastreid.engine.hooks INFO: Freeze layer group "backbone" training for 1000 iterations
[03/13 14:48:18] fastreid.utils.events INFO: eta: 0:52:50 epoch/iter: 0/199 total_loss: 50.39 loss_cls: 50.39 loss_triplet: 0 time: 0.2666 data_time: 0.0025 lr: 6.59e-05 max_mem: 3369M
[03/13 14:48:19] fastreid.utils.events INFO: eta: 0:52:49 epoch/iter: 0/201 total_loss: 50.39 loss_cls: 50.39 loss_triplet: 0 time: 0.2666 data_time: 0.0027 lr: 6.62e-05 max_mem: 3369M
[03/13 14:49:11] fastreid.utils.events INFO: eta: 0:51:41 epoch/iter: 1/399 total_loss: 50.31 loss_cls: 50.31 loss_triplet: 0 time: 0.2643 data_time: 0.0015 lr: 9.74e-05 max_mem: 3369M
[03/13 14:49:12] fastreid.utils.events INFO: eta: 0:51:40 epoch/iter: 1/403 total_loss: 50.3 loss_cls: 50.3 loss_triplet: 0 time: 0.2642 data_time: 0.0016 lr: 9.80e-05 max_mem: 3369M
[03/13 14:50:05] fastreid.utils.events INFO: eta: 0:50:56 epoch/iter: 2/599 total_loss: 50.17 loss_cls: 50.17 loss_triplet: 0 time: 0.2658 data_time: 0.0014 lr: 1.29e-04 max_mem: 3369M
[03/13 14:50:07] fastreid.utils.events INFO: eta: 0:50:53 epoch/iter: 2/605 total_loss: 50.17 loss_cls: 50.17 loss_triplet: 0 time: 0.2658 data_time: 0.0015 lr: 1.30e-04 max_mem: 3369M
[03/13 14:50:58] fastreid.utils.events INFO: eta: 0:49:53 epoch/iter: 3/799 total_loss: 50.08 loss_cls: 50.08 loss_triplet: 0 time: 0.2650 data_time: 0.0014 lr: 1.60e-04 max_mem: 3369M
[03/13 14:51:00] fastreid.utils.events INFO: eta: 0:49:50 epoch/iter: 3/807 total_loss: 50.08 loss_cls: 50.08 loss_triplet: 0 time: 0.2650 data_time: 0.0013 lr: 1.62e-04 max_mem: 3369M
[03/13 14:51:51] fastreid.utils.events INFO: eta: 0:49:02 epoch/iter: 4/999 total_loss: 50.01 loss_cls: 50.01 loss_triplet: 0 time: 0.2648 data_time: 0.0012 lr: 1.92e-04 max_mem: 3369M
[03/13 14:51:51] fastreid.engine.hooks INFO: Open layer group "backbone" training
[03/13 14:51:53] fastreid.utils.events INFO: eta: 0:48:58 epoch/iter: 4/1009 total_loss: 50.02 loss_cls: 50.02 loss_triplet: 0 time: 0.2645 data_time: 0.0013 lr: 1.93e-04 max_mem: 3369M
[03/13 14:52:40] fastreid.utils.events INFO: eta: 0:47:19 epoch/iter: 5/1199 total_loss: 50.17 loss_cls: 50.17 loss_triplet: 0 time: 0.2608 data_time: 0.0013 lr: 2.23e-04 max_mem: 3369M
[03/13 14:52:43] fastreid.utils.events INFO: eta: 0:47:11 epoch/iter: 5/1211 total_loss: 50.17 loss_cls: 50.17 loss_triplet: 0 time: 0.2605 data_time: 0.0013 lr: 2.25e-04 max_mem: 3369M
[03/13 14:53:31] fastreid.utils.events INFO: eta: 0:45:48 epoch/iter: 6/1399 total_loss: 50.22 loss_cls: 50.22 loss_triplet: 0 time: 0.2582 data_time: 0.0017 lr: 2.55e-04 max_mem: 3369M
[03/13 14:53:34] fastreid.utils.events INFO: eta: 0:45:43 epoch/iter: 6/1413 total_loss: 50.2 loss_cls: 50.2 loss_triplet: 0 time: 0.2581 data_time: 0.0016 lr: 2.57e-04 max_mem: 3369M
[03/13 14:54:22] fastreid.utils.events INFO: eta: 0:44:13 epoch/iter: 7/1599 total_loss: 50.2 loss_cls: 50.2 loss_triplet: 0 time: 0.2568 data_time: 0.0019 lr: 2.86e-04 max_mem: 3369M
[03/13 14:54:26] fastreid.utils.events INFO: eta: 0:44:06 epoch/iter: 7/1615 total_loss: 50.27 loss_cls: 50.27 loss_triplet: 0 time: 0.2566 data_time: 0.0017 lr: 2.88e-04 max_mem: 3369M
[03/13 14:54:59] fastreid.engine.hooks INFO: Overall training speed: 1743 iterations in 0:07:26 (0.2559 s / it)
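For anyone comparing runs, the metric entries in this log follow a regular `key: value` layout, so a short script can extract them and make the trend explicit. The following is only a sketch: `parse_metrics` is a hypothetical helper written for this thread, not part of fastreid, and the regular expression assumes the exact field order shown in the log. Two entries from the log are embedded as sample input.

```python
import re

# Matches the metric entries emitted by fastreid.utils.events in the log.
# Assumption: fields appear in exactly this order, as in the posted log.
LINE_RE = re.compile(
    r"epoch/iter: (?P<epoch>\d+)/(?P<iter>\d+) "
    r"total_loss: (?P<total_loss>[\d.eE+-]+) "
    r"loss_cls: (?P<loss_cls>[\d.eE+-]+) "
    r"loss_triplet: (?P<loss_triplet>[\d.eE+-]+) .*?"
    r"lr: (?P<lr>[\d.eE+-]+)"
)


def parse_metrics(log_text):
    """Return one dict per metric entry found in log_text (hypothetical helper)."""
    rows = []
    for m in LINE_RE.finditer(log_text):
        rows.append({
            "epoch": int(m["epoch"]),
            "iter": int(m["iter"]),
            "total_loss": float(m["total_loss"]),
            "loss_cls": float(m["loss_cls"]),
            "loss_triplet": float(m["loss_triplet"]),
            "lr": float(m["lr"]),
        })
    return rows


# First and last metric entries copied from the log.
sample = (
    "[03/13 14:48:18] fastreid.utils.events INFO: eta: 0:52:50 epoch/iter: 0/199 "
    "total_loss: 50.39 loss_cls: 50.39 loss_triplet: 0 time: 0.2666 "
    "data_time: 0.0025 lr: 6.59e-05 max_mem: 3369M\n"
    "[03/13 14:54:26] fastreid.utils.events INFO: eta: 0:44:06 epoch/iter: 7/1615 "
    "total_loss: 50.27 loss_cls: 50.27 loss_triplet: 0 time: 0.2566 "
    "data_time: 0.0017 lr: 2.88e-04 max_mem: 3369M\n"
)

rows = parse_metrics(sample)
loss_drop = rows[0]["total_loss"] - rows[-1]["total_loss"]
triplet_always_zero = all(r["loss_triplet"] == 0 for r in rows)

print(f"total_loss change over {rows[-1]['iter'] - rows[0]['iter']} iterations: {loss_drop:.2f}")
print(f"loss_triplet is 0 in every entry: {triplet_always_zero}")
```

Run against the full log, this makes two things easy to see: total_loss barely moves, and loss_triplet is reported as 0 in every entry.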
The config is the same as the default:
`_BASE_: Base-bagtricks.yml

MODEL:
  FREEZE_LAYERS: [ backbone ]

  BACKBONE:
    WITH_NL: True

  HEADS:
    NECK_FEAT: after
    POOL_LAYER: GeneralizedMeanPoolingP
    CLS_LAYER: CircleSoftmax
    SCALE: 64
    MARGIN: 0.35

  LOSSES:
    NAME: ("CrossEntropyLoss", "TripletLoss",)
    CE:
      EPSILON: 0.1
      SCALE: 1.0

    TRI:
      MARGIN: 0.0
      HARD_MINING: True
      NORM_FEAT: False
      SCALE: 1.0

INPUT:
  SIZE_TRAIN: [ 384, 128 ]
  SIZE_TEST: [ 384, 128 ]

  AUTOAUG:
    ENABLED: True
    PROB: 0.1

DATALOADER:
  NUM_INSTANCE: 16

SOLVER:
  AMP:
    ENABLED: True
  OPT: Adam
  MAX_EPOCH: 60
  BASE_LR: 0.00035
  WEIGHT_DECAY: 0.0005
  IMS_PER_BATCH: 64

  SCHED: CosineAnnealingLR
  DELAY_EPOCHS: 30
  ETA_MIN_LR: 0.0000007

  WARMUP_FACTOR: 0.1
  WARMUP_ITERS: 2000

  FREEZE_ITERS: 1000

  CHECKPOINT_PERIOD: 20

TEST:
  EVAL_PERIOD: 10
  IMS_PER_BATCH: 128

CUDNN_BENCHMARK: True`
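As a sanity check on the solver settings, the learning rates in the log can be compared with what the warmup values in this config would produce. The sketch below assumes a linear warmup, where the multiplier ramps from WARMUP_FACTOR to 1 over WARMUP_ITERS; that schedule shape is an assumption rather than something stated in the thread, and `linear_warmup_lr` is a hypothetical helper, not a fastreid function.

```python
# Values copied from the SOLVER section of the config.
BASE_LR = 0.00035      # SOLVER.BASE_LR
WARMUP_FACTOR = 0.1    # SOLVER.WARMUP_FACTOR
WARMUP_ITERS = 2000    # SOLVER.WARMUP_ITERS


def linear_warmup_lr(iteration):
    """Learning rate under an assumed linear warmup (hypothetical helper)."""
    if iteration >= WARMUP_ITERS:
        return BASE_LR
    alpha = iteration / WARMUP_ITERS
    # Multiplier goes linearly from WARMUP_FACTOR (iteration 0) to 1 (end of warmup).
    return BASE_LR * (WARMUP_FACTOR * (1 - alpha) + alpha)


# lr values reported in the log at a few iterations, for comparison.
logged = {199: 6.59e-05, 999: 1.92e-04, 1599: 2.86e-04}

for it, lr_logged in logged.items():
    lr_expected = linear_warmup_lr(it)
    rel_diff = abs(lr_expected - lr_logged) / lr_logged
    print(f"iter {it}: expected {lr_expected:.3e}, logged {lr_logged:.2e}, diff {rel_diff:.2%}")
```

Under that assumption the computed values land within about 1% of the logged ones, which suggests the learning-rate schedule is being applied as configured. Note also that the posted log ends at 1743 iterations, which is still inside the 2000-iteration warmup window.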
Which dataset are you using?
Market1501
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
I'm facing the same issue. Did you solve it? If so, please share your solution.
When I run sbs_R50 or sbs_S50, the metrics are very low, close to 0. Is this a bug in the model or in how the metrics are reported?