unable to reproduce the result of SBS(R50)

sijun-zhou commented 4 years ago

Hi @L1aoXingyu, first of all, thank you very much for your excellent SOTA re-ID work ^_^. I am very interested in this project. I have tried the two SBS(R50) models on Market1501, but I do not get the result listed in the model zoo, which reports the following accuracy. I don't know why. Could you please help me? Thanks in advance!

| Method | Pretrained | Rank@1 | mAP | mINP |
| --- | --- | --- | --- | --- |
| SBS(R50) | ImageNet | 95.4% | 88.2% | 64.8% |

I will post my final config.yaml below.

My environment: Python 3.6, 2 Titan Xp GPUs, CUDA 9.0, PyTorch 1.5, torchvision 0.6.0

sijun-zhou commented 4 years ago

All settings are the defaults from the repo, without any changes (I pulled the repo 3 days ago).

config.yaml:

CUDNN_BENCHMARK: true
DATALOADER:
  NUM_INSTANCE: 16
  NUM_WORKERS: 0
  PK_SAMPLER: true
DATASETS:
  NAMES:
  - Market1501
  TESTS:
  - Market1501
INPUT:
  DO_AUGMIX: false
  DO_AUTOAUG: true
  DO_CJ: false
  DO_FLIP: true
  DO_PAD: true
  FLIP_PROB: 0.5
  PADDING: 10
  PADDING_MODE: constant
  REA:
    ENABLED: true
    MEAN:
    - 123.675
    - 116.28
    - 103.53
    PROB: 0.5
  RPT:
    ENABLED: false
    PROB: 0.5
  SIZE_TEST:
  - 384
  - 128
  SIZE_TRAIN:
  - 384
  - 128
MODEL:
  BACKBONE:
    DEPTH: 50
    LAST_STRIDE: 1
    NAME: build_resnet_backbone
    NORM: BN
    NORM_SPLIT: 1
    PRETRAIN: true
    PRETRAIN_PATH: ''
    WITH_IBN: false
    WITH_NL: true
    WITH_SE: false
  HEADS:
    CLS_LAYER: circle
    IN_FEAT: 2048
    MARGIN: 0.35
    NAME: BNneckHead
    NECK_FEAT: after
    NORM: BN
    NORM_SPLIT: 1
    NUM_CLASSES: 751
    POOL_LAYER: gempool
    REDUCTION_DIM: 512
    SCALE: 64
  LOSSES:
    CE:
      ALPHA: 0.3
      EPSILON: 0.1
      SCALE: 1.0
    FL:
      ALPHA: 0.25
      GAMMA: 2
      SCALE: 1.0
    NAME:
    - CrossEntropyLoss
    - TripletLoss
    TRI:
      HARD_MINING: true
      MARGIN: 0.0
      NORM_FEAT: false
      SCALE: 1.0
      USE_COSINE_DIST: false
  META_ARCHITECTURE: Baseline
  OPEN_LAYERS: heads
  PIXEL_MEAN:
  - 123.675
  - 116.28
  - 103.53
  PIXEL_STD:
  - 58.395
  - 57.120000000000005
  - 57.375
  WEIGHTS: ''
OUTPUT_DIR: logs/market1501/sbs_R50
SOLVER:
  BASE_LR: 0.00035
  BIAS_LR_FACTOR: 2.0
  CHECKPOINT_PERIOD: 6000
  DELAY_ITERS: 9000
  ETA_MIN_LR: 7.7e-07
  FREEZE_ITERS: 2000
  GAMMA: 0.1
  HEADS_LR_FACTOR: 1.0
  IMS_PER_BATCH: 64
  LOG_PERIOD: 200
  MAX_ITER: 18000
  MOMENTUM: 0.9
  OPT: Adam
  SCHED: DelayedCosineAnnealingLR
  STEPS:
  - 30
  - 55
  SWA:
    ENABLED: false
    ETA_MIN_LR: 3.5e-06
    ITER: 0
    LR_FACTOR: 10.0
    LR_SCHED: false
    PERIOD: 10
  WARMUP_FACTOR: 0.01
  WARMUP_ITERS: 2000
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0005
  WEIGHT_DECAY_BIAS: 0.0
TEST:
  EVAL_PERIOD: 2000
  IMS_PER_BATCH: 512
  PRECISE_BN:
    DATASET: Market1501
    ENABLED: false
    NUM_ITER: 300

Best Result:

L1aoXingyu commented 4 years ago

@sijun-zhou are you training SBS with 2 GPUs?

sijun-zhou commented 4 years ago

I changed "HEADS:CLS_LAYER: circle" to "HEADS:CLS_LAYER: linear". All other settings are the same.

config.yaml:

CUDNN_BENCHMARK: true
DATALOADER:
  NUM_INSTANCE: 16
  NUM_WORKERS: 0
  PK_SAMPLER: true
DATASETS:
  NAMES:
  - Market1501
  TESTS:
  - Market1501
INPUT:
  DO_AUGMIX: false
  DO_AUTOAUG: true
  DO_CJ: false
  DO_FLIP: true
  DO_PAD: true
  FLIP_PROB: 0.5
  PADDING: 10
  PADDING_MODE: constant
  REA:
    ENABLED: true
    MEAN:
    - 123.675
    - 116.28
    - 103.53
    PROB: 0.5
  RPT:
    ENABLED: false
    PROB: 0.5
  SIZE_TEST:
  - 384
  - 128
  SIZE_TRAIN:
  - 384
  - 128
MODEL:
  BACKBONE:
    DEPTH: 50
    LAST_STRIDE: 1
    NAME: build_resnet_backbone
    NORM: BN
    NORM_SPLIT: 1
    PRETRAIN: true
    PRETRAIN_PATH: ''
    WITH_IBN: false
    WITH_NL: true
    WITH_SE: false
  HEADS:
    CLS_LAYER: linear
    IN_FEAT: 2048
    MARGIN: 0.35
    NAME: BNneckHead
    NECK_FEAT: after
    NORM: BN
    NORM_SPLIT: 1
    NUM_CLASSES: 751
    POOL_LAYER: gempool
    REDUCTION_DIM: 512
    SCALE: 64
  LOSSES:
    CE:
      ALPHA: 0.3
      EPSILON: 0.1
      SCALE: 1.0
    FL:
      ALPHA: 0.25
      GAMMA: 2
      SCALE: 1.0
    NAME:
    - CrossEntropyLoss
    - TripletLoss
    TRI:
      HARD_MINING: true
      MARGIN: 0.0
      NORM_FEAT: false
      SCALE: 1.0
      USE_COSINE_DIST: false
  META_ARCHITECTURE: Baseline
  OPEN_LAYERS: heads
  PIXEL_MEAN:
  - 123.675
  - 116.28
  - 103.53
  PIXEL_STD:
  - 58.395
  - 57.120000000000005
  - 57.375
  WEIGHTS: ''
OUTPUT_DIR: logs/market1501/sbs_R50
SOLVER:
  BASE_LR: 0.00035
  BIAS_LR_FACTOR: 2.0
  CHECKPOINT_PERIOD: 6000
  DELAY_ITERS: 9000
  ETA_MIN_LR: 7.7e-07
  FREEZE_ITERS: 2000
  GAMMA: 0.1
  HEADS_LR_FACTOR: 1.0
  IMS_PER_BATCH: 64
  LOG_PERIOD: 200
  MAX_ITER: 18000
  MOMENTUM: 0.9
  OPT: Adam
  SCHED: DelayedCosineAnnealingLR
  STEPS:
  - 30
  - 55
  SWA:
    ENABLED: false
    ETA_MIN_LR: 3.5e-06
    ITER: 0
    LR_FACTOR: 10.0
    LR_SCHED: false
    PERIOD: 10
  WARMUP_FACTOR: 0.01
  WARMUP_ITERS: 2000
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0005
  WEIGHT_DECAY_BIAS: 0.0
TEST:
  EVAL_PERIOD: 2000
  IMS_PER_BATCH: 512
  PRECISE_BN:
    DATASET: Market1501
    ENABLED: false
    NUM_ITER: 300

Best Result:

sijun-zhou commented 4 years ago

> @sijun-zhou are you training SBS with 2 GPUs?

@L1aoXingyu Yes, with 2 GPUs.

L1aoXingyu commented 4 years ago

@sijun-zhou you need to train with 1 GPU to reproduce the result, or else use syncBN. However, there is currently something wrong with syncBN; I will fix it tomorrow.

sijun-zhou commented 4 years ago

> @sijun-zhou you need to train with 1 GPU, then you can reproduce the result.

@L1aoXingyu I'll try. thx!

L1aoXingyu commented 4 years ago

@sijun-zhou feel free to ask me anything here

L1aoXingyu commented 4 years ago

@sijun-zhou btw, you need to set HEADS.CLS_LAYER to circle to reproduce the result. Please keep everything else in the config the same.

sijun-zhou commented 4 years ago

> @sijun-zhou btw, you need to change HEADS.CLS_LAYER to circle, then you can reproduce the result. Please keep everything else in the config the same.

Got it. Training is in progress now. I will post any results here later.

sijun-zhou commented 4 years ago

@L1aoXingyu I have reproduced the results of SBS(R50) and SBS_R50-ibn on Market1501 and got nearly the same high accuracy as listed in the model zoo. Thank you very much for the advice!

BTW, why does one GPU give higher accuracy, while the accuracy drops with two GPUs?

L1aoXingyu commented 4 years ago

This is because of BN. When you train with 2 GPUs and a total batch size of 64, each GPU computes the BN batch mean and batch variance over only its own 32 images (2 IDs), so the normalization batch size is 32. These statistics are biased. If you want to train on 2 GPUs with a batch size of 64, you need to change NORM: BN to NORM: syncBN in the config file; the batch mean and variance will then be computed across both GPUs, so the normalization batch size is still 64. Alternatively, you can change the batch size to 128, which ensures that the normalization batch size on each GPU is 64.
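To make the bias concrete, here is a minimal pure-Python sketch (not fast-reid code). It assumes a toy one-channel batch whose values cluster by identity and that the batch is split into two halves, one per GPU. The helper names `batch_stats` and `synced_stats` are hypothetical, and the aggregation shown is only a simplified stand-in for what a synchronized BN layer does, not its actual implementation.

```python
def batch_stats(values):
    """Mean and biased (population) variance of one shard, as plain BN would see it."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / n
    return mean, var


def synced_stats(shards):
    """Combine counts, sums and sums of squares from all shards into global statistics."""
    n = sum(len(s) for s in shards)
    total = sum(sum(s) for s in shards)
    total_sq = sum(sum(x * x for x in s) for s in shards)
    mean = total / n
    return mean, total_sq / n - mean * mean


# Toy activations: 4 identities x 2 images each, values clustered by identity (assumed data).
full_batch = [1.0, 1.2, 3.0, 3.2, 5.0, 5.2, 7.0, 7.2]
gpu0, gpu1 = full_batch[:4], full_batch[4:]  # each GPU only sees 2 identities

full_mean, full_var = batch_stats(full_batch)
mean0, var0 = batch_stats(gpu0)
mean1, var1 = batch_stats(gpu1)
sync_mean, sync_var = synced_stats([gpu0, gpu1])

print("full batch :", full_mean, full_var)
print("GPU 0 only :", mean0, var0)
print("GPU 1 only :", mean1, var1)
print("synced     :", sync_mean, sync_var)
```

In this toy case each GPU's mean is far from the full-batch mean and its variance is much smaller, while the aggregated statistics match the full batch up to floating-point error.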

zhanghongruiupup commented 4 years ago

> This is because of BN. When you train with 2 GPUs, the normalization batch size is 32, because each GPU uses its own 32 images (2 IDs) to compute the BN batch mean and batch variance. This is biased. If you want to train with 2 GPUs with a batch size of 64, you need to change NORM: BN to NORM: syncBN in the config file; the batch mean and variance will then be computed across the 2 GPUs, so the normalization batch size is still 64. Or you can change the batch size to 128, which ensures the normalization batch size is 64.

Hello, you mentioned that circle loss works best with 4 IDs and 16 images each. I now have 4 GPUs and plan to set NORM: BN with a batch size of 256, so that each GPU gets exactly 64 images. With this setup, circle loss should not be affected, right? Thanks.
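The per-GPU arithmetic behind the setups discussed in this thread can be sketched as follows. This is only an illustration, not fast-reid code: `per_gpu_layout` is a hypothetical helper, and it assumes the total batch is split evenly across GPUs with whole identities on each GPU, using NUM_INSTANCE: 16 from the config above. It shows the layout only and does not by itself settle how circle loss behaves.

```python
def per_gpu_layout(total_batch, num_gpus, num_instance=16, sync_bn=False):
    """Return images per GPU, identities per GPU, and the BN normalization batch size.

    Assumes an even split of the batch across GPUs (hypothetical helper).
    """
    if total_batch % num_gpus != 0:
        raise ValueError("total_batch must be divisible by num_gpus")
    images_per_gpu = total_batch // num_gpus
    if images_per_gpu % num_instance != 0:
        raise ValueError("images per GPU must be a multiple of num_instance")
    return {
        "images_per_gpu": images_per_gpu,
        "ids_per_gpu": images_per_gpu // num_instance,
        # Plain BN normalizes over one GPU's images; synced BN over the whole batch.
        "norm_batch_size": total_batch if sync_bn else images_per_gpu,
    }


setups = [
    ("1 GPU, batch 64, BN", dict(total_batch=64, num_gpus=1)),
    ("2 GPUs, batch 64, BN", dict(total_batch=64, num_gpus=2)),
    ("2 GPUs, batch 64, syncBN", dict(total_batch=64, num_gpus=2, sync_bn=True)),
    ("2 GPUs, batch 128, BN", dict(total_batch=128, num_gpus=2)),
    ("4 GPUs, batch 256, BN", dict(total_batch=256, num_gpus=4)),
]
for name, kwargs in setups:
    print(name, per_gpu_layout(**kwargs))
```

Under these assumptions, the 4-GPU, 256-batch setup gives each GPU 64 images from 4 identities, the same per-GPU layout as the single-GPU run with a batch size of 64.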

JDAI-CV / fast-reid

unable to reproduce the result of SBS(R50) #54