Closed sijun-zhou closed 3 years ago
Below are my training configs:
(fastreid) root@sj_docker1_117:/home/wesine/data_8tb_3/sj/work/reid/fast-reid $ cd /home/wesine/data_8tb_3/sj/work/reid/fast-reid ; env PYTHONIOENCODING=UTF-8 PYTHONUNBUFFERED=1 /root/anaconda3/envs/fastreid/bin/python /root/.vscode-server/extensions/ms-python.python-2020.2.64397/pythonFiles/ptvsd_launcher.py --default --nodebug --client --host localhost --port 43535 /home/wesine/data_8tb_3/sj/work/reid/fast-reid/tools/train_net.py --config-file ./configs/Market1501/sbs_R50-ibn.yml --num-gpus 2
Command Line Args: Namespace(config_file='./configs/Market1501/sbs_R50-ibn.yml', dist_url='tcp://127.0.0.1:49152', eval_only=False, machine_rank=0, num_gpus=2, num_machines=1, opts=[], resume=False)
[03/22 09:47:02 fastreid]: Rank of current process: 0. World size: 2
[03/22 09:47:03 fastreid]: Environment info:
sys.platform linux Python 3.6.13 | packaged by conda-forge | (default, Feb 19 2021, 05:36:01) [GCC 9.3.0]
numpy 1.19.5
fastreid 1.0.0 @/home/wesine/data_8tb_3/sj/work/reid/fast-reid/fastreid
FASTREID_ENV_MODULE |
---|
PyTorch built with:
[03/22 09:47:03 fastreid]: Command line arguments: Namespace(config_file='./configs/Market1501/sbs_R50-ibn.yml', dist_url='tcp://127.0.0.1:49152', eval_only=False, machine_rank=0, num_gpus=2, num_machines=1, opts=[], resume=False)
[03/22 09:47:03 fastreid]: Contents of args.config_file=./configs/Market1501/sbs_R50-ibn.yml:
_BASE_: ../Base-SBS.yml
MODEL:
  BACKBONE:
    WITH_IBN: True
DATASETS:
  NAMES: ("Market1501",)
  TESTS: ("Market1501",)
OUTPUT_DIR: logs/market1501/sbs_R50-ibn
[03/22 09:47:03 fastreid]: Running with full config:
CUDNN_BENCHMARK: True
DATALOADER:
  NAIVE_WAY: True
  NUM_INSTANCE: 16
  NUM_WORKERS: 8
  PK_SAMPLER: True
DATASETS:
  COMBINEALL: False
  NAMES: ('Market1501',)
  TESTS: ('Market1501',)
INPUT:
  AUGMIX_PROB: 0.0
  AUTOAUG_PROB: 0.1
  CJ:
    BRIGHTNESS: 0.15
    CONTRAST: 0.15
    ENABLED: False
    HUE: 0.1
    PROB: 0.5
    SATURATION: 0.1
  DO_AFFINE: False
  DO_AUGMIX: False
  DO_AUTOAUG: True
  DO_FLIP: True
  DO_PAD: True
  FLIP_PROB: 0.5
  PADDING: 10
  PADDING_MODE: constant
  REA:
    ENABLED: True
    PROB: 0.5
    VALUE: [123.675, 116.28, 103.53]
  RPT:
    ENABLED: False
    PROB: 0.5
  SIZE_TEST: [384, 128]
  SIZE_TRAIN: [384, 128]
KD:
  MODEL_CONFIG: ['']
  MODEL_WEIGHTS: ['']
MODEL:
  BACKBONE:
    DEPTH: 50x
    FEAT_DIM: 2048
    LAST_STRIDE: 1
    NAME: build_resnet_backbone
    NORM: BN
    PRETRAIN: True
    PRETRAIN_PATH:
    WITH_IBN: True
    WITH_NL: True
    WITH_SE: False
  DEVICE: cuda
  FREEZE_LAYERS: ['backbone']
  HEADS:
    CLS_LAYER: circleSoftmax
    EMBEDDING_DIM: 0
    MARGIN: 0.35
    NAME: EmbeddingHead
    NECK_FEAT: after
    NORM: BN
    NUM_CLASSES: 0
    POOL_LAYER: gempoolP
    SCALE: 64
    WITH_BNNECK: True
  LOSSES:
    CE:
      ALPHA: 0.2
      EPSILON: 0.1
      SCALE: 1.0
    CIRCLE:
      GAMMA: 128
      MARGIN: 0.25
      SCALE: 1.0
    COSFACE:
      GAMMA: 128
      MARGIN: 0.25
      SCALE: 1.0
    FL:
      ALPHA: 0.25
      GAMMA: 2
      SCALE: 1.0
    NAME: ('CrossEntropyLoss', 'TripletLoss')
    TRI:
      HARD_MINING: True
      MARGIN: 0.0
      NORM_FEAT: False
      SCALE: 1.0
  META_ARCHITECTURE: Baseline
  PIXEL_MEAN: [123.675, 116.28, 103.53]
  PIXEL_STD: [58.395, 57.120000000000005, 57.375]
  QUEUE_SIZE: 8192
  WEIGHTS:
OUTPUT_DIR: logs/market1501/sbs_R50-ibn
SOLVER:
  BASE_LR: 0.00035
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 20
  DELAY_EPOCHS: 30
  ETA_MIN_LR: 7e-07
  FP16_ENABLED: False
  FREEZE_FC_ITERS: 0
  FREEZE_ITERS: 1000
  GAMMA: 0.1
  HEADS_LR_FACTOR: 1.0
  IMS_PER_BATCH: 64
  MAX_EPOCH: 60
  MOMENTUM: 0.9
  NESTEROV: True
  OPT: Adam
  SCHED: CosineAnnealingLR
  STEPS: [40, 90]
  WARMUP_FACTOR: 0.1
  WARMUP_ITERS: 2000
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0005
  WEIGHT_DECAY_BIAS: 0.0005
TEST:
  AQE:
    ALPHA: 3.0
    ENABLED: False
    QE_K: 5
    QE_TIME: 1
  EVAL_PERIOD: 10
  FLIP_ENABLED: False
  IMS_PER_BATCH: 128
  METRIC: cosine
  PRECISE_BN:
    DATASET: Market1501
    ENABLED: False
    NUM_ITER: 300
  RERANK:
    ENABLED: False
    K1: 20
    K2: 6
    LAMBDA: 0.3
  ROC_ENABLED: False
[03/22 09:47:03 fastreid]: Full config saved to /home/wesine/data_8tb_3/sj/work/reid/fast-reid/logs/market1501/sbs_R50-ibn/config.yaml
[03/22 09:47:03 fastreid.utils.env]: Using a generated random seed 3342157
[03/22 09:47:03 fastreid.engine.defaults]: Prepare training set
[03/22 09:47:03 fastreid.data.datasets.bases]: => Loaded Market1501 in csv format:
| subset | # ids | # images | # cameras |
|:---------|:--------|:-----------|:------------|
| train | 751 | 12936 | 6 |
[03/22 09:47:03 fastreid.engine.defaults]: Auto-scaling the num_classes=751
[03/22 09:47:04 fastreid.modeling.backbones.resnet]: Loading pretrained model from /root/.cache/torch/checkpoints/resnet50_ibn_a-d9d0bb7b.pth
[03/22 09:47:04 fastreid.modeling.backbones.resnet]: Some model parameters or buffers are not found in the checkpoint:
NL_2.0.g.{weight, bias}
NL_2.0.W.0.{weight, bias}
NL_2.0.W.1.{weight, bias, running_mean, running_var}
NL_2.0.theta.{weight, bias}
NL_2.0.phi.{weight, bias}
NL_2.1.g.{weight, bias}
NL_2.1.W.0.{weight, bias}
NL_2.1.W.1.{weight, bias, running_mean, running_var}
NL_2.1.theta.{weight, bias}
NL_2.1.phi.{weight, bias}
NL_3.0.g.{weight, bias}
NL_3.0.W.0.{weight, bias}
NL_3.0.W.1.{weight, bias, running_mean, running_var}
NL_3.0.theta.{weight, bias}
NL_3.0.phi.{weight, bias}
NL_3.1.g.{weight, bias}
NL_3.1.W.0.{weight, bias}
NL_3.1.W.1.{weight, bias, running_mean, running_var}
NL_3.1.theta.{weight, bias}
NL_3.1.phi.{weight, bias}
NL_3.2.g.{weight, bias}
NL_3.2.W.0.{weight, bias}
NL_3.2.W.1.{weight, bias, running_mean, running_var}
NL_3.2.theta.{weight, bias}
NL_3.2.phi.{weight, bias}
[03/22 09:47:04 fastreid.modeling.backbones.resnet]: The checkpoint state_dict contains keys that are not used by the model:
fc.{weight, bias}
Baseline( (backbone): ResNet( (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False) (bn1): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True) (layer1): Sequential( (0): Bottleneck( (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() (downsample): Sequential( (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (2): Bottleneck( (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=True, 
track_running_stats=False) (BN): BatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) ) (layer2): Sequential( (0): Bottleneck( (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn2): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() (downsample): Sequential( (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False) (1): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): 
Identity() ) (2): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (3): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) ) (layer3): Sequential( (0): Bottleneck( (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn2): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) 
(se): Identity() (downsample): Sequential( (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False) (1): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (2): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (3): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): 
Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (4): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (5): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) ) (layer4): Sequential( (0): Bottleneck( (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(2048, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() (downsample): Sequential( (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (2): Bottleneck( (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) ) (NL_1): ModuleList() (NL_2): ModuleList( (0): Non_local( (g): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1)) (W): Sequential( (0): Conv2d(1, 512, kernel_size=(1, 1), stride=(1, 1)) (1): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (theta): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1)) (phi): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1)) ) (1): Non_local( (g): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1)) (W): Sequential( (0): Conv2d(1, 512, kernel_size=(1, 1), stride=(1, 1)) (1): BatchNorm(512, eps=1e-05, momentum=0.1, 
affine=True, track_running_stats=True) ) (theta): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1)) (phi): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1)) ) ) (NL_3): ModuleList( (0): Non_local( (g): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) (W): Sequential( (0): Conv2d(1, 1024, kernel_size=(1, 1), stride=(1, 1)) (1): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (theta): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) (phi): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) ) (1): Non_local( (g): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) (W): Sequential( (0): Conv2d(1, 1024, kernel_size=(1, 1), stride=(1, 1)) (1): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (theta): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) (phi): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) ) (2): Non_local( (g): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) (W): Sequential( (0): Conv2d(1, 1024, kernel_size=(1, 1), stride=(1, 1)) (1): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (theta): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) (phi): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) ) ) (NL_4): ModuleList() ) (heads): EmbeddingHead( (pool_layer): GeneralizedMeanPoolingP(Parameter containing: tensor([3.], device='cuda:0', requires_grad=True), output_size=1) (bottleneck): Sequential( (0): BatchNorm(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (classifier): CircleSoftmax(in_features=2048, num_classes=751, scale=64, margin=0.35) ) ) [03/22 09:47:14 fastreid.utils.checkpoint]: No checkpoint found. 
Training model from scratch
[03/22 09:47:14 fastreid.engine.train_loop]: Starting training from epoch 0
[03/22 09:47:14 fastreid.engine.hooks]: Freeze layer group "backbone" training for 1000 iterations
[03/22 09:47:30 fastreid.utils.events]: eta: 0:12:59 epoch/iter: 0/199 total_loss: 64.82 loss_cls: 50.34 loss_triplet: 14.5 time: 0.0668 data_time: 0.0010 lr: 6.63e-05 max_mem: 9426M
[03/22 09:47:30 fastreid.utils.events]: eta: 0:12:57 epoch/iter: 0/201 total_loss: 64.8 loss_cls: 50.34 loss_triplet: 14.47 time: 0.0667 data_time: 0.0011 lr: 6.67e-05 max_mem: 9426M
[03/22 09:47:44 fastreid.utils.events]: eta: 0:13:04 epoch/iter: 1/399 total_loss: 64.29 loss_cls: 49.86 loss_triplet: 14.4 time: 0.0681 data_time: 0.0007 lr: 9.78e-05 max_mem: 9426M
[03/22 09:47:44 fastreid.utils.events]: eta: 0:13:02 epoch/iter: 1/403 total_loss: 64.29 loss_cls: 49.84 loss_triplet: 14.4 time: 0.0681 data_time: 0.0008 lr: 9.85e-05 max_mem: 9426M
[03/22 09:47:58 fastreid.utils.events]: eta: 0:12:50 epoch/iter: 2/599 total_loss: 63.29 loss_cls: 48.93 loss_triplet: 14.38 time: 0.0681 data_time: 0.0008 lr: 1.29e-04 max_mem: 9426M
[03/22 09:47:58 fastreid.utils.events]: eta: 0:12:48 epoch/iter: 2/605 total_loss: 63.28 loss_cls: 48.91 loss_triplet: 14.42 time: 0.0681 data_time: 0.0007 lr: 1.30e-04 max_mem: 9426M
[03/22 09:48:12 fastreid.utils.events]: eta: 0:12:38 epoch/iter: 3/799 total_loss: 61.76 loss_cls: 47.67 loss_triplet: 14.23 time: 0.0683 data_time: 0.0007 lr: 1.61e-04 max_mem: 9426M
[03/22 09:48:12 fastreid.utils.events]: eta: 0:12:38 epoch/iter: 3/807 total_loss: 61.73 loss_cls: 47.65 loss_triplet: 14.23 time: 0.0684 data_time: 0.0008 lr: 1.62e-04 max_mem: 9426M
[03/22 09:48:25 fastreid.utils.events]: eta: 0:12:21 epoch/iter: 4/999 total_loss: 60.35 loss_cls: 46.21 loss_triplet: 14 time: 0.0682 data_time: 0.0009 lr: 1.92e-04 max_mem: 9426M
[03/22 09:48:25 fastreid.engine.hooks]: Open layer group "backbone" training
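As a sanity check on the learning rates in the log above, here is a minimal sketch that recomputes them from the SOLVER values in the config (BASE_LR, WARMUP_FACTOR, WARMUP_ITERS, WARMUP_METHOD: linear). The function `warmup_lr` is a hypothetical helper written for this check, not fast-reid code, and it assumes the common linear-warmup form in which the scale ramps from WARMUP_FACTOR to 1 over WARMUP_ITERS. What happens after warmup (the cosine schedule) is not modeled.

```python
def warmup_lr(it, base_lr=0.00035, warmup_factor=0.1, warmup_iters=2000):
    """Learning rate at iteration `it` under an assumed linear warmup."""
    if it >= warmup_iters:
        # Past warmup the scheduler takes over; not modeled in this sketch.
        return base_lr
    alpha = it / warmup_iters
    # Scale goes linearly from warmup_factor (it=0) to 1.0 (it=warmup_iters).
    return base_lr * (warmup_factor * (1 - alpha) + alpha)


# Iterations taken from the training log.
for it in (199, 399, 599, 799, 999):
    print(it, f"{warmup_lr(it):.2e}")
```

Under that assumption the printed values agree with the `lr:` column logged at the same iterations (6.63e-05 at 199 up to 1.92e-04 at 999), so the warmup itself appears to behave as configured.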
Inference accuracy is also far lower than the accuracy posted in the model zoo. Can anyone help me sort this out? Thanks in advance!
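For reference, a rough sketch of the batch layout implied by the training config (IMS_PER_BATCH: 64, NUM_INSTANCE: 16, PK_SAMPLER: True, 2 GPUs, 12936 training images). The helper `batch_layout` is hypothetical, and it assumes the global batch is split evenly across GPUs and that the PK sampler draws NUM_INSTANCE images per identity; I have not confirmed that this is exactly how fast-reid builds its batches.

```python
def batch_layout(ims_per_batch, num_instance, num_gpus, num_images):
    """Return (images per GPU, identities per GPU batch, iterations per epoch)."""
    per_gpu = ims_per_batch // num_gpus          # assumed even split across GPUs
    ids_per_gpu = per_gpu // num_instance        # P in a P x K batch, K = num_instance
    iters_per_epoch = num_images // ims_per_batch
    return per_gpu, ids_per_gpu, iters_per_epoch


per_gpu, ids_per_gpu, iters_per_epoch = batch_layout(64, 16, 2, 12936)
print(per_gpu, ids_per_gpu, iters_per_epoch)
```

With these numbers each GPU would see 32 images from only 2 identities per step, and an epoch would be 202 iterations, which is consistent with the epoch boundaries in the training log (iterations 201, 403, 605, 807). Whether so few identities per batch matters for the triplet loss here is something I have not verified.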
(fastreid) root@sj_docker1_117:/home/wesine/data_8tb_3/sj/work/reid/fast-reid $ cd /home/wesine/data_8tb_3/sj/work/reid/fast-reid ; env PYTHONIOENCODING=UTF-8 PYTHONUNBUFFERED=1 /root/anaconda3/envs/fastreid/bin/python /root/.vscode-server/extensions/ms-python.python-2020.2.64397/pythonFiles/ptvsd_launcher.py --default --nodebug --client --host localhost --port 41755 /home/wesine/data_8tb_3/sj/work/reid/fast-reid/tools/train_net.py --config-file ./configs/Market1501/sbs_R50-ibn.yml --eval-only MODEL.WEIGHTS logs/market1501/sbs_R50-ibn/model_best.pth MODEL.DEVICE cuda:0
Command Line Args: Namespace(config_file='./configs/Market1501/sbs_R50-ibn.yml', dist_url='tcp://127.0.0.1:49152', eval_only=True, machine_rank=0, num_gpus=1, num_machines=1, opts=['MODEL.WEIGHTS', 'logs/market1501/sbs_R50-ibn/model_best.pth', 'MODEL.DEVICE', 'cuda:0'], resume=False)
[03/22 11:49:03 fastreid]: Rank of current process: 0. World size: 1
[03/22 11:49:04 fastreid]: Environment info:
sys.platform linux Python 3.6.13 | packaged by conda-forge | (default, Feb 19 2021, 05:36:01) [GCC 9.3.0]
numpy 1.19.5
fastreid 1.0.0 @/home/wesine/data_8tb_3/sj/work/reid/fast-reid/fastreid
FASTREID_ENV_MODULE |
---|
PyTorch built with:
[03/22 11:49:04 fastreid]: Command line arguments: Namespace(config_file='./configs/Market1501/sbs_R50-ibn.yml', dist_url='tcp://127.0.0.1:49152', eval_only=True, machine_rank=0, num_gpus=1, num_machines=1, opts=['MODEL.WEIGHTS', 'logs/market1501/sbs_R50-ibn/model_best.pth', 'MODEL.DEVICE', 'cuda:0'], resume=False)
[03/22 11:49:04 fastreid]: Contents of args.config_file=./configs/Market1501/sbs_R50-ibn.yml:
_BASE_: ../Base-SBS.yml
MODEL:
  BACKBONE:
    WITH_IBN: True
DATASETS:
  NAMES: ("Market1501",)
  TESTS: ("Market1501",)
OUTPUT_DIR: logs/market1501/sbs_R50-ibn
[03/22 11:49:04 fastreid]: Running with full config:
CUDNN_BENCHMARK: True
DATALOADER:
  NAIVE_WAY: True
  NUM_INSTANCE: 16
  NUM_WORKERS: 8
  PK_SAMPLER: True
DATASETS:
  COMBINEALL: False
  NAMES: ('Market1501',)
  TESTS: ('Market1501',)
INPUT:
  AUGMIX_PROB: 0.0
  AUTOAUG_PROB: 0.1
  CJ:
    BRIGHTNESS: 0.15
    CONTRAST: 0.15
    ENABLED: False
    HUE: 0.1
    PROB: 0.5
    SATURATION: 0.1
  DO_AFFINE: False
  DO_AUGMIX: False
  DO_AUTOAUG: True
  DO_FLIP: True
  DO_PAD: True
  FLIP_PROB: 0.5
  PADDING: 10
  PADDING_MODE: constant
  REA:
    ENABLED: True
    PROB: 0.5
    VALUE: [123.675, 116.28, 103.53]
  RPT:
    ENABLED: False
    PROB: 0.5
  SIZE_TEST: [384, 128]
  SIZE_TRAIN: [384, 128]
KD:
  MODEL_CONFIG: ['']
  MODEL_WEIGHTS: ['']
MODEL:
  BACKBONE:
    DEPTH: 50x
    FEAT_DIM: 2048
    LAST_STRIDE: 1
    NAME: build_resnet_backbone
    NORM: BN
    PRETRAIN: True
    PRETRAIN_PATH:
    WITH_IBN: True
    WITH_NL: True
    WITH_SE: False
  DEVICE: cuda:0
  FREEZE_LAYERS: ['backbone']
  HEADS:
    CLS_LAYER: circleSoftmax
    EMBEDDING_DIM: 0
    MARGIN: 0.35
    NAME: EmbeddingHead
    NECK_FEAT: after
    NORM: BN
    NUM_CLASSES: 0
    POOL_LAYER: gempoolP
    SCALE: 64
    WITH_BNNECK: True
  LOSSES:
    CE:
      ALPHA: 0.2
      EPSILON: 0.1
      SCALE: 1.0
    CIRCLE:
      GAMMA: 128
      MARGIN: 0.25
      SCALE: 1.0
    COSFACE:
      GAMMA: 128
      MARGIN: 0.25
      SCALE: 1.0
    FL:
      ALPHA: 0.25
      GAMMA: 2
      SCALE: 1.0
    NAME: ('CrossEntropyLoss', 'TripletLoss')
    TRI:
      HARD_MINING: True
      MARGIN: 0.0
      NORM_FEAT: False
      SCALE: 1.0
  META_ARCHITECTURE: Baseline
  PIXEL_MEAN: [123.675, 116.28, 103.53]
  PIXEL_STD: [58.395, 57.120000000000005, 57.375]
  QUEUE_SIZE: 8192
  WEIGHTS: logs/market1501/sbs_R50-ibn/model_best.pth
OUTPUT_DIR: logs/market1501/sbs_R50-ibn
SOLVER:
  BASE_LR: 0.00035
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 20
  DELAY_EPOCHS: 30
  ETA_MIN_LR: 7e-07
  FP16_ENABLED: False
  FREEZE_FC_ITERS: 0
  FREEZE_ITERS: 1000
  GAMMA: 0.1
  HEADS_LR_FACTOR: 1.0
  IMS_PER_BATCH: 64
  MAX_EPOCH: 60
  MOMENTUM: 0.9
  NESTEROV: True
  OPT: Adam
  SCHED: CosineAnnealingLR
  STEPS: [40, 90]
  WARMUP_FACTOR: 0.1
  WARMUP_ITERS: 2000
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0005
  WEIGHT_DECAY_BIAS: 0.0005
TEST:
  AQE:
    ALPHA: 3.0
    ENABLED: False
    QE_K: 5
    QE_TIME: 1
  EVAL_PERIOD: 10
  FLIP_ENABLED: False
  IMS_PER_BATCH: 128
  METRIC: cosine
  PRECISE_BN:
    DATASET: Market1501
    ENABLED: False
    NUM_ITER: 300
  RERANK:
    ENABLED: False
    K1: 20
    K2: 6
    LAMBDA: 0.3
  ROC_ENABLED: False
[03/22 11:49:04 fastreid]: Full config saved to /home/wesine/data_8tb_3/sj/work/reid/fast-reid/logs/market1501/sbs_R50-ibn/config.yaml
[03/22 11:49:04 fastreid.utils.env]: Using a generated random seed 4471883
Baseline( (backbone): ResNet( (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False) (bn1): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True) (layer1): Sequential( (0): Bottleneck( (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() (downsample): Sequential( (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (2): Bottleneck( (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=True, 
track_running_stats=False) (BN): BatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) ) (layer2): Sequential( (0): Bottleneck( (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn2): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() (downsample): Sequential( (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False) (1): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): 
Identity() ) (2): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (3): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) ) (layer3): Sequential( (0): Bottleneck( (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn2): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) 
(se): Identity() (downsample): Sequential( (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False) (1): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (2): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (3): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): 
Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (4): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (5): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): IBN( (IN): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) (BN): BatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) ) (layer4): Sequential( (0): Bottleneck( (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(2048, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() (downsample): Sequential( (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) (2): Bottleneck( (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (se): Identity() ) ) (NL_1): ModuleList() (NL_2): ModuleList( (0): Non_local( (g): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1)) (W): Sequential( (0): Conv2d(1, 512, kernel_size=(1, 1), stride=(1, 1)) (1): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (theta): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1)) (phi): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1)) ) (1): Non_local( (g): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1)) (W): Sequential( (0): Conv2d(1, 512, kernel_size=(1, 1), stride=(1, 1)) (1): BatchNorm(512, eps=1e-05, momentum=0.1, 
affine=True, track_running_stats=True) ) (theta): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1)) (phi): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1)) ) ) (NL_3): ModuleList( (0): Non_local( (g): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) (W): Sequential( (0): Conv2d(1, 1024, kernel_size=(1, 1), stride=(1, 1)) (1): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (theta): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) (phi): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) ) (1): Non_local( (g): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) (W): Sequential( (0): Conv2d(1, 1024, kernel_size=(1, 1), stride=(1, 1)) (1): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (theta): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) (phi): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) ) (2): Non_local( (g): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) (W): Sequential( (0): Conv2d(1, 1024, kernel_size=(1, 1), stride=(1, 1)) (1): BatchNorm(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (theta): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) (phi): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)) ) ) (NL_4): ModuleList() ) (heads): EmbeddingHead( (pool_layer): GeneralizedMeanPoolingP(Parameter containing: tensor([3.], device='cuda:0', requires_grad=True), output_size=1) (bottleneck): Sequential( (0): BatchNorm(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (classifier): CircleSoftmax(in_features=2048, num_classes=0, scale=64, margin=0.35) ) ) [03/22 11:49:08 fastreid.utils.checkpoint]: Loading checkpoint from logs/market1501/sbs_R50-ibn/model_best.pth WARNING [03/22 11:49:09 fastreid.utils.checkpoint]: Skip loading parameter 'heads.classifier.weight' to the model due to incompatible shapes: (751, 2048) in the checkpoint but (0, 2048) in the model! You might want to double check if this is expected. 
[03/22 11:49:09 fastreid.utils.checkpoint]: Some model parameters or buffers are not found in the checkpoint: heads.classifier.weight
[03/22 11:49:09 fastreid.engine.defaults]: Prepare testing set
[03/22 11:49:09 fastreid.data.datasets.bases]: => Loaded Market1501 in csv format:

| subset | # ids | # images | # cameras |
| :--------- | :-------- | :----------- | :------------ |
| query | 750 | 3368 | 6 |
| gallery | 751 | 15913 | 6 |

[03/22 11:49:09 fastreid.evaluation.evaluator]: Start inference on 19281 images
[03/22 11:49:12 fastreid.evaluation.evaluator]: Inference done 11/151. 0.2056 s / batch. ETA=0:00:28
[03/22 11:49:41 fastreid.evaluation.evaluator]: Total inference time: 0:00:30.367810 (0.207999 s / batch per device, on 1 devices)
[03/22 11:49:41 fastreid.evaluation.evaluator]: Total inference pure compute time: 0:00:30 (0.205810 s / batch per device, on 1 devices)
[03/22 11:49:47 fastreid.engine.defaults]: Evaluation results for Market1501 in csv format:
[03/22 11:49:47 fastreid.evaluation.testing]: Evaluation results in csv format:

| Dataset | Rank-1 | Rank-5 | Rank-10 | mAP | mINP | metric |
|---|---|---|---|---|---|---|
| Market1501 | 93.94 | 97.57 | 98.28 | 81.89 | 48.57 | 87.91 |
@L1aoXingyu Hi, L1aoXingyu. If you have time, could you please take a look at this problem? Thanks in advance!
Hello, have you loaded the pretrained model successfully? You can also join the WeChat group: https://github.com/JDAI-CV/fast-reid/issues/354
@sijun-zhou You can first try to use 1 GPU to reproduce the results in the model zoo. If you use 2 GPUs, you need to tune batch size twice.
Hi gmt710, you can take a look at my training log pasted above. It shows that training uses the pretrained model: "[03/22 09:47:04 fastreid.modeling.backbones.resnet]: Loading pretrained model from /root/.cache/torch/checkpoints/resnet50_ibn_a-d9d0bb7b.pth".
I have also pasted the relevant snippet of that log here so you can check it, including the missing keys and the keys that are not used:
######################################################
[03/22 09:47:04 fastreid.modeling.backbones.resnet]: Loading pretrained model from /root/.cache/torch/checkpoints/resnet50_ibn_a-d9d0bb7b.pth
[03/22 09:47:04 fastreid.modeling.backbones.resnet]: Some model parameters or buffers are not found in the checkpoint:
NL_2.0.g.{weight, bias}
NL_2.0.W.0.{weight, bias}
NL_2.0.W.1.{weight, bias, running_mean, running_var}
NL_2.0.theta.{weight, bias}
NL_2.0.phi.{weight, bias}
NL_2.1.g.{weight, bias}
NL_2.1.W.0.{weight, bias}
NL_2.1.W.1.{weight, bias, running_mean, running_var}
NL_2.1.theta.{weight, bias}
NL_2.1.phi.{weight, bias}
NL_3.0.g.{weight, bias}
NL_3.0.W.0.{weight, bias}
NL_3.0.W.1.{weight, bias, running_mean, running_var}
NL_3.0.theta.{weight, bias}
NL_3.0.phi.{weight, bias}
NL_3.1.g.{weight, bias}
NL_3.1.W.0.{weight, bias}
NL_3.1.W.1.{weight, bias, running_mean, running_var}
NL_3.1.theta.{weight, bias}
NL_3.1.phi.{weight, bias}
NL_3.2.g.{weight, bias}
NL_3.2.W.0.{weight, bias}
NL_3.2.W.1.{weight, bias, running_mean, running_var}
NL_3.2.theta.{weight, bias}
NL_3.2.phi.{weight, bias}
[03/22 09:47:04 fastreid.modeling.backbones.resnet]: The checkpoint state_dict contains keys that are not used by the model:
fc.{weight, bias}
######################################################
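For readers who want to run the same kind of check outside the training script, here is a minimal sketch of how missing, unused, and shape-mismatched keys can be derived from two name-to-shape mappings. It is plain Python and not fast-reid's checkpointer; the function name `diff_state_dict` is hypothetical, and the `conv1.weight` shape is only illustrative. The other key names and the classifier shapes are taken from the logs in this thread.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def diff_state_dict(
    model_shapes: Dict[str, Shape], ckpt_shapes: Dict[str, Shape]
) -> Dict[str, List[str]]:
    """Compare parameter names and shapes of a model and a checkpoint.

    Hypothetical helper, not part of fast-reid.
    """
    # Parameters the model expects but the checkpoint does not provide.
    missing = sorted(k for k in model_shapes if k not in ckpt_shapes)
    # Parameters stored in the checkpoint that the model never uses.
    unexpected = sorted(k for k in ckpt_shapes if k not in model_shapes)
    # Parameters present on both sides whose shapes disagree.
    mismatched = sorted(
        k
        for k in model_shapes
        if k in ckpt_shapes and model_shapes[k] != ckpt_shapes[k]
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


# Key names from the logs above; the conv1.weight shape is illustrative only.
model = {
    "conv1.weight": (64, 3, 7, 7),
    "NL_2.0.g.weight": (1, 512, 1, 1),
    "heads.classifier.weight": (0, 2048),
}
checkpoint = {
    "conv1.weight": (64, 3, 7, 7),
    "fc.weight": (1000, 2048),
    "heads.classifier.weight": (751, 2048),
}
report = diff_state_dict(model, checkpoint)
print(report)
```

If almost every backbone key shows up as missing, the weights were probably not loaded at all; a handful of missing `NL_*` keys alongside an unused `fc` entry, as in the log above, is a much smaller discrepancy.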
@L1aoXingyu Hi, L1aoXingyu, I have tested with 1 GPU and got nearly the same result as the one you posted in the model zoo. Thank you very much!
BTW, I don't quite understand what "you need to tune batch size twice" means if I want to use 2 GPUs. Could you please give me more specific guidelines or a description? Thanks a lot!
It means that if you want to train a model with 2 GPUs, you need to change the batch size from 64 to 128.
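The arithmetic behind this advice can be sketched as follows: keep the per-GPU batch constant and multiply by the number of GPUs. This is only an illustration under assumptions. The helper names are hypothetical, and the config key `SOLVER.IMS_PER_BATCH` is assumed to be the batch-size option and should be checked against your config file. The `--num-gpus` flag is the one used in the command at the top of this issue.

```python
from typing import List


def scale_batch_size(base_batch: int, base_gpus: int, num_gpus: int) -> int:
    """Keep the per-GPU batch constant when the GPU count changes."""
    if base_batch % base_gpus != 0:
        raise ValueError("base_batch must be divisible by base_gpus")
    per_gpu = base_batch // base_gpus
    return per_gpu * num_gpus


def build_overrides(
    num_gpus: int,
    base_batch: int = 64,
    base_gpus: int = 1,
    batch_key: str = "SOLVER.IMS_PER_BATCH",  # assumed key name, verify in your config
) -> List[str]:
    """Build extra command-line arguments for a multi-GPU run (hypothetical helper)."""
    total = scale_batch_size(base_batch, base_gpus, num_gpus)
    return ["--num-gpus", str(num_gpus), batch_key, str(total)]


print(scale_batch_size(64, 1, 2))
print(" ".join(build_overrides(2)))
```

With the values from the reply above (64 on 1 GPU), two GPUs give a total batch of 128.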
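@L1aoXingyu Two questions about multi-GPU training and testing with the latest code: 1. Training on 2 GPUs with the batch size set to 256 works fine, but testing returns an empty result, while single-GPU testing works normally. 2. About hyperparameters: are Freeze and warmup specified in iterations? Should I compute the iteration count from my own dataset size and batch size, typically the number of iterations that corresponds to 10 epochs? The other hyperparameters seem to be given in epochs and only these two seem to be given in iterations, which is ambiguous. Could you clarify?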
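The conversion asked about in question 2 can be sketched as follows. This is a minimal illustration, not fast-reid code: the function names are hypothetical, the dataset size of 10000 images is a made-up example, and the last partial batch is assumed to be kept (a sampler that drops it would round down instead).

```python
import math


def iters_per_epoch(num_images: int, batch_size: int) -> int:
    """Iterations needed to see every image once, keeping the last partial batch."""
    return math.ceil(num_images / batch_size)


def epochs_to_iters(epochs: int, num_images: int, batch_size: int) -> int:
    """Convert a schedule length given in epochs into a number of iterations."""
    return epochs * iters_per_epoch(num_images, batch_size)


# Hypothetical dataset of 10000 images with the batch size of 256 mentioned above.
warmup_iters = epochs_to_iters(10, num_images=10000, batch_size=256)
print(warmup_iters)
```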
@sky186 WARMUP_ITER and MAX_EPOCH
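@L1aoXingyu Hello, is there any difference between the latest code and the earlier version in the pipeline from data processing to feature extraction? With the earlier version I extracted a feature-extraction interface from the code, and it worked with correct test results. After switching to a model and config trained with the latest version, the test results on the extracted features are completely wrong, although all other settings are identical. There is a lot of code and I don't know how to find what may have changed.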
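@sky186 Could it be that the model was not actually loaded? Also, I tested it and multi-GPU testing does run; in multi-GPU testing, results are returned only in the main process.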
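@L1aoXingyu 1. Yes, after checking, the model parameters were not actually being loaded. I fixed that and it works now, thank you very much! 2. Thanks for your reply. With multiple GPUs the test result comes back empty. In defaults.py, def test(cls, cfg, model, evaluators=None) has a test results variable, and with multiple GPUs it is returned empty. Roughly where does the main process return the results you mentioned?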
@sky186 Where are you reading the test results from? The code at https://github.com/JDAI-CV/fast-reid/blob/25cfa88fd97fbef55abcdd1bf69f2db822306bff/fastreid/evaluation/reid_evaluation.py#L55 shows that a non-main process returns an empty {}.
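The behaviour described here can be sketched as a simulation, not as the fast-reid implementation: each process runs the same evaluation function, but only rank 0 returns metrics, so a caller on any other rank sees an empty dict. The function names are hypothetical; the metric values are the ones from the evaluation table earlier in this thread.

```python
from typing import Dict, List


def evaluate(rank: int, metrics: Dict[str, float]) -> Dict[str, float]:
    """Return metrics only on the main process (rank 0), else an empty dict."""
    if rank != 0:
        return {}
    return dict(metrics)


def results_per_rank(world_size: int, metrics: Dict[str, float]) -> List[Dict[str, float]]:
    """Simulate what each process would receive from evaluate()."""
    return [evaluate(rank, metrics) for rank in range(world_size)]


metrics = {"Rank-1": 93.94, "mAP": 81.89}
per_rank = results_per_rank(2, metrics)
print(per_rank)
```

In practice this means code that consumes the results should either run on the main process only or guard against an empty dict.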
My environment: Python 3.6, PyTorch 1.2.0, CUDA 10.0.130, apex 0.1, GPU: 2 × 2080 Ti
I trained on the Market1501 dataset with two 2080 Ti GPUs and all default settings of sbs_R50-ibn.yml, but I cannot reproduce the results. My best results are 92.64% top-1 and 78.78% mAP respectively, which is far below the model zoo's 95.7% (top-1) and 89.3% (mAP).