arsity opened this issue 2 hours ago (status: Open)
The config file is as follows; I modified GPUS, IMAGES_PER_GPU, and WORKERS.
AUTO_RESUME: True
DATA_DIR: ''
GPUS: [0]
LOG_DIR: log
OUTPUT_DIR: output
PRINT_FREQ: 25
VERBOSE: False
MULTIPROCESSING_DISTRIBUTED: True
CUDNN:
  BENCHMARK: True
  DETERMINISTIC: False
  ENABLED: True
DATASET:
  DATASET: exlpose_kpt
  DATA_FORMAT: png
  FLIP: 0.5
  INPUT_SIZE: 512
  OUTPUT_SIZE: 128
  MAX_NUM_PEOPLE: 30
  MAX_ROTATION: 30
  MAX_SCALE: 1.5
  SCALE_TYPE: 'short'
  MAX_TRANSLATE: 40
  MIN_SCALE: 0.75
  NUM_JOINTS: 14
  ROOT: '../datasets/ExLPose'
  TEST: all
  TRAIN: train
  OFFSET_RADIUS: 4
  SIGMA: 2.0
  CENTER_SIGMA: 4.0
  BG_WEIGHT: 0.1
  ADJUST_ELLA: True
LOSS:
  WITH_HEATMAPS_LOSS: True
  HEATMAPS_LOSS_FACTOR: 1.0
  WITH_OFFSETS_LOSS: True
  OFFSETS_LOSS_FACTOR: 0.03
  WITH_AE_LOSS: True
  AE_LOSS_TYPE: exp
  PUSH_LOSS_FACTOR: 0.001
  PULL_LOSS_FACTOR: 0.001
MODEL:
  SPEC:
    FINAL_CONV_KERNEL: 1
    PRETRAINED_LAYERS: ['*']
    STAGES:
      NUM_STAGES: 3
      NUM_MODULES:
      - 1
      - 4
      - 3
      NUM_BRANCHES:
      - 2
      - 3
      - 4
      BLOCK:
      - BASIC
      - BASIC
      - BASIC
      NUM_BLOCKS:
      - [4, 4]
      - [4, 4, 4]
      - [4, 4, 4, 4]
      NUM_CHANNELS:
      - [32, 64]
      - [32, 64, 128]
      - [32, 64, 128, 256]
      FUSE_METHOD:
      - SUM
      - SUM
      - SUM
    HEAD_HEATMAP:
      BLOCK: BASIC
      NUM_BLOCKS: 1
      NUM_CHANNELS: 32
      DILATION_RATE: 1
    HEAD_OFFSET:
      BLOCK: ADAPTIVE
      NUM_BLOCKS: 2
      NUM_CHANNELS_PERKPT: 15
      DILATION_RATE: 1
  INIT_WEIGHTS: True
  NAME: hrnet_main
  NUM_JOINTS: 14
  PRETRAINED_MAIN: 'model/imagenet/hrnet_w32-36af842e.pth'
  PRETRAINED_COMP: 'model/imagenet/hrnet_w32-36af842e.pth'
  TAG_PER_JOINT: True
TEST:
  FLIP_TEST: True
  IMAGES_PER_GPU: 1
  MODEL_FILE: ''
  SCALE_FACTOR: [1]
  NMS_THRE: 0.15
  NMS_NUM_THRE: 8
  KEYPOINT_THRESHOLD: 0.01
  ADJUST_THRESHOLD: 0.05
  MAX_ABSORB_DISTANCE: 75
  GUASSIAN_KERNEL: 6
  DECREASE: 0.9
  DETECTION_THRESHOLD: 0.1
TRAIN:
  BEGIN_EPOCH: 0
  CHECKPOINT: ''
  END_EPOCH: 10
  GAMMA1: 0.99
  GAMMA2: 0.0
  IMAGES_PER_GPU: 10
  LR: 0.001
  LR_FACTOR: 0.1
  LR_STEP: [90, 120]
  MOMENTUM: 0.9
  NESTEROV: False
  OPTIMIZER: adam
  RESUME: False
  SHUFFLE: True
  WD: 0.0001
  MAX_NUM_CENTERS: 15
  TEACHER_THRESHOLD: 0.9
  STAGE: 'KA'
WORKERS: 8
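As a quick sanity check on the modified fields, here is a minimal sketch of what they imply. The values are copied from the config above. The formulas are the usual conventions (global batch = number of GPUs times images per GPU, LR_STEP read as epoch milestones) and are assumptions, not something confirmed from the training code.

```python
# Values copied from the config above.
gpus = [0]               # GPUS
images_per_gpu = 10      # TRAIN.IMAGES_PER_GPU
end_epoch = 10           # TRAIN.END_EPOCH
lr_step = [90, 120]      # TRAIN.LR_STEP

# Assumption: each GPU process loads IMAGES_PER_GPU images per iteration.
global_batch = len(gpus) * images_per_gpu

# Assumption: LR_STEP entries are epoch milestones for a step scheduler,
# so only milestones below END_EPOCH can take effect in this run.
lr_drops_reached = [epoch for epoch in lr_step if epoch < end_epoch]

print("global batch size:", global_batch)
print("LR milestones reached before END_EPOCH:", lr_drops_reached)
```

Under these assumptions the run uses 10 images per iteration, and the learning rate stays at 0.001 for all 10 epochs because neither milestone is reached.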
And this is the training log.
Namespace(cfg='experiments/exlpose/KA_stage_config.yaml', opts=['DATASET.ROOT', '/home/haopeng/datasets/ExLPose', 'MODEL.PRETRAINED_MAIN', 'output/exlpose_kpt/hrnet_main/PT_stage_config/final_state0.pth.tar', 'MODEL.PRETRAINED_COMP', 'output/exlpose_kpt/hrnet_comp/PT_stage_config/final_state0.pth.tar'], gpu=None, world_size=1, dist_url='tcp://127.0.0.1:23456', rank=0)
AUTO_RESUME: True
CUDNN:
  BENCHMARK: True
  DETERMINISTIC: False
  ENABLED: True
DATASET:
  ADJUST_ELLA: True
  BG_WEIGHT: 0.1
  CENTER_SIGMA: 4.0
  DATASET: exlpose_kpt
  DATASET_TEST:
  DATA_FORMAT: png
  FLIP: 0.5
  INPUT_SIZE: 512
  MAX_NUM_PEOPLE: 30
  MAX_ROTATION: 30
  MAX_SCALE: 1.5
  MAX_TRANSLATE: 40
  MIN_SCALE: 0.75
  NIGHTTIME: False
  NUM_JOINTS: 14
  OFFSET_RADIUS: 4
  OUTPUT_SIZE: 128
  ROOT: /home/haopeng/datasets/ExLPose
  SCALE_TYPE: short
  SIGMA: 2.0
  TEST: all
  TRAIN: train
  WITH_CENTER: False
DATA_DIR:
DIST_BACKEND: nccl
GPUS: (0,)
LOG_DIR: log
LOSS:
  AE_LOSS_TYPE: exp
  HEATMAPS_LOSS_FACTOR: 1.0
  OFFSETS_LOSS_FACTOR: 0.03
  PULL_LOSS_FACTOR: 0.001
  PUSH_LOSS_FACTOR: 0.001
  WITH_AE_LOSS: True
  WITH_HEATMAPS_LOSS: True
  WITH_OFFSETS_LOSS: True
MODEL:
  INIT_WEIGHTS: True
  NAME: hrnet_main
  NUM_JOINTS: 14
  PRETRAINED:
  PRETRAINED_CLASS:
  PRETRAINED_COMP: output/exlpose_kpt/hrnet_comp/PT_stage_config/final_state0.pth.tar
  PRETRAINED_MAIN: output/exlpose_kpt/hrnet_main/PT_stage_config/final_state0.pth.tar
  SPEC:
    FINAL_CONV_KERNEL: 1
    HEAD_HEATMAP:
      BLOCK: BASIC
      DILATION_RATE: 1
      NUM_BLOCKS: 1
      NUM_CHANNELS: 32
    HEAD_OFFSET:
      BLOCK: ADAPTIVE
      DILATION_RATE: 1
      NUM_BLOCKS: 2
      NUM_CHANNELS_PERKPT: 15
    PRETRAINED_LAYERS: ['*']
    STAGES:
      BLOCK: ['BASIC', 'BASIC', 'BASIC']
      FUSE_METHOD: ['SUM', 'SUM', 'SUM']
      NUM_BLOCKS: [[4, 4], [4, 4, 4], [4, 4, 4, 4]]
      NUM_BRANCHES: [2, 3, 4]
      NUM_CHANNELS: [[32, 64], [32, 64, 128], [32, 64, 128, 256]]
      NUM_MODULES: [1, 4, 3]
      NUM_STAGES: 3
  TAG_PER_JOINT: True
MULTIPROCESSING_DISTRIBUTED: True
NAME: regression
OUTPUT_DIR: output
PIN_MEMORY: True
PRINT_FREQ: 25
RANK: 0
RESCORE:
  BATCHSIZE: 1024
  DATA_FILE: data/rescore_data/rescore_dataset_train_exlpose_kpt
  END_EPOCH: 20
  GET_DATA: False
  HIDDEN_LAYER: 256
  LR: 0.001
  MODEL_FILE: model/rescore/final_rescore_exlpose_kpt.pth
  VALID: True
TEST:
  ADJUST: True
  ADJUST_THRESHOLD: 0.05
  DECREASE: 0.9
  DETECTION_THRESHOLD: 0.1
  FLIP_TEST: True
  GUASSIAN_KERNEL: 6
  IGNORE_CENTER: True
  IGNORE_TOO_MUCH: False
  IMAGES_PER_GPU: 1
  KEYPOINT_THRESHOLD: 0.01
  LOG_PROGRESS: True
  MATCH_HMP: False
  MAX_ABSORB_DISTANCE: 75
  MODEL_COMP_FILE:
  MODEL_FILE:
  NMS_KERNEL: 3
  NMS_NUM_THRE: 8
  NMS_PADDING: 1
  NMS_THRE: 0.15
  POOL_THRESHOLD1: 300
  POOL_THRESHOLD2: 200
  PROJECT2IMAGE: False
  REFINE: True
  SCALE_FACTOR: [1]
  TAG_THRESHOLD: 1.0
  USE_DETECTION_VAL: True
  WITH_AE: True
  WITH_HEATMAPS: True
TRAIN:
  BEGIN_EPOCH: 0
  CHECKPOINT:
  END_EPOCH: 10
  GAMMA1: 0.99
  GAMMA2: 0.0
  IMAGES_PER_GPU: 10
  LR: 0.001
  LR_FACTOR: 0.1
  LR_STEP: [90, 120]
  MAX_NUM_CENTERS: 15
  MOMENTUM: 0.9
  MULTI_SCALE: False
  NESTEROV: False
  OPTIMIZER: adam
  RESUME: False
  SEED: 42
  SEMI_EPOCH: 140
  SHUFFLE: True
  STAGE: KA
  STUDENT_THRESHOLD: 100.0
  SUP_EPOCH: 20
  TEACHER_THRESHOLD: 0.9
  WD: 0.0001
VERBOSE: False
WORKERS: 8
Added key: store_based_barrier_key:1 to store for rank: 0
Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
=> classes: ['__background__', 'person']
init weights
=> init weights from normal distribution
=> loading pretrained model output/exlpose_kpt/hrnet_main/PT_stage_config/final_state0.pth.tar
Init KA training
Epoch: [0][0/207] Time: 21.199s (21.199s) Speed: 0.5 samples/s Data: 14.976s (14.976s) lsup: 1.724e-04 (1.724e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 1.724e-04 (1.724e-04)
Reducer buckets have been rebuilt in this iteration.
Epoch: [0][25/207] Time: 1.013s (1.807s) Speed: 9.9 samples/s Data: 0.000s (0.576s) lsup: 2.739e-04 (2.792e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.739e-04 (2.792e-04)
Epoch: [0][50/207] Time: 0.988s (1.416s) Speed: 10.1 samples/s Data: 0.000s (0.294s) lsup: 2.953e-04 (3.089e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.953e-04 (3.089e-04)
Epoch: [0][75/207] Time: 1.017s (1.287s) Speed: 9.8 samples/s Data: 0.000s (0.197s) lsup: 5.362e-04 (3.171e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 5.362e-04 (3.171e-04)
Epoch: [0][100/207] Time: 1.028s (1.217s) Speed: 9.7 samples/s Data: 0.000s (0.149s) lsup: 2.748e-04 (3.260e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.748e-04 (3.260e-04)
Epoch: [0][125/207] Time: 0.996s (1.177s) Speed: 10.0 samples/s Data: 0.000s (0.119s) lsup: 1.503e-04 (3.299e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 1.503e-04 (3.299e-04)
Epoch: [0][150/207] Time: 1.055s (1.151s) Speed: 9.5 samples/s Data: 0.000s (0.099s) lsup: 4.530e-04 (3.296e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.530e-04 (3.296e-04)
Epoch: [0][175/207] Time: 1.088s (1.132s) Speed: 9.2 samples/s Data: 0.000s (0.085s) lsup: 4.980e-04 (3.346e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.980e-04 (3.346e-04)
Epoch: [0][200/207] Time: 0.943s (1.114s) Speed: 10.6 samples/s Data: 0.000s (0.075s) lsup: 3.876e-04 (3.349e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.876e-04 (3.349e-04)
=> saving checkpoint to output/exlpose_kpt/hrnet_main/KA_stage_config
Epoch: [1][0/207] Time: 16.866s (16.866s) Speed: 0.6 samples/s Data: 15.747s (15.747s) lsup: 2.925e-04 (2.925e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.925e-04 (2.925e-04)
Epoch: [1][25/207] Time: 1.039s (1.672s) Speed: 9.6 samples/s Data: 0.000s (0.606s) lsup: 3.237e-04 (3.019e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.237e-04 (3.019e-04)
Epoch: [1][50/207] Time: 1.017s (1.349s) Speed: 9.8 samples/s Data: 0.000s (0.309s) lsup: 2.102e-04 (3.249e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.102e-04 (3.249e-04)
Epoch: [1][75/207] Time: 1.018s (1.237s) Speed: 9.8 samples/s Data: 0.000s (0.208s) lsup: 2.547e-04 (3.252e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.547e-04 (3.252e-04)
Epoch: [1][100/207] Time: 1.004s (1.179s) Speed: 10.0 samples/s Data: 0.000s (0.156s) lsup: 6.048e-04 (3.457e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 6.048e-04 (3.457e-04)
Epoch: [1][125/207] Time: 1.022s (1.147s) Speed: 9.8 samples/s Data: 0.000s (0.125s) lsup: 3.222e-04 (3.530e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.222e-04 (3.530e-04)
Epoch: [1][150/207] Time: 1.012s (1.124s) Speed: 9.9 samples/s Data: 0.000s (0.105s) lsup: 3.378e-04 (3.663e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.378e-04 (3.663e-04)
Epoch: [1][175/207] Time: 1.184s (1.110s) Speed: 8.4 samples/s Data: 0.000s (0.090s) lsup: 3.578e-04 (3.722e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.578e-04 (3.722e-04)
Epoch: [1][200/207] Time: 0.935s (1.094s) Speed: 10.7 samples/s Data: 0.000s (0.079s) lsup: 7.138e-04 (3.787e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 7.138e-04 (3.787e-04)
Epoch: [2][0/207] Time: 16.714s (16.714s) Speed: 0.6 samples/s Data: 15.623s (15.623s) lsup: 5.241e-04 (5.241e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 5.241e-04 (5.241e-04)
Epoch: [2][25/207] Time: 1.032s (1.663s) Speed: 9.7 samples/s Data: 0.000s (0.601s) lsup: 4.232e-04 (4.212e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.232e-04 (4.212e-04)
Epoch: [2][50/207] Time: 1.031s (1.350s) Speed: 9.7 samples/s Data: 0.000s (0.307s) lsup: 4.934e-04 (4.208e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.934e-04 (4.208e-04)
Epoch: [2][75/207] Time: 1.007s (1.240s) Speed: 9.9 samples/s Data: 0.000s (0.206s) lsup: 4.912e-04 (4.201e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.912e-04 (4.201e-04)
Epoch: [2][100/207] Time: 1.049s (1.183s) Speed: 9.5 samples/s Data: 0.000s (0.155s) lsup: 3.278e-04 (4.051e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.278e-04 (4.051e-04)
Epoch: [2][125/207] Time: 0.988s (1.151s) Speed: 10.1 samples/s Data: 0.000s (0.124s) lsup: 2.095e-04 (4.015e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.095e-04 (4.015e-04)
Epoch: [2][150/207] Time: 1.020s (1.129s) Speed: 9.8 samples/s Data: 0.000s (0.104s) lsup: 3.681e-04 (3.985e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.681e-04 (3.985e-04)
Epoch: [2][175/207] Time: 0.992s (1.113s) Speed: 10.1 samples/s Data: 0.000s (0.089s) lsup: 4.917e-04 (3.967e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.917e-04 (3.967e-04)
Epoch: [2][200/207] Time: 0.934s (1.097s) Speed: 10.7 samples/s Data: 0.000s (0.078s) lsup: 3.803e-04 (3.941e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.803e-04 (3.941e-04)
Epoch: [3][0/207] Time: 17.278s (17.278s) Speed: 0.6 samples/s Data: 16.187s (16.187s) lsup: 4.053e-04 (4.053e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.053e-04 (4.053e-04)
Epoch: [3][25/207] Time: 1.003s (1.694s) Speed: 10.0 samples/s Data: 0.000s (0.623s) lsup: 5.318e-04 (3.912e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 5.318e-04 (3.912e-04)
Epoch: [3][50/207] Time: 1.011s (1.362s) Speed: 9.9 samples/s Data: 0.000s (0.318s) lsup: 3.888e-04 (3.658e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.888e-04 (3.658e-04)
Epoch: [3][75/207] Time: 1.006s (1.247s) Speed: 9.9 samples/s Data: 0.000s (0.213s) lsup: 3.907e-04 (3.753e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.907e-04 (3.753e-04)
Epoch: [3][100/207] Time: 1.013s (1.189s) Speed: 9.9 samples/s Data: 0.000s (0.161s) lsup: 3.921e-04 (3.816e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.921e-04 (3.816e-04)
Epoch: [3][125/207] Time: 1.000s (1.154s) Speed: 10.0 samples/s Data: 0.000s (0.129s) lsup: 4.908e-04 (3.784e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.908e-04 (3.784e-04)
Epoch: [3][150/207] Time: 1.017s (1.131s) Speed: 9.8 samples/s Data: 0.000s (0.108s) lsup: 2.881e-04 (3.716e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.881e-04 (3.716e-04)
Epoch: [3][175/207] Time: 1.008s (1.114s) Speed: 9.9 samples/s Data: 0.000s (0.092s) lsup: 4.187e-04 (3.708e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.187e-04 (3.708e-04)
Epoch: [3][200/207] Time: 0.945s (1.099s) Speed: 10.6 samples/s Data: 0.000s (0.081s) lsup: 3.205e-04 (3.741e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.205e-04 (3.741e-04)
Epoch: [4][0/207] Time: 16.245s (16.245s) Speed: 0.6 samples/s Data: 15.181s (15.181s) lsup: 5.034e-04 (5.034e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 5.034e-04 (5.034e-04)
Epoch: [4][25/207] Time: 1.038s (1.655s) Speed: 9.6 samples/s Data: 0.000s (0.584s) lsup: 2.679e-04 (3.924e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.679e-04 (3.924e-04)
Epoch: [4][50/207] Time: 1.035s (1.346s) Speed: 9.7 samples/s Data: 0.000s (0.298s) lsup: 4.718e-04 (3.861e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.718e-04 (3.861e-04)
Epoch: [4][75/207] Time: 0.990s (1.236s) Speed: 10.1 samples/s Data: 0.000s (0.200s) lsup: 4.560e-04 (3.933e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.560e-04 (3.933e-04)
Epoch: [4][100/207] Time: 1.009s (1.184s) Speed: 9.9 samples/s Data: 0.000s (0.151s) lsup: 3.529e-04 (4.047e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.529e-04 (4.047e-04)
Epoch: [4][125/207] Time: 1.006s (1.150s) Speed: 9.9 samples/s Data: 0.000s (0.121s) lsup: 4.301e-04 (3.967e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.301e-04 (3.967e-04)
Epoch: [4][150/207] Time: 1.041s (1.128s) Speed: 9.6 samples/s Data: 0.000s (0.101s) lsup: 3.382e-04 (4.020e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.382e-04 (4.020e-04)
Epoch: [4][175/207] Time: 0.999s (1.113s) Speed: 10.0 samples/s Data: 0.000s (0.087s) lsup: 4.364e-04 (4.013e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.364e-04 (4.013e-04)
Epoch: [4][200/207] Time: 0.947s (1.098s) Speed: 10.6 samples/s Data: 0.000s (0.076s) lsup: 4.655e-04 (3.991e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.655e-04 (3.991e-04)
Epoch: [5][0/207] Time: 17.489s (17.489s) Speed: 0.6 samples/s Data: 16.434s (16.434s) lsup: 3.967e-04 (3.967e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.967e-04 (3.967e-04)
Epoch: [5][25/207] Time: 1.029s (1.710s) Speed: 9.7 samples/s Data: 0.000s (0.632s) lsup: 4.002e-04 (3.675e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.002e-04 (3.675e-04)
Epoch: [5][50/207] Time: 1.007s (1.368s) Speed: 9.9 samples/s Data: 0.000s (0.323s) lsup: 2.647e-04 (3.837e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.647e-04 (3.837e-04)
Epoch: [5][75/207] Time: 0.999s (1.252s) Speed: 10.0 samples/s Data: 0.000s (0.217s) lsup: 4.906e-04 (3.846e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.906e-04 (3.846e-04)
Epoch: [5][100/207] Time: 1.020s (1.194s) Speed: 9.8 samples/s Data: 0.000s (0.163s) lsup: 4.875e-04 (3.887e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.875e-04 (3.887e-04)
Epoch: [5][125/207] Time: 1.024s (1.159s) Speed: 9.8 samples/s Data: 0.000s (0.131s) lsup: 3.891e-04 (3.847e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.891e-04 (3.847e-04)
Epoch: [5][150/207] Time: 1.052s (1.137s) Speed: 9.5 samples/s Data: 0.000s (0.109s) lsup: 4.747e-04 (3.849e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.747e-04 (3.849e-04)
Epoch: [5][175/207] Time: 1.123s (1.121s) Speed: 8.9 samples/s Data: 0.000s (0.094s) lsup: 5.802e-04 (3.916e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 5.802e-04 (3.916e-04)
Epoch: [5][200/207] Time: 0.937s (1.104s) Speed: 10.7 samples/s Data: 0.000s (0.082s) lsup: 4.508e-04 (3.909e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.508e-04 (3.909e-04)
Epoch: [6][0/207] Time: 17.037s (17.037s) Speed: 0.6 samples/s Data: 15.794s (15.794s) lsup: 3.519e-04 (3.519e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.519e-04 (3.519e-04)
Epoch: [6][25/207] Time: 1.038s (1.685s) Speed: 9.6 samples/s Data: 0.000s (0.608s) lsup: 3.343e-04 (3.548e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.343e-04 (3.548e-04)
Epoch: [6][50/207] Time: 1.004s (1.354s) Speed: 10.0 samples/s Data: 0.000s (0.310s) lsup: 3.328e-04 (3.548e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.328e-04 (3.548e-04)
Epoch: [6][75/207] Time: 1.021s (1.245s) Speed: 9.8 samples/s Data: 0.000s (0.208s) lsup: 6.638e-04 (3.578e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 6.638e-04 (3.578e-04)
Epoch: [6][100/207] Time: 1.009s (1.187s) Speed: 9.9 samples/s Data: 0.000s (0.157s) lsup: 2.354e-04 (3.589e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.354e-04 (3.589e-04)
Epoch: [6][125/207] Time: 1.007s (1.152s) Speed: 9.9 samples/s Data: 0.000s (0.126s) lsup: 3.856e-04 (3.627e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.856e-04 (3.627e-04)
Epoch: [6][150/207] Time: 1.005s (1.131s) Speed: 9.9 samples/s Data: 0.000s (0.105s) lsup: 2.775e-04 (3.703e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.775e-04 (3.703e-04)
Epoch: [6][175/207] Time: 0.997s (1.114s) Speed: 10.0 samples/s Data: 0.000s (0.090s) lsup: 2.545e-04 (3.729e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.545e-04 (3.729e-04)
Epoch: [6][200/207] Time: 0.944s (1.098s) Speed: 10.6 samples/s Data: 0.000s (0.079s) lsup: 4.128e-04 (3.834e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.128e-04 (3.834e-04)
Epoch: [7][0/207] Time: 17.504s (17.504s) Speed: 0.6 samples/s Data: 15.984s (15.984s) lsup: 2.928e-04 (2.928e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.928e-04 (2.928e-04)
Epoch: [7][25/207] Time: 1.018s (1.695s) Speed: 9.8 samples/s Data: 0.000s (0.615s) lsup: 2.325e-04 (3.764e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.325e-04 (3.764e-04)
Epoch: [7][50/207] Time: 1.030s (1.362s) Speed: 9.7 samples/s Data: 0.001s (0.314s) lsup: 3.711e-04 (3.972e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.711e-04 (3.972e-04)
Epoch: [7][75/207] Time: 1.022s (1.250s) Speed: 9.8 samples/s Data: 0.000s (0.211s) lsup: 3.325e-04 (3.823e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.325e-04 (3.823e-04)
Epoch: [7][100/207] Time: 1.015s (1.191s) Speed: 9.9 samples/s Data: 0.000s (0.159s) lsup: 3.803e-04 (3.808e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.803e-04 (3.808e-04)
Epoch: [7][125/207] Time: 1.015s (1.156s) Speed: 9.9 samples/s Data: 0.000s (0.127s) lsup: 4.398e-04 (3.806e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.398e-04 (3.806e-04)
Epoch: [7][150/207] Time: 1.030s (1.134s) Speed: 9.7 samples/s Data: 0.000s (0.106s) lsup: 4.581e-04 (3.785e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.581e-04 (3.785e-04)
Epoch: [7][175/207] Time: 1.010s (1.118s) Speed: 9.9 samples/s Data: 0.000s (0.091s) lsup: 4.852e-04 (3.785e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.852e-04 (3.785e-04)
Epoch: [7][200/207] Time: 0.944s (1.103s) Speed: 10.6 samples/s Data: 0.000s (0.080s) lsup: 4.063e-04 (3.803e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.063e-04 (3.803e-04)
Epoch: [8][0/207] Time: 17.594s (17.594s) Speed: 0.6 samples/s Data: 16.118s (16.118s) lsup: 2.852e-04 (2.852e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.852e-04 (2.852e-04)
Epoch: [8][25/207] Time: 1.007s (1.693s) Speed: 9.9 samples/s Data: 0.000s (0.620s) lsup: 4.150e-04 (3.500e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.150e-04 (3.500e-04)
Epoch: [8][50/207] Time: 1.006s (1.358s) Speed: 9.9 samples/s Data: 0.000s (0.316s) lsup: 3.949e-04 (3.604e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.949e-04 (3.604e-04)
Epoch: [8][75/207] Time: 1.022s (1.248s) Speed: 9.8 samples/s Data: 0.000s (0.212s) lsup: 2.235e-04 (3.746e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.235e-04 (3.746e-04)
Epoch: [8][100/207] Time: 1.006s (1.189s) Speed: 9.9 samples/s Data: 0.000s (0.160s) lsup: 3.158e-04 (3.827e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.158e-04 (3.827e-04)
Epoch: [8][125/207] Time: 1.037s (1.155s) Speed: 9.6 samples/s Data: 0.000s (0.128s) lsup: 4.432e-04 (3.860e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.432e-04 (3.860e-04)
Epoch: [8][150/207] Time: 0.998s (1.136s) Speed: 10.0 samples/s Data: 0.000s (0.107s) lsup: 4.838e-04 (3.888e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.838e-04 (3.888e-04)
Epoch: [8][175/207] Time: 1.007s (1.118s) Speed: 9.9 samples/s Data: 0.000s (0.092s) lsup: 4.374e-04 (3.937e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.374e-04 (3.937e-04)
Epoch: [8][200/207] Time: 0.940s (1.103s) Speed: 10.6 samples/s Data: 0.000s (0.080s) lsup: 2.457e-04 (3.913e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 2.457e-04 (3.913e-04)
Epoch: [9][0/207] Time: 16.965s (16.965s) Speed: 0.6 samples/s Data: 15.899s (15.899s) lsup: 3.692e-04 (3.692e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.692e-04 (3.692e-04)
Epoch: [9][25/207] Time: 1.013s (1.669s) Speed: 9.9 samples/s Data: 0.000s (0.612s) lsup: 3.115e-04 (3.443e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.115e-04 (3.443e-04)
Epoch: [9][50/207] Time: 1.017s (1.348s) Speed: 9.8 samples/s Data: 0.000s (0.312s) lsup: 4.662e-04 (3.820e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.662e-04 (3.820e-04)
Epoch: [9][75/207] Time: 1.008s (1.240s) Speed: 9.9 samples/s Data: 0.000s (0.210s) lsup: 3.342e-04 (3.691e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.342e-04 (3.691e-04)
Epoch: [9][100/207] Time: 1.015s (1.183s) Speed: 9.9 samples/s Data: 0.000s (0.158s) lsup: 3.091e-04 (3.663e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.091e-04 (3.663e-04)
Epoch: [9][125/207] Time: 1.013s (1.149s) Speed: 9.9 samples/s Data: 0.000s (0.126s) lsup: 3.104e-04 (3.690e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.104e-04 (3.690e-04)
Epoch: [9][150/207] Time: 1.008s (1.127s) Speed: 9.9 samples/s Data: 0.000s (0.106s) lsup: 3.988e-04 (3.712e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.988e-04 (3.712e-04)
Epoch: [9][175/207] Time: 1.024s (1.111s) Speed: 9.8 samples/s Data: 0.000s (0.091s) lsup: 3.964e-04 (3.703e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 3.964e-04 (3.703e-04)
Epoch: [9][200/207] Time: 0.941s (1.097s) Speed: 10.6 samples/s Data: 0.000s (0.079s) lsup: 4.060e-04 (3.721e-04) lunsup: 0.000e+00 (0.000e+00) total_loss: 4.060e-04 (3.721e-04)
saving final model state to output/exlpose_kpt/hrnet_main/KA_stage_config/final_state0.pth.tar
In KA stage training, the lunsup term is always zero in the log. When I used the final checkpoint after training, the performance was abnormal.
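To confirm the observation over a full log file rather than by eye, a minimal sketch like the following could be used. It assumes the iteration lines keep the exact format shown in the log above. The helper names parse_log_line and lunsup_always_zero are hypothetical and not part of the repository.

```python
import re

# Iteration header such as "Epoch: [0][25/207]".
STEP_RE = re.compile(r"Epoch: \[(\d+)\]\[(\d+)/(\d+)\]")
# Loss fields printed as "name: current (running average)".
LOSS_RE = re.compile(r"\b(lsup|lunsup|total_loss): ([0-9.eE+-]+) \(([0-9.eE+-]+)\)")


def parse_log_line(line):
    """Return a dict for one iteration line, or None for any other line."""
    step = STEP_RE.search(line)
    if step is None:
        return None
    record = {"epoch": int(step.group(1)), "iter": int(step.group(2))}
    for name, current, _running_avg in LOSS_RE.findall(line):
        record[name] = float(current)
    return record


def lunsup_always_zero(lines):
    """True if at least one iteration line exists and every lunsup value is 0."""
    records = [r for r in map(parse_log_line, lines) if r is not None]
    return bool(records) and all(r.get("lunsup") == 0.0 for r in records)


# Two iteration lines and one non-iteration line copied from the log above.
sample = [
    "Epoch: [0][0/207] Time: 21.199s (21.199s) Speed: 0.5 samples/s "
    "Data: 14.976s (14.976s) lsup: 1.724e-04 (1.724e-04) "
    "lunsup: 0.000e+00 (0.000e+00) total_loss: 1.724e-04 (1.724e-04)",
    "Reducer buckets have been rebuilt in this iteration.",
    "Epoch: [9][200/207] Time: 0.941s (1.097s) Speed: 10.6 samples/s "
    "Data: 0.000s (0.079s) lsup: 4.060e-04 (3.721e-04) "
    "lunsup: 0.000e+00 (0.000e+00) total_loss: 4.060e-04 (3.721e-04)",
]

print(lunsup_always_zero(sample))
```

For a real run, the lines would come from the log file (for example `open(path).read().splitlines()`, with `path` pointing at the training log). In the excerpt above, total_loss equals lsup on every line, which is consistent with the unsupervised term contributing nothing.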