mary-0830 closed 1 year ago
Second log:
[09/05 15:29:54] detectron2 INFO: Rank of current process: 0. World size: 1
[09/05 15:29:58] detectron2 INFO: Environment info:
---------------------- -----------------------------------------------------------------------------------
sys.platform linux
Python 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0]
numpy 1.22.4
detectron2 0.6 @/home/ljj/anaconda3/envs/tuner/lib/python3.8/site-packages/detectron2
Compiler GCC 7.3
CUDA compiler CUDA 11.1
detectron2 arch flags 3.7, 5.0, 5.2, 6.0, 6.1, 7.0, 7.5, 8.0, 8.6
DETECTRON2_ENV_MODULE <not set>
PyTorch 1.8.0+cu111 @/home/ljj/anaconda3/envs/tuner/lib/python3.8/site-packages/torch
PyTorch debug build False
GPU available Yes
GPU 0 NVIDIA A100-PCIE-80GB (arch=8.0)
Driver version 470.42.01
CUDA_HOME /usr/local/cuda-11.4
Pillow 8.2.0
torchvision 0.9.0+cu111 @/home/ljj/anaconda3/envs/tuner/lib/python3.8/site-packages/torchvision
torchvision arch flags 3.5, 5.0, 6.0, 7.0, 7.5, 8.0, 8.6
fvcore 0.1.5.post20220512
iopath 0.1.9
cv2 4.5.5
---------------------- -----------------------------------------------------------------------------------
PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 11.1
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
- CuDNN 8.0.5
- Magma 2.5.2
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
[09/05 15:29:58] detectron2 INFO: Command line arguments: Namespace(config_file='configs/yolox/yolox_s_person_ours.yaml', dist_url='tcp://127.0.0.1:50158', eval_only=True, machine_rank=0, num_gpus=1, num_machines=1, opts=['MODEL.WEIGHTS', 'output/yolox_s_person/model_0029999.pth'], resume=False)
[09/05 15:29:58] detectron2 INFO: Contents of args.config_file=configs/yolox/yolox_s_person_ours.yaml:
_BASE_: "Base-YOLO.yaml"
MODEL:
  PIXEL_MEAN: [0.485, 0.456, 0.406] # same value as PP-YOLOv2, RGB order
  PIXEL_STD: [0.229, 0.224, 0.225]
  WEIGHTS: ""
  MASK_ON: False
  META_ARCHITECTURE: "YOLOX"
  BACKBONE:
    NAME: "build_cspdarknetx_backbone"
  DARKNET:
    WEIGHTS: ""
    DEPTH_WISE: False
    OUT_FEATURES: ["dark3", "dark4", "dark5"]
  YOLO:
    CLASSES: 1
    IN_FEATURES: ["dark3", "dark4", "dark5"]
    CONF_THRESHOLD: 0.001
    NMS_THRESHOLD: 0.65
    IGNORE_THRESHOLD: 0.7
    WIDTH_MUL: 0.50
    DEPTH_MUL: 0.33
    # LOSS_TYPE: "v7"
    LOSS:
      LAMBDA_IOU: 1.5
DATASETS:
  DATASET_ROOT: 'datasets/person_od'
  ANN_ROOT: 'datasets/person_od/annotations'
  TRAIN_IMAGE_PATH: 'train/images'
  VAL_IMAGE_PATH: 'val_ours/images'
  TRAIN_JSON_NAME: 'instances_train_person_od.json'
  VAL_JSON_NAME: 'instances_val_ours.json'
  TRAIN: ("person_train",)
  TEST: ("person_val",)
INPUT:
  # FORMAT: "RGB" # using BGR default
  MIN_SIZE_TRAIN: (416, 512, 608, 768)
  MAX_SIZE_TRAIN: 800 # force max size train to 800?
  MIN_SIZE_TEST: 640
  MAX_SIZE_TEST: 800
  # open all augmentations
  JITTER_CROP:
    ENABLED: False
  RESIZE:
    ENABLED: True
    # SHAPE: (540, 960)
  DISTORTION:
    ENABLED: True
  COLOR_JITTER:
    BRIGHTNESS: True
    SATURATION: True
  # MOSAIC:
  #   ENABLED: True
  #   NUM_IMAGES: 4
  #   DEBUG_VIS: True
  #   # MOSAIC_WIDTH: 960
  #   # MOSAIC_HEIGHT: 540
  MOSAIC_AND_MIXUP:
    ENABLED: True
    # ENABLED: False
    DEBUG_VIS: False
    ENABLE_MIXUP: False
    DISABLE_AT_ITER: 120000
SOLVER:
  # enable fp16 training
  AMP:
    ENABLED: true
  IMS_PER_BATCH: 512
  BASE_LR: 0.001
  STEPS: (60000, 80000)
  WARMUP_FACTOR: 0.00033333
  WARMUP_ITERS: 1200
  MAX_ITER: 120000
  LR_SCHEDULER_NAME: "WarmupCosineLR"
  # REFERENCE_WORLD_SIZE: 0
TEST:
  EVAL_PERIOD: 1000
  # EVAL_PERIOD: 0
OUTPUT_DIR: "output/yolox_s_person_ours"
VIS_PERIOD: 1000
DATALOADER:
  # proposals are part of the dataset_dicts, and take a lot of RAM
  NUM_WORKERS: 1
  TEST_NUM_WORKERS: 1
[09/05 15:29:58] detectron2 INFO: Running with full config:
CUDNN_BENCHMARK: false
DATALOADER:
  ASPECT_RATIO_GROUPING: true
  FILTER_EMPTY_ANNOTATIONS: true
  NUM_WORKERS: 1
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
  TEST_NUM_WORKERS: 1
DATASETS:
  ANN_ROOT: datasets/person_od/annotations
  CLASS_NAMES: []
  DATASET_ROOT: datasets/person_od
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: []
  PROPOSAL_FILES_TRAIN: []
  TEST:
  - person_val
  TRAIN:
  - person_train
  TRAIN_IMAGE_PATH: train/images
  TRAIN_JSON_NAME: instances_train_person_od.json
  VAL_IMAGE_PATH: val_ours/images
  VAL_JSON_NAME: instances_val_ours.json
GLOBAL:
  HACK: 1.0
INPUT:
  COLOR_JITTER:
    BRIGHTNESS: true
    LIGHTING: false
    SATURATION: true
  CROP:
    ENABLED: false
    SIZE:
    - 0.9
    - 0.9
    TYPE: relative_range
  DISTORTION:
    ENABLED: true
    EXPOSURE: 1.5
    HUE: 0.1
    SATURATION: 1.5
  FORMAT: BGR
  GRID_MASK:
    ENABLED: false
    MODE: 1
    PROB: 0.3
    USE_HEIGHT: true
    USE_WIDTH: true
  INPUT_SIZE:
  - 640
  - 640
  JITTER_CROP:
    ENABLED: false
    JITTER_RATIO: 0.3
  MASK_FORMAT: polygon
  MAX_SIZE_TEST: 800
  MAX_SIZE_TRAIN: 800
  MIN_SIZE_TEST: 640
  MIN_SIZE_TRAIN:
  - 416
  - 512
  - 608
  - 768
  MIN_SIZE_TRAIN_SAMPLING: choice
  MOSAIC:
    DEBUG_VIS: false
    ENABLED: false
    MIN_OFFSET: 0.2
    MOSAIC_HEIGHT: 640
    MOSAIC_WIDTH: 640
    NUM_IMAGES: 4
    POOL_CAPACITY: 1000
  MOSAIC_AND_MIXUP:
    DEBUG_VIS: false
    DEGREES: 10.0
    DISABLE_AT_ITER: 120000
    ENABLED: true
    ENABLE_MIXUP: false
    MOSAIC_HEIGHT_RANGE: &id001
    - 512
    - 800
    MOSAIC_WIDTH_RANGE: *id001
    MSCALE:
    - 0.5
    - 1.5
    NUM_IMAGES: 4
    PERSPECTIVE: 0.0
    POOL_CAPACITY: 1000
    SCALE:
    - 0.5
    - 1.5
    SHEAR: 2.0
    TRANSLATE: 0.1
  RANDOM_FLIP: horizontal
  RESIZE:
    ENABLED: true
    SCALE_JITTER:
    - 0.8
    - 1.2
    SHAPE:
    - 640
    - 640
    TEST_SHAPE:
    - 608
    - 608
  SHIFT:
    SHIFT_PIXELS: 32
MODEL:
  ANCHOR_GENERATOR:
    ANGLES:
    - - -90
      - 0
      - 90
    ASPECT_RATIOS:
    - - 0.5
      - 1.0
      - 2.0
    NAME: DefaultAnchorGenerator
    OFFSET: 0.0
    SIZES:
    - - 32
      - 64
      - 128
      - 256
      - 512
  BACKBONE:
    CHANNEL: 0
    FREEZE_AT: 2
    NAME: build_cspdarknetx_backbone
    SIMPLE: false
    STRIDE: 1
  BIFPN:
    NORM: GN
    NUM_BIFPN: 6
    NUM_LEVELS: 5
    OUT_CHANNELS: 160
    SEPARABLE_CONV: false
  DARKNET:
    DEPTH: 53
    DEPTH_WISE: false
    NORM: BN
    OUT_FEATURES:
    - dark3
    - dark4
    - dark5
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 32
    WEIGHTS: ''
    WITH_CSP: true
  DETR:
    ATTENTION_TYPE: DETR
    BBOX_EMBED_NUM_LAYERS: 3
    CENTERED_POSITION_ENCODIND: false
    CLS_WEIGHT: 1.0
    DECODER_BLOCK_GRAD: true
    DEC_LAYERS: 6
    DEEP_SUPERVISION: true
    DEFORMABLE: false
    DIM_FEEDFORWARD: 2048
    DROPOUT: 0.1
    ENC_LAYERS: 6
    FROZEN_WEIGHTS: ''
    GIOU_WEIGHT: 2.0
    HIDDEN_DIM: 256
    L1_WEIGHT: 5.0
    NHEADS: 8
    NO_OBJECT_WEIGHT: 0.1
    NUM_CLASSES: 80
    NUM_FEATURE_LEVELS: 1
    NUM_OBJECT_QUERIES: 100
    NUM_QUERY_PATTERN: 3
    NUM_QUERY_POSITION: 300
    PRE_NORM: false
    SPATIAL_PRIOR: learned
    TWO_STAGE: false
    USE_FOCAL_LOSS: false
    WITH_BOX_REFINE: false
  DEVICE: cuda
  EFFICIENTNET:
    FEATURE_INDICES:
    - 1
    - 4
    - 10
    - 15
    NAME: efficientnet_b0
    OUT_FEATURES:
    - stride4
    - stride8
    - stride16
    - stride32
    PRETRAINED: true
  FBNET_V2:
    ARCH: default
    ARCH_DEF: []
    NORM: bn
    NORM_ARGS: []
    OUT_FEATURES:
    - trunk3
    SCALE_FACTOR: 1.0
    STEM_IN_CHANNELS: 3
    WIDTH_DIVISOR: 1
  FPN:
    FUSE_TYPE: sum
    IN_FEATURES: []
    NORM: ''
    OUT_CHANNELS: 256
    OUT_CHANNELS_LIST:
    - 256
    - 512
    - 1024
    REPEAT: 2
  KEYPOINT_ON: false
  LOAD_PROPOSALS: false
  MASK_ON: false
  META_ARCHITECTURE: YOLOX
  NMS_TYPE: normal
  ONNX_EXPORT: false
  PADDED_VALUE: 114.0
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: true
      INSTANCES_CONFIDENCE_THRESH: 0.5
      OVERLAP_THRESH: 0.5
      STUFF_AREA_LIMIT: 4096
    INSTANCE_LOSS_WEIGHT: 1.0
  PIXEL_MEAN:
  - 0.485
  - 0.456
  - 0.406
  PIXEL_STD:
  - 0.229
  - 0.224
  - 0.225
  PROPOSAL_GENERATOR:
    MIN_SIZE: 0
    NAME: RPN
  REGNETS:
    OUT_FEATURES:
    - s2
    - s3
    - s4
    TYPE: x
  RESNETS:
    DEFORM_MODULATED: false
    DEFORM_NUM_GROUPS: 1
    DEFORM_ON_PER_STAGE:
    - false
    - false
    - false
    - false
    DEPTH: 50
    NORM: FrozenBN
    NUM_GROUPS: 1
    OUT_FEATURES:
    - res4
    R2TYPE: res2net50_v1d
    RES2_OUT_CHANNELS: 256
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 64
    STRIDE_IN_1X1: true
    WIDTH_PER_GROUP: 64
  RETINANET:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_WEIGHTS: &id003
    - 1.0
    - 1.0
    - 1.0
    - 1.0
    FOCAL_LOSS_ALPHA: 0.25
    FOCAL_LOSS_GAMMA: 2.0
    IN_FEATURES:
    - p3
    - p4
    - p5
    - p6
    - p7
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.4
    - 0.5
    NMS_THRESH_TEST: 0.5
    NORM: ''
    NUM_CLASSES: 80
    NUM_CONVS: 4
    PRIOR_PROB: 0.01
    SCORE_THRESH_TEST: 0.05
    SMOOTH_L1_LOSS_BETA: 0.1
    TOPK_CANDIDATES_TEST: 1000
  ROI_BOX_CASCADE_HEAD:
    BBOX_REG_WEIGHTS:
    - &id002
      - 10.0
      - 10.0
      - 5.0
      - 5.0
    - - 20.0
      - 20.0
      - 10.0
      - 10.0
    - - 30.0
      - 30.0
      - 15.0
      - 15.0
    IOUS:
    - 0.5
    - 0.6
    - 0.7
  ROI_BOX_HEAD:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: *id002
    CLS_AGNOSTIC_BBOX_REG: false
    CONV_DIM: 256
    FC_DIM: 1024
    NAME: ''
    NORM: ''
    NUM_CONV: 0
    NUM_FC: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
    SMOOTH_L1_BETA: 0.0
    TRAIN_ON_PRED_BOXES: false
  ROI_HEADS:
    BATCH_SIZE_PER_IMAGE: 512
    IN_FEATURES:
    - res4
    IOU_LABELS:
    - 0
    - 1
    IOU_THRESHOLDS:
    - 0.5
    NAME: Res5ROIHeads
    NMS_THRESH_TEST: 0.5
    NUM_CLASSES: 80
    POSITIVE_FRACTION: 0.25
    PROPOSAL_APPEND_GT: true
    SCORE_THRESH_TEST: 0.05
  ROI_KEYPOINT_HEAD:
    CONV_DIMS:
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    LOSS_WEIGHT: 1.0
    MIN_KEYPOINTS_PER_IMAGE: 1
    NAME: KRCNNConvDeconvUpsampleHead
    NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
    NUM_KEYPOINTS: 17
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  ROI_MASK_HEAD:
    CLS_AGNOSTIC_MASK: false
    CONV_DIM: 256
    NAME: MaskRCNNConvUpsampleHead
    NORM: ''
    NUM_CONV: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  RPN:
    BATCH_SIZE_PER_IMAGE: 256
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: *id003
    BOUNDARY_THRESH: -1
    CONV_DIMS:
    - -1
    HEAD_NAME: StandardRPNHead
    IN_FEATURES:
    - res4
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.3
    - 0.7
    LOSS_WEIGHT: 1.0
    NMS_THRESH: 0.7
    POSITIVE_FRACTION: 0.5
    POST_NMS_TOPK_TEST: 1000
    POST_NMS_TOPK_TRAIN: 2000
    PRE_NMS_TOPK_TEST: 6000
    PRE_NMS_TOPK_TRAIN: 12000
    SMOOTH_L1_BETA: 0.0
  SEM_SEG_HEAD:
    COMMON_STRIDE: 4
    CONVS_DIM: 128
    IGNORE_VALUE: 255
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    LOSS_WEIGHT: 1.0
    NAME: SemSegFPNHead
    NORM: GN
    NUM_CLASSES: 54
  SOLOV2:
    FPN_INSTANCE_STRIDES:
    - 8
    - 8
    - 16
    - 32
    - 32
    FPN_SCALE_RANGES:
    - - 1
      - 96
    - - 48
      - 192
    - - 96
      - 384
    - - 192
      - 768
    - - 384
      - 2048
    INSTANCE_CHANNELS: 512
    INSTANCE_IN_CHANNELS: 256
    INSTANCE_IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    - p6
    LOSS:
      DICE_WEIGHT: 3.0
      FOCAL_ALPHA: 0.25
      FOCAL_GAMMA: 2.0
      FOCAL_USE_SIGMOID: true
      FOCAL_WEIGHT: 1.0
    MASK_CHANNELS: 128
    MASK_IN_CHANNELS: 256
    MASK_IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    MASK_THR: 0.5
    MAX_PER_IMG: 100
    NMS_KERNEL: gaussian
    NMS_PRE: 500
    NMS_SIGMA: 2
    NMS_TYPE: matrix
    NORM: GN
    NUM_CLASSES: 80
    NUM_GRIDS:
    - 40
    - 36
    - 24
    - 16
    - 12
    NUM_INSTANCE_CONVS: 4
    NUM_KERNELS: 256
    NUM_MASKS: 256
    PRIOR_PROB: 0.01
    SCORE_THR: 0.1
    SIGMA: 0.2
    TYPE_DCN: DCN
    UPDATE_THR: 0.05
    USE_COORD_CONV: true
    USE_DCN_IN_INSTANCE: false
  SWIN:
    DEPTHS:
    - 2
    - 2
    - 6
    - 2
    OUT_FEATURES:
    - 1
    - 2
    - 3
    PATCH: 4
    TYPE: tiny
    WEIGHTS: ''
    WINDOW: 7
  VT_FPN:
    HEADS: 16
    IN_FEATURES:
    - res2
    - res3
    - res4
    - res5
    LAYERS: 3
    MIN_GROUP_PLANES: 64
    NORM: BN
    OUT_CHANNELS: 256
    POS_HWS: []
    POS_N_DOWNSAMPLE: []
    TOKEN_C: 1024
    TOKEN_LS:
    - 16
    - 16
    - 8
    - 8
  WEIGHTS: output/yolox_s_person/model_0029999.pth
  YOLO:
    ANCHORS:
    - - - 116
        - 90
      - - 156
        - 198
      - - 373
        - 326
    - - - 30
        - 61
      - - 62
        - 45
      - - 42
        - 119
    - - - 10
        - 13
      - - 16
        - 30
      - - 33
        - 23
    ANCHOR_MASK: []
    BRANCH_DILATIONS:
    - 1
    - 2
    - 3
    CLASSES: 1
    CONF_THRESHOLD: 0.001
    DEPTH_MUL: 0.33
    IGNORE_THRESHOLD: 0.7
    IN_FEATURES:
    - dark3
    - dark4
    - dark5
    IOU_TYPE: ciou
    LOSS:
      ANCHOR_RATIO_THRESH: 4.0
      BUILD_TARGET_TYPE: default
      LAMBDA_CLS: 1.0
      LAMBDA_CONF: 1.0
      LAMBDA_IOU: 1.5
      LAMBDA_WH: 1.0
      LAMBDA_XY: 1.0
      USE_L1: true
    LOSS_TYPE: v4
    MAX_BOXES_NUM: 100
    NECK:
      TYPE: yolov3
      WITH_SPP: false
    NMS_THRESHOLD: 0.65
    NUM_BRANCH: 3
    ORIEN_HEAD:
      UP_CHANNELS: 64
    TEST_BRANCH_IDX: 1
    VARIANT: yolov3
    WIDTH_MUL: 0.5
OUTPUT_DIR: output/yolox_s_person_ours
SEED: -1
SOLVER:
  AMP:
    ENABLED: true
  AUTO_SCALING_METHODS:
  - default_scale_d2_configs
  - default_scale_quantization_configs
  BACKBONE_MULTIPLIER: 0.1
  BASE_LR: 0.001
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 5000
  CLIP_GRADIENTS:
    CLIP_TYPE: value
    CLIP_VALUE: 1.0
    ENABLED: false
    NORM_TYPE: 2.0
  GAMMA: 0.1
  IMS_PER_BATCH: 512
  LR_MULTIPLIER_OVERWRITE: []
  LR_SCHEDULER:
    GAMMA: 0.1
    MAX_EPOCH: 500
    MAX_ITER: 40000
    NAME: WarmupMultiStepLR
    STEPS:
    - 30000
    WARMUP_FACTOR: 0.001
    WARMUP_ITERS: 1000
    WARMUP_METHOD: linear
  LR_SCHEDULER_NAME: WarmupCosineLR
  MAX_ITER: 120000
  MOMENTUM: 0.9
  NESTEROV: false
  OPTIMIZER: ADAMW
  REFERENCE_WORLD_SIZE: 8
  STEPS:
  - 60000
  - 80000
  WARMUP_FACTOR: 0.00033333
  WARMUP_ITERS: 1200
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0001
  WEIGHT_DECAY_BIAS: null
  WEIGHT_DECAY_EMBED: 0.0
  WEIGHT_DECAY_NORM: 0.0
TEST:
  AUG:
    ENABLED: false
    FLIP: true
    MAX_SIZE: 4000
    MIN_SIZES:
    - 400
    - 500
    - 600
    - 700
    - 800
    - 900
    - 1000
    - 1100
    - 1200
  DETECTIONS_PER_IMAGE: 100
  EVAL_PERIOD: 1000
  EXPECTED_RESULTS: []
  KEYPOINT_OKS_SIGMAS: []
  PRECISE_BN:
    ENABLED: false
    NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 1000
[09/05 15:29:58] detectron2 INFO: Full config saved to output/yolox_s_person_ours/config.yaml
[09/05 15:29:59] d2.utils.env INFO: Using a generated random seed 59113710
[09/05 15:30:13] d2.engine.defaults INFO: Model:
YOLOX(
(backbone): CSPDarknet(
(stem): Focus(
(conv): BaseConv(
(conv): Conv2d(12, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(dark2): Sequential(
(0): BaseConv(
(conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(64, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(64, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
)
(dark3): Sequential(
(0): BaseConv(
(conv): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(2): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
)
(dark4): Sequential(
(0): BaseConv(
(conv): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(2): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
)
(dark5): Sequential(
(0): BaseConv(
(conv): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(512, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): SPPBottleneck(
(conv1): BaseConv(
(conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): ModuleList(
(0): MaxPool2d(kernel_size=5, stride=1, padding=2, dilation=1, ceil_mode=False)
(1): MaxPool2d(kernel_size=9, stride=1, padding=4, dilation=1, ceil_mode=False)
(2): MaxPool2d(kernel_size=13, stride=1, padding=6, dilation=1, ceil_mode=False)
)
(conv2): BaseConv(
(conv): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(512, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(2): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(512, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
)
)
(neck): YOLOPAFPN(
(upsample): Upsample(scale_factor=2.0, mode=nearest)
(lateral_conv0): BaseConv(
(conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(C3_p4): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
(reduce_conv1): BaseConv(
(conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(C3_p3): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
(bu_conv2): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(C3_n3): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
(bu_conv1): BaseConv(
(conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(C3_n4): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(512, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
)
(head): YOLOXHead(
(cls_convs): ModuleList(
(0): Sequential(
(0): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Sequential(
(0): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(2): Sequential(
(0): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
(reg_convs): ModuleList(
(0): Sequential(
(0): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Sequential(
(0): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(2): Sequential(
(0): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
(cls_preds): ModuleList(
(0): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1))
)
(reg_preds): ModuleList(
(0): Conv2d(128, 4, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(128, 4, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(128, 4, kernel_size=(1, 1), stride=(1, 1))
)
(obj_preds): ModuleList(
(0): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1))
)
(stems): ModuleList(
(0): BaseConv(
(conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(2): BaseConv(
(conv): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(l1_loss): L1Loss()
(bcewithlog_loss): BCEWithLogitsLoss()
(iou_loss): IOUloss()
)
)
[09/05 15:30:13] fvcore.common.checkpoint INFO: [Checkpointer] Loading from output/yolox_s_person/model_0029999.pth ...
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.0.weight' to the model due to incompatible shapes: (80, 128, 1, 1) in the checkpoint but (1, 128, 1, 1) in the model! You might want to double check if this is expected.
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.0.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (1,) in the model! You might want to double check if this is expected.
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.1.weight' to the model due to incompatible shapes: (80, 128, 1, 1) in the checkpoint but (1, 128, 1, 1) in the model! You might want to double check if this is expected.
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.1.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (1,) in the model! You might want to double check if this is expected.
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.2.weight' to the model due to incompatible shapes: (80, 128, 1, 1) in the checkpoint but (1, 128, 1, 1) in the model! You might want to double check if this is expected.
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.2.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (1,) in the model! You might want to double check if this is expected.
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Some model parameters or buffers are not found in the checkpoint:
head.cls_preds.0.{bias, weight}
head.cls_preds.1.{bias, weight}
head.cls_preds.2.{bias, weight}
[09/05 15:30:14] d2.data.datasets.coco INFO: Loaded 1000 images in COCO format from datasets/person_od/annotations/instances_val_ours.json
[09/05 15:30:14] d2.data.build INFO: Distribution of instances among all 1 categories:
| category | #instances |
|:----------:|:-------------|
| person | 3888 |
| | |
[09/05 15:30:14] d2.data.dataset_mapper INFO: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(640, 640), max_size=800, sample_style='choice')]
[09/05 15:30:14] d2.data.common INFO: Serializing 1000 elements to byte tensors and concatenating them all ...
[09/05 15:30:14] d2.data.common INFO: Serialized dataset takes 0.53 MiB
[09/05 15:30:14] d2.evaluation.evaluator INFO: Start inference on 1000 batches
[09/05 15:30:16] d2.evaluation.evaluator INFO: Inference done 11/1000. Dataloading: 0.0413 s/iter. Inference: 0.0404 s/iter. Eval: 0.0003 s/iter. Total: 0.0820 s/iter. ETA=0:01:21
[09/05 15:30:21] d2.evaluation.evaluator INFO: Inference done 57/1000. Dataloading: 0.0647 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1076 s/iter. ETA=0:01:41
[09/05 15:30:26] d2.evaluation.evaluator INFO: Inference done 96/1000. Dataloading: 0.0732 s/iter. Inference: 0.0434 s/iter. Eval: 0.0003 s/iter. Total: 0.1170 s/iter. ETA=0:01:45
[09/05 15:30:31] d2.evaluation.evaluator INFO: Inference done 148/1000. Dataloading: 0.0661 s/iter. Inference: 0.0430 s/iter. Eval: 0.0003 s/iter. Total: 0.1095 s/iter. ETA=0:01:33
[09/05 15:30:36] d2.evaluation.evaluator INFO: Inference done 196/1000. Dataloading: 0.0653 s/iter. Inference: 0.0430 s/iter. Eval: 0.0003 s/iter. Total: 0.1087 s/iter. ETA=0:01:27
[09/05 15:30:41] d2.evaluation.evaluator INFO: Inference done 247/1000. Dataloading: 0.0634 s/iter. Inference: 0.0428 s/iter. Eval: 0.0003 s/iter. Total: 0.1066 s/iter. ETA=0:01:20
[09/05 15:30:46] d2.evaluation.evaluator INFO: Inference done 298/1000. Dataloading: 0.0622 s/iter. Inference: 0.0427 s/iter. Eval: 0.0003 s/iter. Total: 0.1052 s/iter. ETA=0:01:13
[09/05 15:30:51] d2.evaluation.evaluator INFO: Inference done 342/1000. Dataloading: 0.0635 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1064 s/iter. ETA=0:01:10
[09/05 15:30:57] d2.evaluation.evaluator INFO: Inference done 387/1000. Dataloading: 0.0644 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1073 s/iter. ETA=0:01:05
[09/05 15:31:02] d2.evaluation.evaluator INFO: Inference done 437/1000. Dataloading: 0.0638 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1068 s/iter. ETA=0:01:00
[09/05 15:31:07] d2.evaluation.evaluator INFO: Inference done 480/1000. Dataloading: 0.0647 s/iter. Inference: 0.0427 s/iter. Eval: 0.0003 s/iter. Total: 0.1077 s/iter. ETA=0:00:56
[09/05 15:31:12] d2.evaluation.evaluator INFO: Inference done 530/1000. Dataloading: 0.0640 s/iter. Inference: 0.0427 s/iter. Eval: 0.0003 s/iter. Total: 0.1070 s/iter. ETA=0:00:50
[09/05 15:31:17] d2.evaluation.evaluator INFO: Inference done 582/1000. Dataloading: 0.0632 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1062 s/iter. ETA=0:00:44
[09/05 15:31:22] d2.evaluation.evaluator INFO: Inference done 629/1000. Dataloading: 0.0636 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1065 s/iter. ETA=0:00:39
[09/05 15:31:27] d2.evaluation.evaluator INFO: Inference done 679/1000. Dataloading: 0.0631 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1061 s/iter. ETA=0:00:34
[09/05 15:31:32] d2.evaluation.evaluator INFO: Inference done 726/1000. Dataloading: 0.0632 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1062 s/iter. ETA=0:00:29
[09/05 15:31:37] d2.evaluation.evaluator INFO: Inference done 770/1000. Dataloading: 0.0637 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1066 s/iter. ETA=0:00:24
[09/05 15:31:42] d2.evaluation.evaluator INFO: Inference done 819/1000. Dataloading: 0.0634 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1064 s/iter. ETA=0:00:19
[09/05 15:31:47] d2.evaluation.evaluator INFO: Inference done 869/1000. Dataloading: 0.0631 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1061 s/iter. ETA=0:00:13
[09/05 15:31:52] d2.evaluation.evaluator INFO: Inference done 916/1000. Dataloading: 0.0632 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1061 s/iter. ETA=0:00:08
[09/05 15:31:57] d2.evaluation.evaluator INFO: Inference done 970/1000. Dataloading: 0.0624 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1054 s/iter. ETA=0:00:03
[09/05 15:32:00] d2.evaluation.evaluator INFO: Total inference time: 0:01:44.354030 (0.104878 s / iter per device, on 1 devices)
[09/05 15:32:00] d2.evaluation.evaluator INFO: Total inference pure compute time: 0:00:42 (0.042620 s / iter per device, on 1 devices)
[09/05 15:32:00] d2.evaluation.coco_evaluation INFO: Preparing results for COCO format ...
[09/05 15:32:00] d2.evaluation.coco_evaluation INFO: Saving results to output/yolox_s_person_ours/inference/coco_instances_results.json
[09/05 15:32:00] d2.evaluation.coco_evaluation INFO: Evaluating predictions with unofficial COCO API...
[09/05 15:32:00] d2.evaluation.fast_eval_api INFO: Evaluate annotation type *bbox*
[09/05 15:32:00] d2.evaluation.fast_eval_api INFO: COCOeval_opt.evaluate() finished in 0.16 seconds.
[09/05 15:32:00] d2.evaluation.fast_eval_api INFO: Accumulating evaluation results...
[09/05 15:32:00] d2.evaluation.fast_eval_api INFO: COCOeval_opt.accumulate() finished in 0.03 seconds.
[09/05 15:32:00] d2.evaluation.coco_evaluation INFO: Evaluation results for bbox:
| AP | AP50 | AP75 | APs | APm | APl |
|:-----:|:------:|:------:|:-----:|:-----:|:-----:|
| 0.602 | 1.953 | 0.197 | 0.000 | 0.111 | 2.269 |
[09/05 15:32:00] d2.engine.defaults INFO: Evaluation results for person_val in csv format:
[09/05 15:32:00] d2.evaluation.testing INFO: copypaste: Task: bbox
[09/05 15:32:00] d2.evaluation.testing INFO: copypaste: AP,AP50,AP75,APs,APm,APl
[09/05 15:32:00] d2.evaluation.testing INFO: copypaste: 0.6020,1.9528,0.1967,0.0001,0.1105,2.2693
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Some model parameters or buffers are not found in the checkpoint: head.cls_preds.0.{bias, weight} head.cls_preds.1.{bias, weight} head.cls_preds.2.{bias, weight}
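As the log says, these model parameters were not loaded: the cls_preds weights in the checkpoint have 80 output channels, while the model is built for 1 class, so the checkpointer skipped them. They are therefore randomly initialized, and the result changes depending on the random seed.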
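A quick way to catch this before evaluating is to compare parameter shapes between the checkpoint and the model. The following is only a minimal sketch: `find_shape_mismatches` is a hypothetical helper, not part of detectron2 or fvcore, and the shape dictionaries are filled in by hand from the warnings in the log above (the `reg_preds` entry is an assumed matching parameter, included for contrast).

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Hypothetical helper: list parameters present in both dicts whose shapes differ.

    Returns (name, checkpoint_shape, model_shape) tuples sorted by name.
    """
    return [
        (name, checkpoint_shapes[name], shape)
        for name, shape in sorted(model_shapes.items())
        if name in checkpoint_shapes and checkpoint_shapes[name] != shape
    ]


# Shapes copied by hand from the WARNING lines in the log (illustrative subset).
checkpoint_shapes = {
    "head.cls_preds.0.weight": (80, 128, 1, 1),
    "head.cls_preds.0.bias": (80,),
    "head.reg_preds.0.weight": (4, 128, 1, 1),  # assumed to match the model
}
model_shapes = {
    "head.cls_preds.0.weight": (1, 128, 1, 1),
    "head.cls_preds.0.bias": (1,),
    "head.reg_preds.0.weight": (4, 128, 1, 1),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for name, ckpt_shape, model_shape in mismatches:
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

With a real model, the shape dictionaries could be built with something like `{k: tuple(v.shape) for k, v in model.state_dict().items()}`; how the checkpoint file is laid out is an assumption you would need to check for your own weights. Any parameter reported here is skipped at load time and stays at its random initialization.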
Q: I trained a model on a dataset in COCO format that I made myself, and now have two problems: 1) The score is high on the validation set but low on other datasets of the same type. This problem does not occur with yolov5 or yolox. 2) On the same validation set, different random seeds produce different test scores.
Instructions To Reproduce the Issue:
Full runnable code or full changes you made:
What exact command you run:
Full logs or other relevant observations: first logs:
[09/05 15:25:46] detectron2 INFO: Command line arguments: Namespace(config_file='configs/yolox/yolox_s_person_ours.yaml', dist_url='tcp://127.0.0.1:50158', eval_only=True, machine_rank=0, num_gpus=1, num_machines=1, opts=['MODEL.WEIGHTS', 'output/yolox_s_person/model_0029999.pth'], resume=False) [09/05 15:25:46] detectron2 INFO: Contents of args.config_file=configs/yolox/yolox_s_person_ours.yaml: BASE: "Base-YOLO.yaml" MODEL: PIXEL_MEAN: [0.485, 0.456, 0.406] # same value as PP-YOLOv2, RGB order PIXEL_STD: [0.229, 0.224, 0.225]
WEIGHTS: "" MASK_ON: False META_ARCHITECTURE: "YOLOX" BACKBONE: NAME: "build_cspdarknetx_backbone"
DARKNET: WEIGHTS: "" DEPTH_WISE: False OUT_FEATURES: ["dark3", "dark4", "dark5"]
YOLO: CLASSES: 1 IN_FEATURES: ["dark3", "dark4", "dark5"] CONF_THRESHOLD: 0.001 NMS_THRESHOLD: 0.65 IGNORE_THRESHOLD: 0.7 WIDTH_MUL: 0.50 DEPTH_MUL: 0.33
LOSS_TYPE: "v7"
DATASETS: DATASET_ROOT: 'datasets/person_od' ANN_ROOT: 'datasets/person_od/annotations' TRAIN_IMAGE_PATH: 'train/images' VAL_IMAGE_PATH: 'val_ours/images' TRAIN_JSON_NAME: 'instances_train_person_od.json' VAL_JSON_NAME: 'instances_val_ours.json' TRAIN: ("person_train",) TEST: ("person_val",)
INPUT:
FORMAT: "RGB" # using BGR default
MIN_SIZE_TRAIN: (416, 512, 608, 768) MAX_SIZE_TRAIN: 800 # force max size train to 800? MIN_SIZE_TEST: 640 MAX_SIZE_TEST: 800
open all augmentations
JITTER_CROP: ENABLED: False RESIZE: ENABLED: True
SHAPE: (540, 960)
DISTORTION: ENABLED: True COLOR_JITTER: BRIGHTNESS: True SATURATION: True
MOSAIC:
ENABLED: True
NUM_IMAGES: 4
DEBUG_VIS: True
MOSAIC_WIDTH: 960
MOSAIC_HEIGHT: 540
MOSAIC_AND_MIXUP: ENABLED: True
ENABLED: False
SOLVER:
enable fp16 training
AMP: ENABLED: true IMS_PER_BATCH: 512 BASE_LR: 0.001 STEPS: (60000, 80000) WARMUP_FACTOR: 0.00033333 WARMUP_ITERS: 1200 MAX_ITER: 120000 LR_SCHEDULER_NAME: "WarmupCosineLR"
REFERENCE_WORLD_SIZE: 0
TEST: EVAL_PERIOD: 1000
EVAL_PERIOD: 0
OUTPUT_DIR: "output/yolox_s_person_ours" VIS_PERIOD: 1000
DATALOADER:
proposals are part of the dataset_dicts, and take a lot of RAM
NUM_WORKERS: 1 TEST_NUM_WORKERS: 1 [09/05 15:25:46] detectron2 INFO: Running with full config: CUDNN_BENCHMARK: false DATALOADER: ASPECT_RATIO_GROUPING: true FILTER_EMPTY_ANNOTATIONS: true NUM_WORKERS: 1 REPEAT_THRESHOLD: 0.0 SAMPLER_TRAIN: TrainingSampler TEST_NUM_WORKERS: 1 DATASETS: ANN_ROOT: datasets/person_od/annotations CLASS_NAMES: [] DATASET_ROOT: datasets/person_od PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 PROPOSAL_FILES_TEST: [] PROPOSAL_FILES_TRAIN: [] TEST:
[09/05 15:27:59] d2.engine.defaults INFO: Evaluation results for person_val in csv format:
[09/05 15:27:59] d2.evaluation.testing INFO: copypaste: Task: bbox
[09/05 15:27:59] d2.evaluation.testing INFO: copypaste: AP,AP50,AP75,APs,APm,APl
[09/05 15:27:59] d2.evaluation.testing INFO: copypaste: 1.9090,4.6076,1.3937,0.0000,0.5751,3.4674
Environment:
Paste the output of the following command:
Both runs used the same virtual environment.