facebookresearch / detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
https://detectron2.readthedocs.io/en/latest/
Apache License 2.0

About different random seeds will lead to different test scores #4524

Closed mary-0830 closed 1 year ago

mary-0830 commented 2 years ago

Q: I trained a model on a dataset in COCO format that I made myself, and I now have two problems: 1) The score is high on the validation set but low on other datasets of the same type. I do not see this problem with yolov5 or yolox. 2) On the same validation set, different random seeds produce different test scores.

Instructions To Reproduce the Issue:

  1. Full runnable code or full changes you made:

  2. What exact command you run:

    python trainers/train_yolox.py --config-file configs/yolox/yolox_s_person_ours.yaml --eval-only MODEL.WEIGHTS output/yolox_s_person/model_0029999.pth
  3. Full logs or other relevant observations:

First logs:

    
    [09/05 09:42:22] d2.evaluation.testing INFO: copypaste: 2.9096,7.2661,1.8047,0.0000,1.1290,4.3732
    [09/05 15:25:42] detectron2 INFO: Rank of current process: 0. World size: 1
    [09/05 15:25:46] detectron2 INFO: Environment info:
    ----------------------  -----------------------------------------------------------------------------------
    sys.platform            linux
    Python                  3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0]
    numpy                   1.22.4
    detectron2              0.6 @/home/ljj/anaconda3/envs/tuner/lib/python3.8/site-packages/detectron2
    Compiler                GCC 7.3
    CUDA compiler           CUDA 11.1
    detectron2 arch flags   3.7, 5.0, 5.2, 6.0, 6.1, 7.0, 7.5, 8.0, 8.6
    DETECTRON2_ENV_MODULE   <not set>
    PyTorch                 1.8.0+cu111 @/home/ljj/anaconda3/envs/tuner/lib/python3.8/site-packages/torch
    PyTorch debug build     False
    GPU available           Yes
    GPU 0                   NVIDIA A100-PCIE-80GB (arch=8.0)
    Driver version          470.42.01
    CUDA_HOME               /usr/local/cuda-11.4
    Pillow                  8.2.0
    torchvision             0.9.0+cu111 @/home/ljj/anaconda3/envs/tuner/lib/python3.8/site-packages/torchvision
    torchvision arch flags  3.5, 5.0, 6.0, 7.0, 7.5, 8.0, 8.6
    fvcore                  0.1.5.post20220512
    iopath                  0.1.9
    cv2                     4.5.5
    ----------------------  -----------------------------------------------------------------------------------
    PyTorch built with:
    - GCC 7.3
    - C++ Version: 201402
    - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
    - Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
    - OpenMP 201511 (a.k.a. OpenMP 4.5)
    - NNPACK is enabled
    - CPU capability usage: AVX2
    - CUDA Runtime 11.1
    - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
    - CuDNN 8.0.5
    - Magma 2.5.2
    - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

[09/05 15:25:46] detectron2 INFO: Command line arguments: Namespace(config_file='configs/yolox/yolox_s_person_ours.yaml', dist_url='tcp://127.0.0.1:50158', eval_only=True, machine_rank=0, num_gpus=1, num_machines=1, opts=['MODEL.WEIGHTS', 'output/yolox_s_person/model_0029999.pth'], resume=False)
[09/05 15:25:46] detectron2 INFO: Contents of args.config_file=configs/yolox/yolox_s_person_ours.yaml:
_BASE_: "Base-YOLO.yaml"
MODEL:
  PIXEL_MEAN: [0.485, 0.456, 0.406] # same value as PP-YOLOv2, RGB order
  PIXEL_STD: [0.229, 0.224, 0.225]

  WEIGHTS: ""
  MASK_ON: False
  META_ARCHITECTURE: "YOLOX"
  BACKBONE:
    NAME: "build_cspdarknetx_backbone"

  DARKNET:
    WEIGHTS: ""
    DEPTH_WISE: False
    OUT_FEATURES: ["dark3", "dark4", "dark5"]

  YOLO:
    CLASSES: 1
    IN_FEATURES: ["dark3", "dark4", "dark5"]
    CONF_THRESHOLD: 0.001
    NMS_THRESHOLD: 0.65
    IGNORE_THRESHOLD: 0.7
    WIDTH_MUL: 0.50
    DEPTH_MUL: 0.33
    # LOSS_TYPE: "v7"
    LOSS:
      LAMBDA_IOU: 1.5

DATASETS:
  DATASET_ROOT: 'datasets/person_od'
  ANN_ROOT: 'datasets/person_od/annotations'
  TRAIN_IMAGE_PATH: 'train/images'
  VAL_IMAGE_PATH: 'val_ours/images'
  TRAIN_JSON_NAME: 'instances_train_person_od.json'
  VAL_JSON_NAME: 'instances_val_ours.json'
  TRAIN: ("person_train",)
  TEST: ("person_val",)

INPUT:
  # FORMAT: "RGB" # using BGR default
  MIN_SIZE_TRAIN: (416, 512, 608, 768)
  MAX_SIZE_TRAIN: 800 # force max size train to 800?
  MIN_SIZE_TEST: 640
  MAX_SIZE_TEST: 800
  # open all augmentations
  JITTER_CROP:
    ENABLED: False
  RESIZE:
    ENABLED: True
    # SHAPE: (540, 960)
  DISTORTION:
    ENABLED: True
  COLOR_JITTER:
    BRIGHTNESS: True
    SATURATION: True
  # MOSAIC:
  #   ENABLED: True
  #   NUM_IMAGES: 4
  #   DEBUG_VIS: True
  #   MOSAIC_WIDTH: 960
  #   MOSAIC_HEIGHT: 540
  MOSAIC_AND_MIXUP:
    ENABLED: True
    # ENABLED: False
    DEBUG_VIS: False
    ENABLE_MIXUP: False
    DISABLE_AT_ITER: 120000

SOLVER:
  # enable fp16 training
  AMP:
    ENABLED: true
  IMS_PER_BATCH: 512
  BASE_LR: 0.001
  STEPS: (60000, 80000)
  WARMUP_FACTOR: 0.00033333
  WARMUP_ITERS: 1200
  MAX_ITER: 120000
  LR_SCHEDULER_NAME: "WarmupCosineLR"
  # REFERENCE_WORLD_SIZE: 0

TEST:
  EVAL_PERIOD: 1000
  # EVAL_PERIOD: 0

OUTPUT_DIR: "output/yolox_s_person_ours"
VIS_PERIOD: 1000

DATALOADER:
  # proposals are part of the dataset_dicts, and take a lot of RAM
  NUM_WORKERS: 1
  TEST_NUM_WORKERS: 1

[09/05 15:25:46] detectron2 INFO: Running with full config:
CUDNN_BENCHMARK: false
DATALOADER:
  ASPECT_RATIO_GROUPING: true
  FILTER_EMPTY_ANNOTATIONS: true
  NUM_WORKERS: 1
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
  TEST_NUM_WORKERS: 1
DATASETS:
  ANN_ROOT: datasets/person_od/annotations
  CLASS_NAMES: []
  DATASET_ROOT: datasets/person_od
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: []
  PROPOSAL_FILES_TRAIN: []
  TEST:

[09/05 15:25:46] detectron2 INFO: Full config saved to output/yolox_s_person_ours/config.yaml [09/05 15:25:46] d2.utils.env INFO: Using a generated random seed 47044674 [09/05 15:26:10] d2.engine.defaults INFO: Model: YOLOX( (backbone): CSPDarknet( (stem): Focus( (conv): BaseConv( (conv): Conv2d(12, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) (dark2): Sequential( (0): BaseConv( (conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (1): CSPLayer( (conv1): BaseConv( (conv): Conv2d(64, 32, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(64, 32, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv3): BaseConv( (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (m): Sequential( (0): Bottleneck( (conv1): BaseConv( (conv): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) ) ) ) (dark3): Sequential( (0): BaseConv( (conv): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, 
track_running_stats=True) (act): SiLU(inplace=True) ) (1): CSPLayer( (conv1): BaseConv( (conv): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv3): BaseConv( (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (m): Sequential( (0): Bottleneck( (conv1): BaseConv( (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) (1): Bottleneck( (conv1): BaseConv( (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) (2): Bottleneck( (conv1): BaseConv( (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, 
track_running_stats=True) (act): SiLU(inplace=True) ) ) ) ) ) (dark4): Sequential( (0): BaseConv( (conv): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (1): CSPLayer( (conv1): BaseConv( (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv3): BaseConv( (conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (m): Sequential( (0): Bottleneck( (conv1): BaseConv( (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) (1): Bottleneck( (conv1): BaseConv( (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) (2): Bottleneck( (conv1): BaseConv( (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, 
momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) ) ) ) (dark5): Sequential( (0): BaseConv( (conv): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn): BatchNorm2d(512, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (1): SPPBottleneck( (conv1): BaseConv( (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (m): ModuleList( (0): MaxPool2d(kernel_size=5, stride=1, padding=2, dilation=1, ceil_mode=False) (1): MaxPool2d(kernel_size=9, stride=1, padding=4, dilation=1, ceil_mode=False) (2): MaxPool2d(kernel_size=13, stride=1, padding=6, dilation=1, ceil_mode=False) ) (conv2): BaseConv( (conv): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(512, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) (2): CSPLayer( (conv1): BaseConv( (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv3): BaseConv( (conv): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(512, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (m): Sequential( (0): Bottleneck( (conv1): BaseConv( (conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), 
bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) ) ) ) ) (neck): YOLOPAFPN( (upsample): Upsample(scale_factor=2.0, mode=nearest) (lateral_conv0): BaseConv( (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (C3_p4): CSPLayer( (conv1): BaseConv( (conv): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv3): BaseConv( (conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (m): Sequential( (0): Bottleneck( (conv1): BaseConv( (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) ) ) (reduce_conv1): BaseConv( (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (C3_p3): CSPLayer( (conv1): 
BaseConv( (conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv3): BaseConv( (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (m): Sequential( (0): Bottleneck( (conv1): BaseConv( (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) ) ) (bu_conv2): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (C3_n3): CSPLayer( (conv1): BaseConv( (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv3): BaseConv( (conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (m): Sequential( (0): Bottleneck( (conv1): BaseConv( (conv): 
Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) ) ) (bu_conv1): BaseConv( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (C3_n4): CSPLayer( (conv1): BaseConv( (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv3): BaseConv( (conv): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(512, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (m): Sequential( (0): Bottleneck( (conv1): BaseConv( (conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (conv2): BaseConv( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) ) ) ) (head): YOLOXHead( (cls_convs): ModuleList( (0): Sequential( (0): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) 
(act): SiLU(inplace=True) ) (1): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) (1): Sequential( (0): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (1): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) (2): Sequential( (0): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (1): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) ) (reg_convs): ModuleList( (0): Sequential( (0): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (1): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) (1): Sequential( (0): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (1): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): 
BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) (2): Sequential( (0): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (1): BaseConv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) ) (cls_preds): ModuleList( (0): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1)) (1): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1)) (2): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1)) ) (reg_preds): ModuleList( (0): Conv2d(128, 4, kernel_size=(1, 1), stride=(1, 1)) (1): Conv2d(128, 4, kernel_size=(1, 1), stride=(1, 1)) (2): Conv2d(128, 4, kernel_size=(1, 1), stride=(1, 1)) ) (obj_preds): ModuleList( (0): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1)) (1): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1)) (2): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1)) ) (stems): ModuleList( (0): BaseConv( (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (1): BaseConv( (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) (2): BaseConv( (conv): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) (act): SiLU(inplace=True) ) ) (l1_loss): L1Loss() (bcewithlog_loss): BCEWithLogitsLoss() (iou_loss): IOUloss() ) ) [09/05 15:26:10] fvcore.common.checkpoint INFO: [Checkpointer] Loading from output/yolox_s_person/model_0029999.pth ... 
[09/05 15:26:12] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.0.weight' to the model due to incompatible shapes: (80, 128, 1, 1) in the checkpoint but (1, 128, 1, 1) in the model! You might want to double check if this is expected.
[09/05 15:26:12] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.0.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (1,) in the model! You might want to double check if this is expected.
[09/05 15:26:12] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.1.weight' to the model due to incompatible shapes: (80, 128, 1, 1) in the checkpoint but (1, 128, 1, 1) in the model! You might want to double check if this is expected.
[09/05 15:26:12] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.1.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (1,) in the model! You might want to double check if this is expected.
[09/05 15:26:12] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.2.weight' to the model due to incompatible shapes: (80, 128, 1, 1) in the checkpoint but (1, 128, 1, 1) in the model! You might want to double check if this is expected.
[09/05 15:26:12] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.2.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (1,) in the model! You might want to double check if this is expected.
[09/05 15:26:12] fvcore.common.checkpoint WARNING: Some model parameters or buffers are not found in the checkpoint:
head.cls_preds.0.{bias, weight}
head.cls_preds.1.{bias, weight}
head.cls_preds.2.{bias, weight}
[09/05 15:26:12] d2.data.datasets.coco INFO: Loaded 1000 images in COCO format from datasets/person_od/annotations/instances_val_ours.json
[09/05 15:26:12] d2.data.build INFO: Distribution of instances among all 1 categories:
category  #instances
person    3888

[09/05 15:26:12] d2.data.dataset_mapper INFO: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(640, 640), max_size=800, sample_style='choice')]
[09/05 15:26:12] d2.data.common INFO: Serializing 1000 elements to byte tensors and concatenating them all ...
[09/05 15:26:12] d2.data.common INFO: Serialized dataset takes 0.53 MiB
[09/05 15:26:12] d2.evaluation.evaluator INFO: Start inference on 1000 batches
[09/05 15:26:14] d2.evaluation.evaluator INFO: Inference done 11/1000. Dataloading: 0.0483 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.0913 s/iter. ETA=0:01:30
[09/05 15:26:20] d2.evaluation.evaluator INFO: Inference done 57/1000. Dataloading: 0.0646 s/iter. Inference: 0.0432 s/iter. Eval: 0.0003 s/iter. Total: 0.1082 s/iter. ETA=0:01:41
[09/05 15:26:25] d2.evaluation.evaluator INFO: Inference done 95/1000. Dataloading: 0.0759 s/iter. Inference: 0.0431 s/iter. Eval: 0.0003 s/iter. Total: 0.1194 s/iter. ETA=0:01:48
[09/05 15:26:30] d2.evaluation.evaluator INFO: Inference done 143/1000. Dataloading: 0.0710 s/iter. Inference: 0.0430 s/iter. Eval: 0.0003 s/iter. Total: 0.1144 s/iter. ETA=0:01:38
[09/05 15:26:35] d2.evaluation.evaluator INFO: Inference done 189/1000. Dataloading: 0.0701 s/iter. Inference: 0.0430 s/iter. Eval: 0.0003 s/iter. Total: 0.1135 s/iter. ETA=0:01:32
[09/05 15:26:40] d2.evaluation.evaluator INFO: Inference done 240/1000. Dataloading: 0.0669 s/iter. Inference: 0.0430 s/iter. Eval: 0.0003 s/iter. Total: 0.1103 s/iter. ETA=0:01:23
[09/05 15:26:45] d2.evaluation.evaluator INFO: Inference done 292/1000. Dataloading: 0.0646 s/iter. Inference: 0.0428 s/iter. Eval: 0.0003 s/iter. Total: 0.1078 s/iter. ETA=0:01:16
[09/05 15:26:50] d2.evaluation.evaluator INFO: Inference done 338/1000. Dataloading: 0.0653 s/iter. Inference: 0.0427 s/iter. Eval: 0.0003 s/iter. Total: 0.1083 s/iter. ETA=0:01:11
[09/05 15:26:55] d2.evaluation.evaluator INFO: Inference done 384/1000. Dataloading: 0.0654 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1083 s/iter. ETA=0:01:06
[09/05 15:27:00] d2.evaluation.evaluator INFO: Inference done 432/1000. Dataloading: 0.0653 s/iter. Inference: 0.0425 s/iter. Eval: 0.0003 s/iter. Total: 0.1081 s/iter. ETA=0:01:01
[09/05 15:27:05] d2.evaluation.evaluator INFO: Inference done 478/1000. Dataloading: 0.0654 s/iter. Inference: 0.0425 s/iter. Eval: 0.0003 s/iter. Total: 0.1083 s/iter. ETA=0:00:56
[09/05 15:27:10] d2.evaluation.evaluator INFO: Inference done 523/1000. Dataloading: 0.0656 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1086 s/iter. ETA=0:00:51
[09/05 15:27:15] d2.evaluation.evaluator INFO: Inference done 574/1000. Dataloading: 0.0648 s/iter. Inference: 0.0425 s/iter. Eval: 0.0003 s/iter. Total: 0.1076 s/iter. ETA=0:00:45
[09/05 15:27:20] d2.evaluation.evaluator INFO: Inference done 623/1000. Dataloading: 0.0644 s/iter. Inference: 0.0425 s/iter. Eval: 0.0003 s/iter. Total: 0.1073 s/iter. ETA=0:00:40
[09/05 15:27:25] d2.evaluation.evaluator INFO: Inference done 673/1000. Dataloading: 0.0639 s/iter. Inference: 0.0425 s/iter. Eval: 0.0003 s/iter. Total: 0.1068 s/iter. ETA=0:00:34
[09/05 15:27:30] d2.evaluation.evaluator INFO: Inference done 718/1000. Dataloading: 0.0640 s/iter. Inference: 0.0428 s/iter. Eval: 0.0003 s/iter. Total: 0.1071 s/iter. ETA=0:00:30
[09/05 15:27:35] d2.evaluation.evaluator INFO: Inference done 765/1000. Dataloading: 0.0641 s/iter. Inference: 0.0427 s/iter. Eval: 0.0003 s/iter. Total: 0.1071 s/iter. ETA=0:00:25
[09/05 15:27:40] d2.evaluation.evaluator INFO: Inference done 817/1000. Dataloading: 0.0635 s/iter. Inference: 0.0427 s/iter. Eval: 0.0003 s/iter. Total: 0.1065 s/iter. ETA=0:00:19
[09/05 15:27:45] d2.evaluation.evaluator INFO: Inference done 867/1000. Dataloading: 0.0631 s/iter. Inference: 0.0427 s/iter. Eval: 0.0003 s/iter. Total: 0.1061 s/iter. ETA=0:00:14
[09/05 15:27:51] d2.evaluation.evaluator INFO: Inference done 916/1000. Dataloading: 0.0630 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1060 s/iter. ETA=0:00:08
[09/05 15:27:56] d2.evaluation.evaluator INFO: Inference done 973/1000. Dataloading: 0.0620 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1050 s/iter. ETA=0:00:02
[09/05 15:27:58] d2.evaluation.evaluator INFO: Total inference time: 0:01:43.996310 (0.104519 s / iter per device, on 1 devices)
[09/05 15:27:58] d2.evaluation.evaluator INFO: Total inference pure compute time: 0:00:42 (0.042734 s / iter per device, on 1 devices)
[09/05 15:27:58] d2.evaluation.coco_evaluation INFO: Preparing results for COCO format ...
[09/05 15:27:58] d2.evaluation.coco_evaluation INFO: Saving results to output/yolox_s_person_ours/inference/coco_instances_results.json
[09/05 15:27:58] d2.evaluation.coco_evaluation INFO: Evaluating predictions with unofficial COCO API...
[09/05 15:27:58] d2.evaluation.fast_eval_api INFO: Evaluate annotation type bbox
[09/05 15:27:59] d2.evaluation.fast_eval_api INFO: COCOeval_opt.evaluate() finished in 0.21 seconds.
[09/05 15:27:59] d2.evaluation.fast_eval_api INFO: Accumulating evaluation results...
[09/05 15:27:59] d2.evaluation.fast_eval_api INFO: COCOeval_opt.accumulate() finished in 0.04 seconds.
[09/05 15:27:59] d2.evaluation.coco_evaluation INFO: Evaluation results for bbox:
AP     AP50   AP75   APs    APm    APl
1.909  4.608  1.394  0.000  0.575  3.467

[09/05 15:27:59] d2.engine.defaults INFO: Evaluation results for person_val in csv format: [09/05 15:27:59] d2.evaluation.testing INFO: copypaste: Task: bbox [09/05 15:27:59] d2.evaluation.testing INFO: copypaste: AP,AP50,AP75,APs,APm,APl [09/05 15:27:59] d2.evaluation.testing INFO: copypaste: 1.9090,4.6076,1.3937,0.0000,0.5751,3.4674
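
To quantify how far two evaluation runs drift apart, the `copypaste:` lines can be parsed and compared metric by metric. The following is a minimal sketch, assuming the log layout shown above (a `Task:` line, a header line, then a values line); the helper names `parse_copypaste` and `metric_deltas` are hypothetical and not part of detectron2. The sample values are the two results reported in this issue.

```python
def parse_copypaste(log_text):
    """Return {metric: value} from the 'copypaste:' lines of one evaluation log."""
    names, values = None, None
    for line in log_text.splitlines():
        _, sep, payload = line.partition("copypaste: ")
        if not sep or payload.startswith("Task:"):
            continue  # not a copypaste line, or the task marker
        fields = payload.strip().split(",")
        try:
            values = [float(f) for f in fields]  # the numeric row
        except ValueError:
            names = fields  # the header row (AP, AP50, ...)
    if names is None or values is None:
        raise ValueError("no complete copypaste block found")
    return dict(zip(names, values))


def metric_deltas(run_a, run_b):
    """Per-metric difference run_b - run_a, rounded to 4 decimals."""
    return {k: round(run_b[k] - run_a[k], 4) for k in run_a}


HEADER = "d2.evaluation.testing INFO: copypaste: AP,AP50,AP75,APs,APm,APl\n"
earlier = parse_copypaste(
    HEADER + "d2.evaluation.testing INFO: copypaste: 2.9096,7.2661,1.8047,0.0000,1.1290,4.3732"
)
later = parse_copypaste(
    HEADER + "d2.evaluation.testing INFO: copypaste: 1.9090,4.6076,1.3937,0.0000,0.5751,3.4674"
)
deltas = metric_deltas(earlier, later)
print(deltas)
```

Running the same comparison over several evaluations of one checkpoint gives a rough picture of how much of the score change comes from run-to-run variation alone.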

Environment:

Paste the output of the following command:

Both runs used the same virtual environment.

mary-0830 commented 2 years ago

second logs:

[09/05 15:29:54] detectron2 INFO: Rank of current process: 0. World size: 1
[09/05 15:29:58] detectron2 INFO: Environment info:
----------------------  -----------------------------------------------------------------------------------
sys.platform            linux
Python                  3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0]
numpy                   1.22.4
detectron2              0.6 @/home/ljj/anaconda3/envs/tuner/lib/python3.8/site-packages/detectron2
Compiler                GCC 7.3
CUDA compiler           CUDA 11.1
detectron2 arch flags   3.7, 5.0, 5.2, 6.0, 6.1, 7.0, 7.5, 8.0, 8.6
DETECTRON2_ENV_MODULE   <not set>
PyTorch                 1.8.0+cu111 @/home/ljj/anaconda3/envs/tuner/lib/python3.8/site-packages/torch
PyTorch debug build     False
GPU available           Yes
GPU 0                   NVIDIA A100-PCIE-80GB (arch=8.0)
Driver version          470.42.01
CUDA_HOME               /usr/local/cuda-11.4
Pillow                  8.2.0
torchvision             0.9.0+cu111 @/home/ljj/anaconda3/envs/tuner/lib/python3.8/site-packages/torchvision
torchvision arch flags  3.5, 5.0, 6.0, 7.0, 7.5, 8.0, 8.6
fvcore                  0.1.5.post20220512
iopath                  0.1.9
cv2                     4.5.5
----------------------  -----------------------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.0.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

[09/05 15:29:58] detectron2 INFO: Command line arguments: Namespace(config_file='configs/yolox/yolox_s_person_ours.yaml', dist_url='tcp://127.0.0.1:50158', eval_only=True, machine_rank=0, num_gpus=1, num_machines=1, opts=['MODEL.WEIGHTS', 'output/yolox_s_person/model_0029999.pth'], resume=False)
[09/05 15:29:58] detectron2 INFO: Contents of args.config_file=configs/yolox/yolox_s_person_ours.yaml:
_BASE_: "Base-YOLO.yaml"
MODEL:
  PIXEL_MEAN: [0.485, 0.456, 0.406] # same value as PP-YOLOv2, RGB order
  PIXEL_STD: [0.229, 0.224, 0.225]

  WEIGHTS: ""
  MASK_ON: False
  META_ARCHITECTURE: "YOLOX"
  BACKBONE:
    NAME: "build_cspdarknetx_backbone"

  DARKNET:
    WEIGHTS: ""
    DEPTH_WISE: False
    OUT_FEATURES: ["dark3", "dark4", "dark5"]

  YOLO:
    CLASSES: 1
    IN_FEATURES: ["dark3", "dark4", "dark5"]
    CONF_THRESHOLD: 0.001
    NMS_THRESHOLD: 0.65
    IGNORE_THRESHOLD: 0.7
    WIDTH_MUL: 0.50
    DEPTH_MUL: 0.33
    # LOSS_TYPE: "v7"
    LOSS:
      LAMBDA_IOU: 1.5

DATASETS:
  DATASET_ROOT: 'datasets/person_od'
  ANN_ROOT: 'datasets/person_od/annotations'
  TRAIN_IMAGE_PATH: 'train/images'
  VAL_IMAGE_PATH: 'val_ours/images'
  TRAIN_JSON_NAME: 'instances_train_person_od.json'
  VAL_JSON_NAME: 'instances_val_ours.json'
  TRAIN: ("person_train",)
  TEST: ("person_val",)

INPUT:
  # FORMAT: "RGB" # using BGR default
  MIN_SIZE_TRAIN: (416, 512, 608, 768)
  MAX_SIZE_TRAIN: 800 # force max size train to 800?
  MIN_SIZE_TEST: 640
  MAX_SIZE_TEST: 800
  # open all augmentations
  JITTER_CROP:
    ENABLED: False
  RESIZE:
    ENABLED: True
    # SHAPE: (540, 960)
  DISTORTION:
    ENABLED: True
  COLOR_JITTER:
    BRIGHTNESS: True
    SATURATION: True
  # MOSAIC:
  #   ENABLED: True
  #   NUM_IMAGES: 4
  #   DEBUG_VIS: True
  #   # MOSAIC_WIDTH: 960
  #   # MOSAIC_HEIGHT: 540
  MOSAIC_AND_MIXUP:
    ENABLED: True
    # ENABLED: False
    DEBUG_VIS: False
    ENABLE_MIXUP: False
    DISABLE_AT_ITER: 120000

SOLVER:
  # enable fp16 training
  AMP:
    ENABLED: true
  IMS_PER_BATCH: 512
  BASE_LR: 0.001 
  STEPS: (60000, 80000)
  WARMUP_FACTOR: 0.00033333
  WARMUP_ITERS: 1200
  MAX_ITER: 120000
  LR_SCHEDULER_NAME: "WarmupCosineLR"
  # REFERENCE_WORLD_SIZE: 0

TEST:
  EVAL_PERIOD: 1000
  # EVAL_PERIOD: 0
OUTPUT_DIR: "output/yolox_s_person_ours"
VIS_PERIOD: 1000

DATALOADER:
  # proposals are part of the dataset_dicts, and take a lot of RAM
  NUM_WORKERS: 1
  TEST_NUM_WORKERS: 1
[09/05 15:29:58] detectron2 INFO: Running with full config:
CUDNN_BENCHMARK: false
DATALOADER:
  ASPECT_RATIO_GROUPING: true
  FILTER_EMPTY_ANNOTATIONS: true
  NUM_WORKERS: 1
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
  TEST_NUM_WORKERS: 1
DATASETS:
  ANN_ROOT: datasets/person_od/annotations
  CLASS_NAMES: []
  DATASET_ROOT: datasets/person_od
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: []
  PROPOSAL_FILES_TRAIN: []
  TEST:
  - person_val
  TRAIN:
  - person_train
  TRAIN_IMAGE_PATH: train/images
  TRAIN_JSON_NAME: instances_train_person_od.json
  VAL_IMAGE_PATH: val_ours/images
  VAL_JSON_NAME: instances_val_ours.json
GLOBAL:
  HACK: 1.0
INPUT:
  COLOR_JITTER:
    BRIGHTNESS: true
    LIGHTING: false
    SATURATION: true
  CROP:
    ENABLED: false
    SIZE:
    - 0.9
    - 0.9
    TYPE: relative_range
  DISTORTION:
    ENABLED: true
    EXPOSURE: 1.5
    HUE: 0.1
    SATURATION: 1.5
  FORMAT: BGR
  GRID_MASK:
    ENABLED: false
    MODE: 1
    PROB: 0.3
    USE_HEIGHT: true
    USE_WIDTH: true
  INPUT_SIZE:
  - 640
  - 640
  JITTER_CROP:
    ENABLED: false
    JITTER_RATIO: 0.3
  MASK_FORMAT: polygon
  MAX_SIZE_TEST: 800
  MAX_SIZE_TRAIN: 800
  MIN_SIZE_TEST: 640
  MIN_SIZE_TRAIN:
  - 416
  - 512
  - 608
  - 768
  MIN_SIZE_TRAIN_SAMPLING: choice
  MOSAIC:
    DEBUG_VIS: false
    ENABLED: false
    MIN_OFFSET: 0.2
    MOSAIC_HEIGHT: 640
    MOSAIC_WIDTH: 640
    NUM_IMAGES: 4
    POOL_CAPACITY: 1000
  MOSAIC_AND_MIXUP:
    DEBUG_VIS: false
    DEGREES: 10.0
    DISABLE_AT_ITER: 120000
    ENABLED: true
    ENABLE_MIXUP: false
    MOSAIC_HEIGHT_RANGE: &id001
    - 512
    - 800
    MOSAIC_WIDTH_RANGE: *id001
    MSCALE:
    - 0.5
    - 1.5
    NUM_IMAGES: 4
    PERSPECTIVE: 0.0
    POOL_CAPACITY: 1000
    SCALE:
    - 0.5
    - 1.5
    SHEAR: 2.0
    TRANSLATE: 0.1
  RANDOM_FLIP: horizontal
  RESIZE:
    ENABLED: true
    SCALE_JITTER:
    - 0.8
    - 1.2
    SHAPE:
    - 640
    - 640
    TEST_SHAPE:
    - 608
    - 608
  SHIFT:
    SHIFT_PIXELS: 32
MODEL:
  ANCHOR_GENERATOR:
    ANGLES:
    - - -90
      - 0
      - 90
    ASPECT_RATIOS:
    - - 0.5
      - 1.0
      - 2.0
    NAME: DefaultAnchorGenerator
    OFFSET: 0.0
    SIZES:
    - - 32
      - 64
      - 128
      - 256
      - 512
  BACKBONE:
    CHANNEL: 0
    FREEZE_AT: 2
    NAME: build_cspdarknetx_backbone
    SIMPLE: false
    STRIDE: 1
  BIFPN:
    NORM: GN
    NUM_BIFPN: 6
    NUM_LEVELS: 5
    OUT_CHANNELS: 160
    SEPARABLE_CONV: false
  DARKNET:
    DEPTH: 53
    DEPTH_WISE: false
    NORM: BN
    OUT_FEATURES:
    - dark3
    - dark4
    - dark5
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 32
    WEIGHTS: ''
    WITH_CSP: true
  DETR:
    ATTENTION_TYPE: DETR
    BBOX_EMBED_NUM_LAYERS: 3
    CENTERED_POSITION_ENCODIND: false
    CLS_WEIGHT: 1.0
    DECODER_BLOCK_GRAD: true
    DEC_LAYERS: 6
    DEEP_SUPERVISION: true
    DEFORMABLE: false
    DIM_FEEDFORWARD: 2048
    DROPOUT: 0.1
    ENC_LAYERS: 6
    FROZEN_WEIGHTS: ''
    GIOU_WEIGHT: 2.0
    HIDDEN_DIM: 256
    L1_WEIGHT: 5.0
    NHEADS: 8
    NO_OBJECT_WEIGHT: 0.1
    NUM_CLASSES: 80
    NUM_FEATURE_LEVELS: 1
    NUM_OBJECT_QUERIES: 100
    NUM_QUERY_PATTERN: 3
    NUM_QUERY_POSITION: 300
    PRE_NORM: false
    SPATIAL_PRIOR: learned
    TWO_STAGE: false
    USE_FOCAL_LOSS: false
    WITH_BOX_REFINE: false
  DEVICE: cuda
  EFFICIENTNET:
    FEATURE_INDICES:
    - 1
    - 4
    - 10
    - 15
    NAME: efficientnet_b0
    OUT_FEATURES:
    - stride4
    - stride8
    - stride16
    - stride32
    PRETRAINED: true
  FBNET_V2:
    ARCH: default
    ARCH_DEF: []
    NORM: bn
    NORM_ARGS: []
    OUT_FEATURES:
    - trunk3
    SCALE_FACTOR: 1.0
    STEM_IN_CHANNELS: 3
    WIDTH_DIVISOR: 1
  FPN:
    FUSE_TYPE: sum
    IN_FEATURES: []
    NORM: ''
    OUT_CHANNELS: 256
    OUT_CHANNELS_LIST:
    - 256
    - 512
    - 1024
    REPEAT: 2
  KEYPOINT_ON: false
  LOAD_PROPOSALS: false
  MASK_ON: false
  META_ARCHITECTURE: YOLOX
  NMS_TYPE: normal
  ONNX_EXPORT: false
  PADDED_VALUE: 114.0
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: true
      INSTANCES_CONFIDENCE_THRESH: 0.5
      OVERLAP_THRESH: 0.5
      STUFF_AREA_LIMIT: 4096
    INSTANCE_LOSS_WEIGHT: 1.0
  PIXEL_MEAN:
  - 0.485
  - 0.456
  - 0.406
  PIXEL_STD:
  - 0.229
  - 0.224
  - 0.225
  PROPOSAL_GENERATOR:
    MIN_SIZE: 0
    NAME: RPN
  REGNETS:
    OUT_FEATURES:
    - s2
    - s3
    - s4
    TYPE: x
  RESNETS:
    DEFORM_MODULATED: false
    DEFORM_NUM_GROUPS: 1
    DEFORM_ON_PER_STAGE:
    - false
    - false
    - false
    - false
    DEPTH: 50
    NORM: FrozenBN
    NUM_GROUPS: 1
    OUT_FEATURES:
    - res4
    R2TYPE: res2net50_v1d
    RES2_OUT_CHANNELS: 256
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 64
    STRIDE_IN_1X1: true
    WIDTH_PER_GROUP: 64
  RETINANET:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_WEIGHTS: &id003
    - 1.0
    - 1.0
    - 1.0
    - 1.0
    FOCAL_LOSS_ALPHA: 0.25
    FOCAL_LOSS_GAMMA: 2.0
    IN_FEATURES:
    - p3
    - p4
    - p5
    - p6
    - p7
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.4
    - 0.5
    NMS_THRESH_TEST: 0.5
    NORM: ''
    NUM_CLASSES: 80
    NUM_CONVS: 4
    PRIOR_PROB: 0.01
    SCORE_THRESH_TEST: 0.05
    SMOOTH_L1_LOSS_BETA: 0.1
    TOPK_CANDIDATES_TEST: 1000
  ROI_BOX_CASCADE_HEAD:
    BBOX_REG_WEIGHTS:
    - &id002
      - 10.0
      - 10.0
      - 5.0
      - 5.0
    - - 20.0
      - 20.0
      - 10.0
      - 10.0
    - - 30.0
      - 30.0
      - 15.0
      - 15.0
    IOUS:
    - 0.5
    - 0.6
    - 0.7
  ROI_BOX_HEAD:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: *id002
    CLS_AGNOSTIC_BBOX_REG: false
    CONV_DIM: 256
    FC_DIM: 1024
    NAME: ''
    NORM: ''
    NUM_CONV: 0
    NUM_FC: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
    SMOOTH_L1_BETA: 0.0
    TRAIN_ON_PRED_BOXES: false
  ROI_HEADS:
    BATCH_SIZE_PER_IMAGE: 512
    IN_FEATURES:
    - res4
    IOU_LABELS:
    - 0
    - 1
    IOU_THRESHOLDS:
    - 0.5
    NAME: Res5ROIHeads
    NMS_THRESH_TEST: 0.5
    NUM_CLASSES: 80
    POSITIVE_FRACTION: 0.25
    PROPOSAL_APPEND_GT: true
    SCORE_THRESH_TEST: 0.05
  ROI_KEYPOINT_HEAD:
    CONV_DIMS:
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    LOSS_WEIGHT: 1.0
    MIN_KEYPOINTS_PER_IMAGE: 1
    NAME: KRCNNConvDeconvUpsampleHead
    NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
    NUM_KEYPOINTS: 17
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  ROI_MASK_HEAD:
    CLS_AGNOSTIC_MASK: false
    CONV_DIM: 256
    NAME: MaskRCNNConvUpsampleHead
    NORM: ''
    NUM_CONV: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  RPN:
    BATCH_SIZE_PER_IMAGE: 256
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: *id003
    BOUNDARY_THRESH: -1
    CONV_DIMS:
    - -1
    HEAD_NAME: StandardRPNHead
    IN_FEATURES:
    - res4
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.3
    - 0.7
    LOSS_WEIGHT: 1.0
    NMS_THRESH: 0.7
    POSITIVE_FRACTION: 0.5
    POST_NMS_TOPK_TEST: 1000
    POST_NMS_TOPK_TRAIN: 2000
    PRE_NMS_TOPK_TEST: 6000
    PRE_NMS_TOPK_TRAIN: 12000
    SMOOTH_L1_BETA: 0.0
  SEM_SEG_HEAD:
    COMMON_STRIDE: 4
    CONVS_DIM: 128
    IGNORE_VALUE: 255
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    LOSS_WEIGHT: 1.0
    NAME: SemSegFPNHead
    NORM: GN
    NUM_CLASSES: 54
  SOLOV2:
    FPN_INSTANCE_STRIDES:
    - 8
    - 8
    - 16
    - 32
    - 32
    FPN_SCALE_RANGES:
    - - 1
      - 96
    - - 48
      - 192
    - - 96
      - 384
    - - 192
      - 768
    - - 384
      - 2048
    INSTANCE_CHANNELS: 512
    INSTANCE_IN_CHANNELS: 256
    INSTANCE_IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    - p6
    LOSS:
      DICE_WEIGHT: 3.0
      FOCAL_ALPHA: 0.25
      FOCAL_GAMMA: 2.0
      FOCAL_USE_SIGMOID: true
      FOCAL_WEIGHT: 1.0
    MASK_CHANNELS: 128
    MASK_IN_CHANNELS: 256
    MASK_IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    MASK_THR: 0.5
    MAX_PER_IMG: 100
    NMS_KERNEL: gaussian
    NMS_PRE: 500
    NMS_SIGMA: 2
    NMS_TYPE: matrix
    NORM: GN
    NUM_CLASSES: 80
    NUM_GRIDS:
    - 40
    - 36
    - 24
    - 16
    - 12
    NUM_INSTANCE_CONVS: 4
    NUM_KERNELS: 256
    NUM_MASKS: 256
    PRIOR_PROB: 0.01
    SCORE_THR: 0.1
    SIGMA: 0.2
    TYPE_DCN: DCN
    UPDATE_THR: 0.05
    USE_COORD_CONV: true
    USE_DCN_IN_INSTANCE: false
  SWIN:
    DEPTHS:
    - 2
    - 2
    - 6
    - 2
    OUT_FEATURES:
    - 1
    - 2
    - 3
    PATCH: 4
    TYPE: tiny
    WEIGHTS: ''
    WINDOW: 7
  VT_FPN:
    HEADS: 16
    IN_FEATURES:
    - res2
    - res3
    - res4
    - res5
    LAYERS: 3
    MIN_GROUP_PLANES: 64
    NORM: BN
    OUT_CHANNELS: 256
    POS_HWS: []
    POS_N_DOWNSAMPLE: []
    TOKEN_C: 1024
    TOKEN_LS:
    - 16
    - 16
    - 8
    - 8
  WEIGHTS: output/yolox_s_person/model_0029999.pth
  YOLO:
    ANCHORS:
    - - - 116
        - 90
      - - 156
        - 198
      - - 373
        - 326
    - - - 30
        - 61
      - - 62
        - 45
      - - 42
        - 119
    - - - 10
        - 13
      - - 16
        - 30
      - - 33
        - 23
    ANCHOR_MASK: []
    BRANCH_DILATIONS:
    - 1
    - 2
    - 3
    CLASSES: 1
    CONF_THRESHOLD: 0.001
    DEPTH_MUL: 0.33
    IGNORE_THRESHOLD: 0.7
    IN_FEATURES:
    - dark3
    - dark4
    - dark5
    IOU_TYPE: ciou
    LOSS:
      ANCHOR_RATIO_THRESH: 4.0
      BUILD_TARGET_TYPE: default
      LAMBDA_CLS: 1.0
      LAMBDA_CONF: 1.0
      LAMBDA_IOU: 1.5
      LAMBDA_WH: 1.0
      LAMBDA_XY: 1.0
      USE_L1: true
    LOSS_TYPE: v4
    MAX_BOXES_NUM: 100
    NECK:
      TYPE: yolov3
      WITH_SPP: false
    NMS_THRESHOLD: 0.65
    NUM_BRANCH: 3
    ORIEN_HEAD:
      UP_CHANNELS: 64
    TEST_BRANCH_IDX: 1
    VARIANT: yolov3
    WIDTH_MUL: 0.5
OUTPUT_DIR: output/yolox_s_person_ours
SEED: -1
SOLVER:
  AMP:
    ENABLED: true
  AUTO_SCALING_METHODS:
  - default_scale_d2_configs
  - default_scale_quantization_configs
  BACKBONE_MULTIPLIER: 0.1
  BASE_LR: 0.001
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 5000
  CLIP_GRADIENTS:
    CLIP_TYPE: value
    CLIP_VALUE: 1.0
    ENABLED: false
    NORM_TYPE: 2.0
  GAMMA: 0.1
  IMS_PER_BATCH: 512
  LR_MULTIPLIER_OVERWRITE: []
  LR_SCHEDULER:
    GAMMA: 0.1
    MAX_EPOCH: 500
    MAX_ITER: 40000
    NAME: WarmupMultiStepLR
    STEPS:
    - 30000
    WARMUP_FACTOR: 0.001
    WARMUP_ITERS: 1000
    WARMUP_METHOD: linear
  LR_SCHEDULER_NAME: WarmupCosineLR
  MAX_ITER: 120000
  MOMENTUM: 0.9
  NESTEROV: false
  OPTIMIZER: ADAMW
  REFERENCE_WORLD_SIZE: 8
  STEPS:
  - 60000
  - 80000
  WARMUP_FACTOR: 0.00033333
  WARMUP_ITERS: 1200
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0001
  WEIGHT_DECAY_BIAS: null
  WEIGHT_DECAY_EMBED: 0.0
  WEIGHT_DECAY_NORM: 0.0
TEST:
  AUG:
    ENABLED: false
    FLIP: true
    MAX_SIZE: 4000
    MIN_SIZES:
    - 400
    - 500
    - 600
    - 700
    - 800
    - 900
    - 1000
    - 1100
    - 1200
  DETECTIONS_PER_IMAGE: 100
  EVAL_PERIOD: 1000
  EXPECTED_RESULTS: []
  KEYPOINT_OKS_SIGMAS: []
  PRECISE_BN:
    ENABLED: false
    NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 1000

[09/05 15:29:58] detectron2 INFO: Full config saved to output/yolox_s_person_ours/config.yaml
[09/05 15:29:59] d2.utils.env INFO: Using a generated random seed 59113710
[09/05 15:30:13] d2.engine.defaults INFO: Model:
YOLOX(
  (backbone): CSPDarknet(
    (stem): Focus(
      (conv): BaseConv(
        (conv): Conv2d(12, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
    )
    (dark2): Sequential(
      (0): BaseConv(
        (conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (1): CSPLayer(
        (conv1): BaseConv(
          (conv): Conv2d(64, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (conv2): BaseConv(
          (conv): Conv2d(64, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (conv3): BaseConv(
          (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (m): Sequential(
          (0): Bottleneck(
            (conv1): BaseConv(
              (conv): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
              (bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
            (conv2): BaseConv(
              (conv): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
              (bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
          )
        )
      )
    )
    (dark3): Sequential(
      (0): BaseConv(
        (conv): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (1): CSPLayer(
        (conv1): BaseConv(
          (conv): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (conv2): BaseConv(
          (conv): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (conv3): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (m): Sequential(
          (0): Bottleneck(
            (conv1): BaseConv(
              (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
              (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
            (conv2): BaseConv(
              (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
              (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
          )
          (1): Bottleneck(
            (conv1): BaseConv(
              (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
              (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
            (conv2): BaseConv(
              (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
              (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
          )
          (2): Bottleneck(
            (conv1): BaseConv(
              (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
              (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
            (conv2): BaseConv(
              (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
              (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
          )
        )
      )
    )
    (dark4): Sequential(
      (0): BaseConv(
        (conv): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (1): CSPLayer(
        (conv1): BaseConv(
          (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (conv2): BaseConv(
          (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (conv3): BaseConv(
          (conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (m): Sequential(
          (0): Bottleneck(
            (conv1): BaseConv(
              (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
              (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
            (conv2): BaseConv(
              (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
              (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
          )
          (1): Bottleneck(
            (conv1): BaseConv(
              (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
              (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
            (conv2): BaseConv(
              (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
              (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
          )
          (2): Bottleneck(
            (conv1): BaseConv(
              (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
              (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
            (conv2): BaseConv(
              (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
              (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
          )
        )
      )
    )
    (dark5): Sequential(
      (0): BaseConv(
        (conv): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(512, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (1): SPPBottleneck(
        (conv1): BaseConv(
          (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (m): ModuleList(
          (0): MaxPool2d(kernel_size=5, stride=1, padding=2, dilation=1, ceil_mode=False)
          (1): MaxPool2d(kernel_size=9, stride=1, padding=4, dilation=1, ceil_mode=False)
          (2): MaxPool2d(kernel_size=13, stride=1, padding=6, dilation=1, ceil_mode=False)
        )
        (conv2): BaseConv(
          (conv): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(512, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
      )
      (2): CSPLayer(
        (conv1): BaseConv(
          (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (conv2): BaseConv(
          (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (conv3): BaseConv(
          (conv): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(512, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (m): Sequential(
          (0): Bottleneck(
            (conv1): BaseConv(
              (conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
              (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
            (conv2): BaseConv(
              (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
              (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
              (act): SiLU(inplace=True)
            )
          )
        )
      )
    )
  )
  (neck): YOLOPAFPN(
    (upsample): Upsample(scale_factor=2.0, mode=nearest)
    (lateral_conv0): BaseConv(
      (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
      (act): SiLU(inplace=True)
    )
    (C3_p4): CSPLayer(
      (conv1): BaseConv(
        (conv): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (conv2): BaseConv(
        (conv): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (conv3): BaseConv(
        (conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (m): Sequential(
        (0): Bottleneck(
          (conv1): BaseConv(
            (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
            (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
            (act): SiLU(inplace=True)
          )
          (conv2): BaseConv(
            (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
            (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
            (act): SiLU(inplace=True)
          )
        )
      )
    )
    (reduce_conv1): BaseConv(
      (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
      (act): SiLU(inplace=True)
    )
    (C3_p3): CSPLayer(
      (conv1): BaseConv(
        (conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (conv2): BaseConv(
        (conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (conv3): BaseConv(
        (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (m): Sequential(
        (0): Bottleneck(
          (conv1): BaseConv(
            (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
            (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
            (act): SiLU(inplace=True)
          )
          (conv2): BaseConv(
            (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
            (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
            (act): SiLU(inplace=True)
          )
        )
      )
    )
    (bu_conv2): BaseConv(
      (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
      (act): SiLU(inplace=True)
    )
    (C3_n3): CSPLayer(
      (conv1): BaseConv(
        (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (conv2): BaseConv(
        (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (conv3): BaseConv(
        (conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (m): Sequential(
        (0): Bottleneck(
          (conv1): BaseConv(
            (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
            (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
            (act): SiLU(inplace=True)
          )
          (conv2): BaseConv(
            (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
            (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
            (act): SiLU(inplace=True)
          )
        )
      )
    )
    (bu_conv1): BaseConv(
      (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
      (act): SiLU(inplace=True)
    )
    (C3_n4): CSPLayer(
      (conv1): BaseConv(
        (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (conv2): BaseConv(
        (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (conv3): BaseConv(
        (conv): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(512, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (m): Sequential(
        (0): Bottleneck(
          (conv1): BaseConv(
            (conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
            (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
            (act): SiLU(inplace=True)
          )
          (conv2): BaseConv(
            (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
            (bn): BatchNorm2d(256, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
            (act): SiLU(inplace=True)
          )
        )
      )
    )
  )
  (head): YOLOXHead(
    (cls_convs): ModuleList(
      (0): Sequential(
        (0): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (1): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
      )
      (1): Sequential(
        (0): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (1): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
      )
      (2): Sequential(
        (0): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (1): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
      )
    )
    (reg_convs): ModuleList(
      (0): Sequential(
        (0): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (1): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
      )
      (1): Sequential(
        (0): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (1): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
      )
      (2): Sequential(
        (0): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
        (1): BaseConv(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
          (act): SiLU(inplace=True)
        )
      )
    )
    (cls_preds): ModuleList(
      (0): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1))
      (1): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1))
      (2): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1))
    )
    (reg_preds): ModuleList(
      (0): Conv2d(128, 4, kernel_size=(1, 1), stride=(1, 1))
      (1): Conv2d(128, 4, kernel_size=(1, 1), stride=(1, 1))
      (2): Conv2d(128, 4, kernel_size=(1, 1), stride=(1, 1))
    )
    (obj_preds): ModuleList(
      (0): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1))
      (1): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1))
      (2): Conv2d(128, 1, kernel_size=(1, 1), stride=(1, 1))
    )
    (stems): ModuleList(
      (0): BaseConv(
        (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (1): BaseConv(
        (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
      (2): BaseConv(
        (conv): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(128, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
        (act): SiLU(inplace=True)
      )
    )
    (l1_loss): L1Loss()
    (bcewithlog_loss): BCEWithLogitsLoss()
    (iou_loss): IOUloss()
  )
)
[09/05 15:30:13] fvcore.common.checkpoint INFO: [Checkpointer] Loading from output/yolox_s_person/model_0029999.pth ...
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.0.weight' to the model due to incompatible shapes: (80, 128, 1, 1) in the checkpoint but (1, 128, 1, 1) in the model! You might want to double check if this is expected.
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.0.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (1,) in the model! You might want to double check if this is expected.
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.1.weight' to the model due to incompatible shapes: (80, 128, 1, 1) in the checkpoint but (1, 128, 1, 1) in the model! You might want to double check if this is expected.
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.1.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (1,) in the model! You might want to double check if this is expected.
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.2.weight' to the model due to incompatible shapes: (80, 128, 1, 1) in the checkpoint but (1, 128, 1, 1) in the model! You might want to double check if this is expected.
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Skip loading parameter 'head.cls_preds.2.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (1,) in the model! You might want to double check if this is expected.
[09/05 15:30:14] fvcore.common.checkpoint WARNING: Some model parameters or buffers are not found in the checkpoint:
head.cls_preds.0.{bias, weight}
head.cls_preds.1.{bias, weight}
head.cls_preds.2.{bias, weight}
[09/05 15:30:14] d2.data.datasets.coco INFO: Loaded 1000 images in COCO format from datasets/person_od/annotations/instances_val_ours.json
[09/05 15:30:14] d2.data.build INFO: Distribution of instances among all 1 categories:
|  category  | #instances   |
|:----------:|:-------------|
|   person   | 3888         |
|            |              |
[09/05 15:30:14] d2.data.dataset_mapper INFO: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(640, 640), max_size=800, sample_style='choice')]
[09/05 15:30:14] d2.data.common INFO: Serializing 1000 elements to byte tensors and concatenating them all ...
[09/05 15:30:14] d2.data.common INFO: Serialized dataset takes 0.53 MiB
[09/05 15:30:14] d2.evaluation.evaluator INFO: Start inference on 1000 batches
[09/05 15:30:16] d2.evaluation.evaluator INFO: Inference done 11/1000. Dataloading: 0.0413 s/iter. Inference: 0.0404 s/iter. Eval: 0.0003 s/iter. Total: 0.0820 s/iter. ETA=0:01:21
[09/05 15:30:21] d2.evaluation.evaluator INFO: Inference done 57/1000. Dataloading: 0.0647 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1076 s/iter. ETA=0:01:41
[09/05 15:30:26] d2.evaluation.evaluator INFO: Inference done 96/1000. Dataloading: 0.0732 s/iter. Inference: 0.0434 s/iter. Eval: 0.0003 s/iter. Total: 0.1170 s/iter. ETA=0:01:45
[09/05 15:30:31] d2.evaluation.evaluator INFO: Inference done 148/1000. Dataloading: 0.0661 s/iter. Inference: 0.0430 s/iter. Eval: 0.0003 s/iter. Total: 0.1095 s/iter. ETA=0:01:33
[09/05 15:30:36] d2.evaluation.evaluator INFO: Inference done 196/1000. Dataloading: 0.0653 s/iter. Inference: 0.0430 s/iter. Eval: 0.0003 s/iter. Total: 0.1087 s/iter. ETA=0:01:27
[09/05 15:30:41] d2.evaluation.evaluator INFO: Inference done 247/1000. Dataloading: 0.0634 s/iter. Inference: 0.0428 s/iter. Eval: 0.0003 s/iter. Total: 0.1066 s/iter. ETA=0:01:20
[09/05 15:30:46] d2.evaluation.evaluator INFO: Inference done 298/1000. Dataloading: 0.0622 s/iter. Inference: 0.0427 s/iter. Eval: 0.0003 s/iter. Total: 0.1052 s/iter. ETA=0:01:13
[09/05 15:30:51] d2.evaluation.evaluator INFO: Inference done 342/1000. Dataloading: 0.0635 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1064 s/iter. ETA=0:01:10
[09/05 15:30:57] d2.evaluation.evaluator INFO: Inference done 387/1000. Dataloading: 0.0644 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1073 s/iter. ETA=0:01:05
[09/05 15:31:02] d2.evaluation.evaluator INFO: Inference done 437/1000. Dataloading: 0.0638 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1068 s/iter. ETA=0:01:00
[09/05 15:31:07] d2.evaluation.evaluator INFO: Inference done 480/1000. Dataloading: 0.0647 s/iter. Inference: 0.0427 s/iter. Eval: 0.0003 s/iter. Total: 0.1077 s/iter. ETA=0:00:56
[09/05 15:31:12] d2.evaluation.evaluator INFO: Inference done 530/1000. Dataloading: 0.0640 s/iter. Inference: 0.0427 s/iter. Eval: 0.0003 s/iter. Total: 0.1070 s/iter. ETA=0:00:50
[09/05 15:31:17] d2.evaluation.evaluator INFO: Inference done 582/1000. Dataloading: 0.0632 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1062 s/iter. ETA=0:00:44
[09/05 15:31:22] d2.evaluation.evaluator INFO: Inference done 629/1000. Dataloading: 0.0636 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1065 s/iter. ETA=0:00:39
[09/05 15:31:27] d2.evaluation.evaluator INFO: Inference done 679/1000. Dataloading: 0.0631 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1061 s/iter. ETA=0:00:34
[09/05 15:31:32] d2.evaluation.evaluator INFO: Inference done 726/1000. Dataloading: 0.0632 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1062 s/iter. ETA=0:00:29
[09/05 15:31:37] d2.evaluation.evaluator INFO: Inference done 770/1000. Dataloading: 0.0637 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1066 s/iter. ETA=0:00:24
[09/05 15:31:42] d2.evaluation.evaluator INFO: Inference done 819/1000. Dataloading: 0.0634 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1064 s/iter. ETA=0:00:19
[09/05 15:31:47] d2.evaluation.evaluator INFO: Inference done 869/1000. Dataloading: 0.0631 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1061 s/iter. ETA=0:00:13
[09/05 15:31:52] d2.evaluation.evaluator INFO: Inference done 916/1000. Dataloading: 0.0632 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1061 s/iter. ETA=0:00:08
[09/05 15:31:57] d2.evaluation.evaluator INFO: Inference done 970/1000. Dataloading: 0.0624 s/iter. Inference: 0.0426 s/iter. Eval: 0.0003 s/iter. Total: 0.1054 s/iter. ETA=0:00:03
[09/05 15:32:00] d2.evaluation.evaluator INFO: Total inference time: 0:01:44.354030 (0.104878 s / iter per device, on 1 devices)
[09/05 15:32:00] d2.evaluation.evaluator INFO: Total inference pure compute time: 0:00:42 (0.042620 s / iter per device, on 1 devices)
[09/05 15:32:00] d2.evaluation.coco_evaluation INFO: Preparing results for COCO format ...
[09/05 15:32:00] d2.evaluation.coco_evaluation INFO: Saving results to output/yolox_s_person_ours/inference/coco_instances_results.json
[09/05 15:32:00] d2.evaluation.coco_evaluation INFO: Evaluating predictions with unofficial COCO API...
[09/05 15:32:00] d2.evaluation.fast_eval_api INFO: Evaluate annotation type *bbox*
[09/05 15:32:00] d2.evaluation.fast_eval_api INFO: COCOeval_opt.evaluate() finished in 0.16 seconds.
[09/05 15:32:00] d2.evaluation.fast_eval_api INFO: Accumulating evaluation results...
[09/05 15:32:00] d2.evaluation.fast_eval_api INFO: COCOeval_opt.accumulate() finished in 0.03 seconds.
[09/05 15:32:00] d2.evaluation.coco_evaluation INFO: Evaluation results for bbox: 
|  AP   |  AP50  |  AP75  |  APs  |  APm  |  APl  |
|:-----:|:------:|:------:|:-----:|:-----:|:-----:|
| 0.602 | 1.953  | 0.197  | 0.000 | 0.111 | 2.269 |
[09/05 15:32:00] d2.engine.defaults INFO: Evaluation results for person_val in csv format:
[09/05 15:32:00] d2.evaluation.testing INFO: copypaste: Task: bbox
[09/05 15:32:00] d2.evaluation.testing INFO: copypaste: AP,AP50,AP75,APs,APm,APl
[09/05 15:32:00] d2.evaluation.testing INFO: copypaste: 0.6020,1.9528,0.1967,0.0001,0.1105,2.2693
ppwwyyxx commented 2 years ago

[09/05 15:30:14] fvcore.common.checkpoint WARNING: Some model parameters or buffers are not found in the checkpoint: head.cls_preds.0.{bias, weight} head.cls_preds.1.{bias, weight} head.cls_preds.2.{bias, weight}

As the log says, these model parameters are not loaded from the checkpoint, so they are randomly initialized. The results will therefore change depending on the random seed.
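
For anyone hitting the same warning, here is a minimal sketch of how one might spot such parameters before running evaluation. It assumes the parameter shapes of the checkpoint and of the model are available as plain dictionaries (with PyTorch, something like `{k: tuple(v.shape) for k, v in model.state_dict().items()}` would build one). `find_skipped_parameters` is a hypothetical helper, not part of detectron2 or fvcore, and the `head.reg_preds.0.weight` checkpoint entry is an illustrative assumption; the `cls_preds` shapes are the ones reported in the warnings in the log.

```python
def find_skipped_parameters(checkpoint_shapes, model_shapes):
    """Return {name: reason} for model parameters a shape-checking loader would skip.

    Hypothetical helper: it only mimics the idea of "skip on missing key or
    shape mismatch" and is not the actual checkpointer logic.
    """
    skipped = {}
    for name, model_shape in model_shapes.items():
        ckpt_shape = checkpoint_shapes.get(name)
        if ckpt_shape is None:
            skipped[name] = "missing from checkpoint"
        elif tuple(ckpt_shape) != tuple(model_shape):
            skipped[name] = (
                f"shape {tuple(ckpt_shape)} in checkpoint "
                f"vs {tuple(model_shape)} in model"
            )
    return skipped


# cls_preds shapes are taken from the warnings in the log;
# the reg_preds checkpoint entry is assumed to match the model.
checkpoint_shapes = {
    "head.cls_preds.0.weight": (80, 128, 1, 1),
    "head.cls_preds.0.bias": (80,),
    "head.reg_preds.0.weight": (4, 128, 1, 1),
}
model_shapes = {
    "head.cls_preds.0.weight": (1, 128, 1, 1),
    "head.cls_preds.0.bias": (1,),
    "head.reg_preds.0.weight": (4, 128, 1, 1),
}

skipped = find_skipped_parameters(checkpoint_shapes, model_shapes)
for name, reason in sorted(skipped.items()):
    print(f"{name}: {reason}")
```

If this reports layers that should have been trained, the checkpoint and the config probably disagree. In this log the checkpoint's classification head has 80 output channels while the model is built for 1 class, so the head is left at its random initialization.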