Prerequisite

[X] I have searched the existing and past issues but cannot get the expected help.
[X] I have read the FAQ documentation but cannot get the expected help.
[X] The bug has not been fixed in the latest version.

💬 Describe the reimplementation questions

python tools/train.py configs/yolox/yolox_s_8xb8-300e_coco.py
2022/12/27 17:07:16 - mmengine - INFO - 
------------------------------------------------------------
System environment:
    sys.platform: linux
    Python: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0]
    CUDA available: True
    numpy_random_seed: 1502397530
    GPU 0,1: NVIDIA GeForce RTX 2080 Ti
    CUDA_HOME: /usr/local/cuda-10.1
    NVCC: Cuda compilation tools, release 10.1, V10.1.16
    GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
    PyTorch: 1.13.1+cu117
    PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.7
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.0.4  (built against CUDA 10.1)
    - Built with CuDNN 8.5
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 

    TorchVision: 0.14.1+cu117
    OpenCV: 4.6.0
    MMEngine: 0.3.2

Runtime environment:
    cudnn_benchmark: False
    mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
    dist_cfg: {'backend': 'nccl'}
    seed: None
    Distributed launcher: none
    Distributed training: False
    GPU number: 1
------------------------------------------------------------

2022/12/27 17:07:16 - mmengine - INFO - Config:
default_scope = 'mmyolo'
default_hooks = dict(
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=50),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(
        type='CheckpointHook', interval=1, max_keep_ckpts=3, save_best='auto'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    visualization=dict(type='mmdet.DetVisualizationHook'))
env_cfg = dict(
    cudnn_benchmark=False,
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),
    dist_cfg=dict(backend='nccl'))
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
    type='mmdet.DetLocalVisualizer',
    vis_backends=[dict(type='LocalVisBackend')],
    name='visualizer')
log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)
log_level = 'INFO'
load_from = '/home/yxy/下载/yolox_s_8xb8-300e_coco_20220917_030738-d7e60cb2.pth'
resume = False
file_client_args = dict(backend='disk')
data_root = '/home/yxy/mmdetection/data/coco/'
dataset_type = 'YOLOv5CocoDataset'
img_scale = (640, 640)
deepen_factor = 0.33
widen_factor = 0.5
save_epoch_intervals = 10
train_batch_size_per_gpu = 8
train_num_workers = 8
val_batch_size_per_gpu = 1
val_num_workers = 2
max_epochs = 300
num_last_epochs = 15
model = dict(
    type='YOLODetector',
    init_cfg=dict(
        type='Kaiming',
        layer='Conv2d',
        a=2.23606797749979,
        distribution='uniform',
        mode='fan_in',
        nonlinearity='leaky_relu'),
    use_syncbn=False,
    data_preprocessor=dict(
        type='mmdet.DetDataPreprocessor',
        pad_size_divisor=32,
        batch_augments=[
            dict(
                type='mmdet.BatchSyncRandomResize',
                random_size_range=(480, 800),
                size_divisor=32,
                interval=10)
        ]),
    backbone=dict(
        type='YOLOXCSPDarknet',
        deepen_factor=0.33,
        widen_factor=0.5,
        out_indices=(2, 3, 4),
        spp_kernal_sizes=(5, 9, 13),
        norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),
        act_cfg=dict(type='SiLU', inplace=True)),
    neck=dict(
        type='YOLOXPAFPN',
        deepen_factor=0.33,
        widen_factor=0.5,
        in_channels=[256, 512, 1024],
        out_channels=256,
        norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),
        act_cfg=dict(type='SiLU', inplace=True)),
    bbox_head=dict(
        type='YOLOXHead',
        head_module=dict(
            type='YOLOXHeadModule',
            num_classes=4,
            in_channels=256,
            feat_channels=256,
            widen_factor=0.5,
            stacked_convs=2,
            featmap_strides=(8, 16, 32),
            use_depthwise=False,
            norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),
            act_cfg=dict(type='SiLU', inplace=True)),
        loss_cls=dict(
            type='mmdet.CrossEntropyLoss',
            use_sigmoid=True,
            reduction='sum',
            loss_weight=1.0),
        loss_bbox=dict(
            type='mmdet.IoULoss',
            mode='square',
            eps=1e-16,
            reduction='sum',
            loss_weight=5.0),
        loss_obj=dict(
            type='mmdet.CrossEntropyLoss',
            use_sigmoid=True,
            reduction='sum',
            loss_weight=1.0),
        loss_bbox_aux=dict(
            type='mmdet.L1Loss', reduction='sum', loss_weight=1.0)),
    train_cfg=dict(
        assigner=dict(
            type='mmdet.SimOTAAssigner',
            center_radius=2.5,
            iou_calculator=dict(type='mmdet.BboxOverlaps2D'))),
    test_cfg=dict(
        yolox_style=True,
        multi_label=True,
        score_thr=0.001,
        max_per_img=300,
        nms=dict(type='nms', iou_threshold=0.65)))
pre_transform = [
    dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
    dict(type='LoadAnnotations', with_bbox=True)
]
train_pipeline_stage1 = [
    dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Mosaic',
        img_scale=(640, 640),
        pad_val=114.0,
        pre_transform=[
            dict(
                type='LoadImageFromFile',
                file_client_args=dict(backend='disk')),
            dict(type='LoadAnnotations', with_bbox=True)
        ]),
    dict(
        type='mmdet.RandomAffine',
        scaling_ratio_range=(0.1, 2),
        border=(-320, -320)),
    dict(
        type='YOLOXMixUp',
        img_scale=(640, 640),
        ratio_range=(0.8, 1.6),
        pad_val=114.0,
        pre_transform=[
            dict(
                type='LoadImageFromFile',
                file_client_args=dict(backend='disk')),
            dict(type='LoadAnnotations', with_bbox=True)
        ]),
    dict(type='mmdet.YOLOXHSVRandomAug'),
    dict(type='mmdet.RandomFlip', prob=0.5),
    dict(
        type='mmdet.FilterAnnotations',
        min_gt_bbox_wh=(1, 1),
        keep_empty=False),
    dict(
        type='mmdet.PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',
                   'flip_direction'))
]
train_pipeline_stage2 = [
    dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
    dict(
        type='mmdet.Pad',
        pad_to_square=True,
        pad_val=dict(img=(114.0, 114.0, 114.0))),
    dict(type='mmdet.YOLOXHSVRandomAug'),
    dict(type='mmdet.RandomFlip', prob=0.5),
    dict(
        type='mmdet.FilterAnnotations',
        min_gt_bbox_wh=(1, 1),
        keep_empty=False),
    dict(type='mmdet.PackDetInputs')
]
train_dataloader = dict(
    batch_size=8,
    num_workers=8,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type='YOLOv5CocoDataset',
        data_root='/home/yxy/mmdetection/data/coco/',
        ann_file=
        '/home/yxy/mmdetection/data/coco/annotations/instances_train2017.json',
        data_prefix=dict(img='train2017/'),
        filter_cfg=dict(filter_empty_gt=False, min_size=32),
        pipeline=[
            dict(
                type='LoadImageFromFile',
                file_client_args=dict(backend='disk')),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(
                type='Mosaic',
                img_scale=(640, 640),
                pad_val=114.0,
                pre_transform=[
                    dict(
                        type='LoadImageFromFile',
                        file_client_args=dict(backend='disk')),
                    dict(type='LoadAnnotations', with_bbox=True)
                ]),
            dict(
                type='mmdet.RandomAffine',
                scaling_ratio_range=(0.1, 2),
                border=(-320, -320)),
            dict(
                type='YOLOXMixUp',
                img_scale=(640, 640),
                ratio_range=(0.8, 1.6),
                pad_val=114.0,
                pre_transform=[
                    dict(
                        type='LoadImageFromFile',
                        file_client_args=dict(backend='disk')),
                    dict(type='LoadAnnotations', with_bbox=True)
                ]),
            dict(type='mmdet.YOLOXHSVRandomAug'),
            dict(type='mmdet.RandomFlip', prob=0.5),
            dict(
                type='mmdet.FilterAnnotations',
                min_gt_bbox_wh=(1, 1),
                keep_empty=False),
            dict(
                type='mmdet.PackDetInputs',
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                           'flip', 'flip_direction'))
        ]))
test_pipeline = [
    dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
    dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
    dict(
        type='mmdet.Pad',
        pad_to_square=True,
        pad_val=dict(img=(114.0, 114.0, 114.0))),
    dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
    dict(
        type='mmdet.PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor'))
]
val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    pin_memory=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='YOLOv5CocoDataset',
        data_root='/home/yxy/mmdetection/data/coco/',
        ann_file='/home/yxy/mmdetection/data/annotations/val.json',
        data_prefix=dict(img='train2017/'),
        test_mode=True,
        pipeline=[
            dict(
                type='LoadImageFromFile',
                file_client_args=dict(backend='disk')),
            dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
            dict(
                type='mmdet.Pad',
                pad_to_square=True,
                pad_val=dict(img=(114.0, 114.0, 114.0))),
            dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
            dict(
                type='mmdet.PackDetInputs',
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                           'scale_factor'))
        ]))
test_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    pin_memory=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='YOLOv5CocoDataset',
        data_root='/home/yxy/mmdetection/data/coco/',
        ann_file='/home/yxy/mmdetection/data/annotations/val.json',
        data_prefix=dict(img='train2017/'),
        test_mode=True,
        pipeline=[
            dict(
                type='LoadImageFromFile',
                file_client_args=dict(backend='disk')),
            dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
            dict(
                type='mmdet.Pad',
                pad_to_square=True,
                pad_val=dict(img=(114.0, 114.0, 114.0))),
            dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
            dict(
                type='mmdet.PackDetInputs',
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                           'scale_factor'))
        ]))
val_evaluator = dict(
    type='mmdet.CocoMetric',
    proposal_nums=(100, 1, 10),
    ann_file=
    '/home/yxy/mmdetection/data/coco/annotations/instances_val2017.json',
    metric='bbox')
test_evaluator = dict(
    type='mmdet.CocoMetric',
    proposal_nums=(100, 1, 10),
    ann_file=
    '/home/yxy/mmdetection/data/coco/annotations/instances_val2017.json',
    metric='bbox')
base_lr = 0.01
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(
        type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005, nesterov=True),
    paramwise_cfg=dict(norm_decay_mult=0.0, bias_decay_mult=0.0))
param_scheduler = [
    dict(
        type='mmdet.QuadraticWarmupLR',
        by_epoch=True,
        begin=0,
        end=5,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingLR',
        eta_min=0.0005,
        begin=5,
        T_max=285,
        end=285,
        by_epoch=True,
        convert_to_iter_based=True),
    dict(type='ConstantLR', by_epoch=True, factor=1, begin=285, end=300)
]
custom_hooks = [
    dict(
        type='YOLOXModeSwitchHook',
        num_last_epochs=15,
        new_train_pipeline=[
            dict(
                type='LoadImageFromFile',
                file_client_args=dict(backend='disk')),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
            dict(
                type='mmdet.Pad',
                pad_to_square=True,
                pad_val=dict(img=(114.0, 114.0, 114.0))),
            dict(type='mmdet.YOLOXHSVRandomAug'),
            dict(type='mmdet.RandomFlip', prob=0.5),
            dict(
                type='mmdet.FilterAnnotations',
                min_gt_bbox_wh=(1, 1),
                keep_empty=False),
            dict(type='mmdet.PackDetInputs')
        ],
        priority=48),
    dict(type='mmdet.SyncNormHook', priority=48),
    dict(
        type='EMAHook',
        ema_type='ExpMomentumEMA',
        momentum=0.0001,
        update_buffers=True,
        strict_load=False,
        priority=49)
]
train_cfg = dict(
    type='EpochBasedTrainLoop',
    max_epochs=300,
    val_interval=10,
    dynamic_intervals=[(285, 1)])
auto_scale_lr = dict(base_batch_size=64)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
launcher = 'none'
work_dir = './work_dirs/yolox_s_8xb8-300e_coco'

2022/12/27 17:08:37 - mmengine - INFO - Epoch(train) [1][300/969] lr: 3.8340e-05 eta: 20:48:18 time: 0.2513 data_time: 0.0467 memory: 5362 loss: 0.7244 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.7244
2022/12/27 17:08:49 - mmengine - INFO - Epoch(train) [1][350/969] lr: 5.2185e-05 eta: 20:35:15 time: 0.2394 data_time: 0.0772 memory: 5362 loss: 0.4497 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.4497
2022/12/27 17:09:01 - mmengine - INFO - Epoch(train) [1][400/969] lr: 6.8160e-05 eta: 20:31:31 time: 0.2494 data_time: 0.0234 memory: 5362 loss: 0.5124 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.5124
2022/12/27 17:09:13 - mmengine - INFO - Epoch(train) [1][450/969] lr: 8.6266e-05 eta: 20:18:02 time: 0.2299 data_time: 0.0601 memory: 3875 loss: 0.2986 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.2986
2022/12/27 17:09:26 - mmengine - INFO - Epoch(train) [1][500/969] lr: 1.0650e-04 eta: 20:20:20 time: 0.2570 data_time: 0.0642 memory: 4936 loss: 0.2919 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.2919
2022/12/27 17:09:38 - mmengine - INFO - Epoch(train) [1][550/969] lr: 1.2887e-04 eta: 20:17:15 time: 0.2458 data_time: 0.0498 memory: 4228 loss: 0.2368 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.2368
2022/12/27 17:09:51 - mmengine - INFO - Epoch(train) [1][600/969] lr: 1.5336e-04 eta: 20:18:08 time: 0.2544 data_time: 0.0160 memory: 5362 loss: 0.2318 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.2318
2022/12/27 17:10:03 - mmengine - INFO - Epoch(train) [1][650/969] lr: 1.7999e-04 eta: 20:14:28 time: 0.2427 data_time: 0.0518 memory: 4552 loss: 0.1511 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.1511
2022/12/27 17:10:15 - mmengine - INFO - Epoch(train) [1][700/969] lr: 2.0874e-04 eta: 20:14:07 time: 0.2508 data_time: 0.0011 memory: 5362 loss: 0.1640 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.1640
2022/12/27 17:10:29 - mmengine - INFO - Epoch(train) [1][750/969] lr: 2.3963e-04 eta: 20:18:31 time: 0.2655 data_time: 0.0669 memory: 5362 loss: 0.1068 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.1068
2022/12/27 17:10:40 - mmengine - INFO - Epoch(train) [1][800/969] lr: 2.7264e-04 eta: 20:12:52 time: 0.2341 data_time: 0.0551 memory: 4936 loss: 0.0842 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.0842
2022/12/27 17:10:53 - mmengine - INFO - Epoch(train) [1][850/969] lr: 3.0779e-04 eta: 20:10:42 time: 0.2442 data_time: 0.0178 memory: 5362 loss: 0.0912 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.0912
2022/12/27 17:11:05 - mmengine - INFO - Epoch(train) [1][900/969] lr: 3.4506e-04 eta: 20:09:06 time: 0.2454 data_time: 0.0643 memory: 4936 loss: 0.0606 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.0606
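
In the log above, loss_cls and loss_bbox are 0.0000 in every entry while loss_obj keeps decreasing. That would be consistent with no ground-truth boxes reaching the head, although the log alone does not prove it. The following is a minimal sketch for checking a longer log for the same pattern. The names `LOSS_PATTERN`, `parse_losses`, and `all_zero` are hypothetical helpers, not part of MMEngine, and the regular expression assumes the field order shown in this log.

```python
import re

# Assumes each entry has the form
# "... [epoch][iter/total] ... loss_cls: X loss_bbox: Y loss_obj: Z"
LOSS_PATTERN = re.compile(
    r'\[(?P<epoch>\d+)\]\[(?P<iter>\d+)/\d+\].*?'
    r'loss_cls:\s+(?P<loss_cls>[\d.]+)\s+'
    r'loss_bbox:\s+(?P<loss_bbox>[\d.]+)\s+'
    r'loss_obj:\s+(?P<loss_obj>[\d.]+)')


def parse_losses(log_text):
    """Return one dict per training entry found in log_text."""
    entries = []
    for match in LOSS_PATTERN.finditer(log_text):
        entries.append({
            'epoch': int(match['epoch']),
            'iter': int(match['iter']),
            'loss_cls': float(match['loss_cls']),
            'loss_bbox': float(match['loss_bbox']),
            'loss_obj': float(match['loss_obj']),
        })
    return entries


def all_zero(entries, key):
    """True when at least one entry exists and all report 0 for `key`."""
    return bool(entries) and all(e[key] == 0.0 for e in entries)


# Two entries copied from the log in this issue.
sample_log = (
    '2022/12/27 17:08:37 - mmengine - INFO - Epoch(train) [1][300/969] '
    'lr: 3.8340e-05 eta: 20:48:18 time: 0.2513 data_time: 0.0467 '
    'memory: 5362 loss: 0.7244 loss_cls: 0.0000 loss_bbox: 0.0000 '
    'loss_obj: 0.7244\n'
    '2022/12/27 17:08:49 - mmengine - INFO - Epoch(train) [1][350/969] '
    'lr: 5.2185e-05 eta: 20:35:15 time: 0.2394 data_time: 0.0772 '
    'memory: 5362 loss: 0.4497 loss_cls: 0.0000 loss_bbox: 0.0000 '
    'loss_obj: 0.4497\n')

entries = parse_losses(sample_log)
for key in ('loss_cls', 'loss_bbox', 'loss_obj'):
    print(key, 'always zero:', all_zero(entries, key))
```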

Environment

I installed the environment following the mmyolo installation instructions.

Expected results

No response

Additional information

I only changed num_classes to 4 and updated the paths to my dataset. I also checked my dataset with the browse_dataset script and found no mistakes.

When I add metainfo = dict(CLASSES=('a', 'b',)) to my yolox_s_8xb8-300e_coco.py, nothing changes. When I run the training code in mmdetection, the loss is normal, so I suspect there may be a mistake in the mmyolo code.
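
One way to cross-check the dataset independently of the visualization is to list the category names stored in the annotation file and compare them with the class names the dataset is configured with. If the dataset selects annotations by class name, as COCO-style loaders commonly do, a mismatch would leave images without ground-truth boxes. The sketch below assumes a COCO-format file with a top-level `categories` list. `category_names` and `compare_classes` are hypothetical helpers, and the in-memory dict stands in for the real annotation file.

```python
import json


def category_names(coco_dict):
    """Return category names from a COCO-format dict, ordered by category id."""
    cats = sorted(coco_dict.get('categories', []), key=lambda c: c['id'])
    return tuple(c['name'] for c in cats)


def compare_classes(coco_dict, configured_classes):
    """Report class names that appear on only one side."""
    in_file = set(category_names(coco_dict))
    in_cfg = set(configured_classes)
    return {
        'only_in_annotations': sorted(in_file - in_cfg),
        'only_in_config': sorted(in_cfg - in_file),
    }


# Stand-in for the annotation file; for a real check, load the file that the
# config names as ann_file, e.g. with json.load().
example_annotations = json.loads("""
{
  "categories": [
    {"id": 1, "name": "holothurian"},
    {"id": 2, "name": "echinus"},
    {"id": 3, "name": "scallop"},
    {"id": 4, "name": "starfish"}
  ]
}
""")

# Two of the default COCO class names, used here as the "configured" classes.
report = compare_classes(example_annotations, ('person', 'bicycle'))
print(category_names(example_annotations))
print(report)
```

An empty report on both sides would suggest that the configured names and the annotation file agree.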

@yangxiaoyany Hi, I see that you did not add metainfo to the above configuration. You need to provide the correct configuration.

num_classes = 4
metainfo = dict(  # Set metainfo according to the class information in class_with_id.txt
    # CLASSES=('cat',),
    # PALETTE=[(220, 20, 60)]  # Color used when drawing; any value works
    CLASSES=('holothurian', 'echinus', 'scallop', 'starfish'),
    PALETTE=[(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)]
)

Hello, this is my metainfo.
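
A top-level `metainfo` variable in a config file is only a variable. It takes effect once it is passed to the dataset through the `metainfo` argument of each dataloader's `dataset` dict. The fragment below is a minimal sketch of that wiring, assuming it sits in a config that inherits the base YOLOX config so that the partial dicts are merged into the inherited ones. The lowercase keys are an assumption: key case has differed between MMDetection 3.x releases, so check which form the installed version reads.

```python
# Class names and colors taken from the metainfo posted in this thread.
class_names = ('holothurian', 'echinus', 'scallop', 'starfish')

# Assumption: the installed MMDetection reads lowercase 'classes'/'palette'.
# Some 3.x pre-releases read uppercase 'CLASSES'/'PALETTE' instead.
metainfo = dict(
    classes=class_names,
    palette=[(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)])

num_classes = len(class_names)

# Keep the head in sync with the number of classes.
model = dict(
    bbox_head=dict(head_module=dict(num_classes=num_classes)))

# Pass metainfo to every dataset that is built from this config.
train_dataloader = dict(dataset=dict(metainfo=metainfo))
val_dataloader = dict(dataset=dict(metainfo=metainfo))
test_dataloader = val_dataloader
```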

Hello, after debugging for a long time, I found that adding metainfo to the config file has no effect, because the code always uses the 80 COCO class names as its categories, and the YOLOX config file contains no metainfo. My config file is the same as the code above. Thank you very much for your reply.

python tools/analysis_tools/browse_dataset.py configs/custom/my_right_yolox_s.py --phase train
/home/yxy/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmengine/model/utils.py:138: UserWarning: Cannot import torch.fx, merge_dict is a simple function to merge multiple dicts
  warnings.warn('Cannot import torch.fx, merge_dict is a simple function '
/home/yxy/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmdet/evaluation/metrics/lvis_metric.py:23: UserWarning: mmlvis is deprecated, please install official lvis-api by "pip install git+https://github.com/lvis-dataset/lvis-api.git"
  UserWarning)
loading annotations into memory...
Done (t=0.37s)
creating index...
index created!
/home/yxy/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmengine/visualization/visualizer.py:170: UserWarning: Visualizer backend is not initialized because save_dir is None.
  warnings.warn('Visualizer backend is not initialized '
{'classes': ('person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush'), 'palette': [(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230), (106, 0, 228), (0, 60, 100), (0, 80, 100), (0, 0, 70), (0, 0, 192), (250, 170, 30), (100, 170, 30), (220, 220, 0), (175, 116, 175), (250, 0, 30), (165, 42, 42), (255, 77, 255), (0, 226, 252), (182, 182, 255), (0, 82, 0), (120, 166, 157), (110, 76, 0), (174, 57, 255), (199, 100, 0), (72, 0, 118), (255, 179, 240), (0, 125, 92), (209, 0, 151), (188, 208, 182), (0, 220, 176), (255, 99, 164), (92, 0, 73), (133, 129, 255), (78, 180, 255), (0, 228, 0), (174, 255, 243), (45, 89, 255), (134, 134, 103), (145, 148, 174), (255, 208, 186), (197, 226, 255), (171, 134, 1), (109, 63, 54), (207, 138, 255), (151, 0, 95), (9, 80, 61), (84, 105, 51), (74, 65, 105), (166, 196, 102), (208, 195, 210), (255, 109, 65), (0, 143, 149), (179, 0, 194), (209, 99, 106), (5, 121, 0), (227, 255, 205), (147, 186, 208), (153, 69, 1), (3, 95, 161), (163, 255, 0), (119, 0, 170), (0, 182, 199), (0, 165, 120), (183, 130, 88), (95, 32, 0), (130, 114, 135), (110, 129, 133), (166, 74, 118), (219, 142, 185), (79, 210, 114), (178, 90, 62), (65, 70, 15), (127, 167, 115), (59, 105, 106), (142, 108, 45), (196, 172, 0), (95, 54, 80), (128, 76, 255), (201, 57, 1), (246, 0, 122), (191, 162, 208)], 'CLASSES': ('holothurian', 'echinus', 'scallop', 'starfish'), 'PALETTE': [(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)]}
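
The printed metainfo holds the 80 COCO names under `classes` and the four custom names under `CLASSES`. That is the pattern a plain dict merge of built-in defaults with user-supplied entries would produce when the key case differs: the custom entry is added next to the default instead of replacing it. The sketch below reproduces this with plain dicts. `DEFAULT_METAINFO` and `merge_metainfo` are hypothetical stand-ins, and the assumption that the dataset merges metainfo like `dict.update` is mine, not confirmed by this thread.

```python
import copy

# Hypothetical stand-in for a dataset class's built-in defaults (shortened).
DEFAULT_METAINFO = {
    'classes': ('person', 'bicycle', 'car'),
    'palette': [(220, 20, 60), (119, 11, 32), (0, 0, 142)],
}


def merge_metainfo(defaults, user_metainfo):
    """Copy the defaults, then let user entries overwrite matching keys."""
    merged = copy.deepcopy(defaults)
    merged.update(user_metainfo or {})
    return merged


custom = ('holothurian', 'echinus', 'scallop', 'starfish')

# Uppercase key: no default key matches, so the defaults survive untouched.
upper = merge_metainfo(DEFAULT_METAINFO, {'CLASSES': custom})

# Lowercase key: matches the default key and replaces it.
lower = merge_metainfo(DEFAULT_METAINFO, {'classes': custom})

print('uppercase key ->', upper['classes'])
print('lowercase key ->', lower['classes'])
```

Under this assumption, any code that reads `metainfo['classes']` would keep seeing the default names until the custom names are supplied under the same key.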

my browse_dataset.py

# Copyright (c) OpenMMLab. All rights reserved.
import argparse
import os.path as osp
import sys
from typing import Tuple

import cv2
import mmcv
import numpy as np
from mmdet.models.utils import mask2ndarray
from mmdet.structures.bbox import BaseBoxes
from mmengine.config import Config, DictAction
from mmengine.dataset import Compose
from mmengine.utils import ProgressBar
from mmengine.visualization import Visualizer

from mmyolo.registry import DATASETS, VISUALIZERS
from mmyolo.utils import register_all_modules


# TODO: Support for printing the change in key of results
def parse_args():
    parser = argparse.ArgumentParser(description='Browse a dataset')
    parser.add_argument('config', help='train config file path')
    parser.add_argument(
        '--phase',
        '-p',
        default='train',
        type=str,
        choices=['train', 'test', 'val'],
        help='phase of dataset to visualize, accept "train" "test" and "val".'
        ' Defaults to "train".')
    parser.add_argument(
        '--mode',
        '-m',
        default='transformed',
        type=str,
        choices=['original', 'transformed', 'pipeline'],
        help='display mode; display original pictures or '
        'transformed pictures or comparison pictures. "original" '
        'means show images load from disk; "transformed" means '
        'to show images after transformed; "pipeline" means show all '
        'the intermediate images. Defaults to "transformed".')
    parser.add_argument(
        '--output-dir',
        default=None,
        type=str,
        help='If there is no display interface, you can save it.')
    parser.add_argument('--not-show', default=False, action='store_true')
    parser.add_argument(
        '--show-number',
        '-n',
        type=int,
        default=sys.maxsize,
        help='number of images selected to visualize, '
        'must bigger than 0. if the number is bigger than length '
        'of dataset, show all the images in dataset; '
        'default "sys.maxsize", show all images in dataset')
    parser.add_argument(
        '--show-interval',
        '-i',
        type=float,
        default=3,
        help='the interval of show (s)')
    parser.add_argument(
        '--cfg-options',
        nargs='+',
        action=DictAction,
        help='override some settings in the used config, the key-value pair '
        'in xxx=yyy format will be merged into config file. If the value to '
        'be overwritten is a list, it should be like key="[a,b]" or key=a,b '
        'It also allows nested list/tuple values, e.g. key="[(a,b),(c,d)]" '
        'Note that the quotation marks are necessary and that no white space '
        'is allowed.')
    args = parser.parse_args()
    return args

def _get_adaptive_scale(img_shape: Tuple[int, int],
                        min_scale: float = 0.3,
                        max_scale: float = 3.0) -> float:
    """Get adaptive scale according to image shape.

    The target scale depends on the short edge length of the image. If the
    short edge length equals 224, the output is 1.0. The output scales
    linearly with the short edge length. You can also specify the minimum
    scale and the maximum scale to limit the linear scale.

    Args:
        img_shape (Tuple[int, int]): The shape of the canvas image.
        min_scale (float): The minimum scale. Defaults to 0.3.
        max_scale (float): The maximum scale. Defaults to 3.0.
    Returns:
        float: The adaptive scale.
    """
    short_edge_length = min(img_shape)
    scale = short_edge_length / 224.
    return min(max(scale, min_scale), max_scale)

def make_grid(imgs, names):
    """Concat list of pictures into a single big picture, align height here."""
    visualizer = Visualizer.get_current_instance()
    ori_shapes = [img.shape[:2] for img in imgs]
    max_height = int(max(img.shape[0] for img in imgs) * 1.1)
    min_width = min(img.shape[1] for img in imgs)
    horizontal_gap = min_width // 10
    img_scale = _get_adaptive_scale((max_height, min_width))

    texts = []
    text_positions = []
    start_x = 0
    for i, img in enumerate(imgs):
        pad_height = (max_height - img.shape[0]) // 2
        pad_width = horizontal_gap // 2
        # make border
        imgs[i] = cv2.copyMakeBorder(
            img,
            pad_height,
            max_height - img.shape[0] - pad_height + int(img_scale * 30 * 2),
            pad_width,
            pad_width,
            cv2.BORDER_CONSTANT,
            value=(255, 255, 255))
        texts.append(f'{"execution: "}{i}\n{names[i]}\n{ori_shapes[i]}')
        text_positions.append(
            [start_x + img.shape[1] // 2 + pad_width, max_height])
        start_x += img.shape[1] + horizontal_gap

    display_img = np.concatenate(imgs, axis=1)
    visualizer.set_image(display_img)
    img_scale = _get_adaptive_scale(display_img.shape[:2])
    visualizer.draw_texts(
        texts,
        positions=np.array(text_positions),
        font_sizes=img_scale * 7,
        colors='black',
        horizontal_alignments='center',
        font_families='monospace')
    return visualizer.get_image()

class InspectCompose(Compose):
    """Compose multiple transforms sequentially.

    And record "img" field of all results in one list.
    """

    def __init__(self, transforms, intermediate_imgs):
        super().__init__(transforms=transforms)
        self.intermediate_imgs = intermediate_imgs

    def __call__(self, data):
        if 'img' in data:
            self.intermediate_imgs.append({
                'name': 'original',
                'img': data['img'].copy()
            })
        self.ptransforms = [
            self.transforms[i] for i in range(len(self.transforms) - 1)
        ]
        for t in self.ptransforms:
            data = t(data)
            # Keep the same meta_keys in the PackDetInputs
            self.transforms[-1].meta_keys = [key for key in data]
            data_sample = self.transforms[-1](data)
            if data is None:
                return None
            if 'img' in data:
                self.intermediate_imgs.append({
                    'name':
                    t.__class__.__name__,
                    'dataset_sample':
                    data_sample['data_samples']
                })
        return data

def main():
    args = parse_args()
    cfg = Config.fromfile(args.config)
    if args.cfg_options is not None:
        cfg.merge_from_dict(args.cfg_options)

    # register all modules in mmyolo into the registries
    register_all_modules()

    dataset_cfg = cfg.get(args.phase + '_dataloader').get('dataset')
    dataset = DATASETS.build(dataset_cfg)
    visualizer = VISUALIZERS.build(cfg.visualizer)
    visualizer.dataset_meta = dataset.metainfo
    print(visualizer.dataset_meta)

    intermediate_imgs = []
    # print(dataset)
    # print(aaaa)
    # TODO: The dataset wrapper occasion is not considered here
    dataset.pipeline = InspectCompose(dataset.pipeline.transforms,
                                      intermediate_imgs)

    # init visualization image number
    assert args.show_number > 0
    display_number = min(args.show_number, len(dataset))

    progress_bar = ProgressBar(display_number)
    for i, item in zip(range(display_number), dataset):
        image_i = []
        result_i = [result['dataset_sample'] for result in intermediate_imgs]
        for k, datasample in enumerate(result_i):
            image = datasample.img
            gt_instances = datasample.gt_instances
            image = image[..., [2, 1, 0]]  # bgr to rgb
            gt_bboxes = gt_instances.get('bboxes', None)
            if gt_bboxes is not None and isinstance(gt_bboxes, BaseBoxes):
                gt_instances.bboxes = gt_bboxes.tensor
            gt_masks = gt_instances.get('masks', None)
            if gt_masks is not None:
                masks = mask2ndarray(gt_masks)
                gt_instances.masks = masks.astype(np.bool)
                datasample.gt_instances = gt_instances
            # get filename from dataset or just use index as filename
            visualizer.add_datasample(
                'result',
                image,
                datasample,
                draw_pred=False,
                draw_gt=True,
                show=False)
            image_show = visualizer.get_image()
            image_i.append(image_show)

        if args.mode == 'original':
            image = image_i[0]
        elif args.mode == 'transformed':
            image = image_i[-1]
        else:
            image = make_grid([result for result in image_i],
                              [result['name'] for result in intermediate_imgs])

        if hasattr(datasample, 'img_path'):
            filename = osp.basename(datasample.img_path)
        else:
            # some dataset have not image path
            filename = f'{i}.jpg'
        out_file = osp.join(args.output_dir,
                            filename) if args.output_dir is not None else None

        if out_file is not None:
            mmcv.imwrite(image[..., ::-1], out_file)

        if not args.not_show:
            visualizer.show(
                image, win_name=filename, wait_time=args.show_interval)

        intermediate_imgs.clear()
        progress_bar.update()

if __name__ == '__main__':
    main()


Hi @yangxiaoyany, please paste the latest config generated under './work_dirs/yolox_s_8xb8-300e_coco' so we can take a look.


Hello, thanks for your reply. This is the newest config file in './work_dirs/yolox_s_8xb8-300e_coco'.


default_scope = 'mmyolo'
default_hooks = dict(
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=10),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(
        type='CheckpointHook', interval=2, max_keep_ckpts=5, save_best='auto'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    visualization=dict(type='mmdet.DetVisualizationHook'))
env_cfg = dict(
    cudnn_benchmark=False,
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),
    dist_cfg=dict(backend='nccl'))
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
    type='mmdet.DetLocalVisualizer',
    vis_backends=[dict(type='LocalVisBackend')],
    name='visualizer')
log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)
log_level = 'INFO'
load_from = '/home/yxy/下载/yolox_s_8xb8-300e_coco_20220917_030738-d7e60cb2.pth'
resume = False
file_client_args = dict(backend='disk')
data_root = '/home/yxy/mmdetection/data/coco/'
dataset_type = 'YOLOv5CocoDataset'
img_scale = (640, 640)
deepen_factor = 0.33
widen_factor = 0.5
save_epoch_intervals = 2
train_batch_size_per_gpu = 16
train_num_workers = 4
val_batch_size_per_gpu = 1
val_num_workers = 2
max_epochs = 300
num_last_epochs = 15
model = dict(
    type='YOLODetector',
    init_cfg=dict(
        type='Kaiming',
        layer='Conv2d',
        a=2.23606797749979,
        distribution='uniform',
        mode='fan_in',
        nonlinearity='leaky_relu'),
    use_syncbn=False,
    data_preprocessor=dict(
        type='mmdet.DetDataPreprocessor',
        pad_size_divisor=32,
        batch_augments=[
            dict(
                type='mmdet.BatchSyncRandomResize',
                random_size_range=(480, 800),
                size_divisor=32,
                interval=10)
        ]),
    backbone=dict(
        type='YOLOXCSPDarknet',
        deepen_factor=0.33,
        widen_factor=0.5,
        out_indices=(2, 3, 4),
        spp_kernal_sizes=(5, 9, 13),
        norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),
        act_cfg=dict(type='SiLU', inplace=True)),
    neck=dict(
        type='YOLOXPAFPN',
        deepen_factor=0.33,
        widen_factor=0.5,
        in_channels=[256, 512, 1024],
        out_channels=256,
        norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),
        act_cfg=dict(type='SiLU', inplace=True)),
    bbox_head=dict(
        type='YOLOXHead',
        head_module=dict(
            type='YOLOXHeadModule',
            num_classes=4,
            in_channels=256,
            feat_channels=256,
            widen_factor=0.5,
            stacked_convs=2,
            featmap_strides=(8, 16, 32),
            use_depthwise=False,
            norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),
            act_cfg=dict(type='SiLU', inplace=True)),
        loss_cls=dict(
            type='mmdet.CrossEntropyLoss',
            use_sigmoid=True,
            reduction='sum',
            loss_weight=0.07500000000000001),
        loss_bbox=dict(
            type='mmdet.IoULoss',
            mode='square',
            eps=1e-16,
            reduction='sum',
            loss_weight=5.0),
        loss_obj=dict(
            type='mmdet.CrossEntropyLoss',
            use_sigmoid=True,
            reduction='sum',
            loss_weight=1.0),
        loss_bbox_aux=dict(
            type='mmdet.L1Loss', reduction='sum', loss_weight=1.0)),
    train_cfg=dict(
        assigner=dict(
            type='mmdet.SimOTAAssigner',
            center_radius=2.5,
            iou_calculator=dict(type='mmdet.BboxOverlaps2D'))),
    test_cfg=dict(
        yolox_style=True,
        multi_label=True,
        score_thr=0.001,
        max_per_img=300,
        nms=dict(type='nms', iou_threshold=0.65)))
pre_transform = [
    dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
    dict(type='LoadAnnotations', with_bbox=True)
]
train_pipeline_stage1 = [
    dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Mosaic',
        img_scale=(320, 320),
        pad_val=114.0,
        pre_transform=[
            dict(
                type='LoadImageFromFile',
                file_client_args=dict(backend='disk')),
            dict(type='LoadAnnotations', with_bbox=True)
        ]),
    dict(
        type='mmdet.RandomAffine',
        scaling_ratio_range=(0.1, 2),
        border=(-160, -160)),
    dict(
        type='YOLOXMixUp',
        img_scale=(320, 320),
        ratio_range=(0.8, 1.6),
        pad_val=114.0,
        pre_transform=[
            dict(
                type='LoadImageFromFile',
                file_client_args=dict(backend='disk')),
            dict(type='LoadAnnotations', with_bbox=True)
        ]),
    dict(type='mmdet.YOLOXHSVRandomAug'),
    dict(type='mmdet.RandomFlip', prob=0.5),
    dict(
        type='mmdet.FilterAnnotations',
        min_gt_bbox_wh=(1, 1),
        keep_empty=False),
    dict(
        type='mmdet.PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',
                   'flip_direction'))
]
train_pipeline_stage2 = [
    dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='mmdet.Resize', scale=(320, 320), keep_ratio=True),
    dict(
        type='mmdet.Pad',
        pad_to_square=True,
        pad_val=dict(img=(114.0, 114.0, 114.0))),
    dict(type='mmdet.YOLOXHSVRandomAug'),
    dict(type='mmdet.RandomFlip', prob=0.5),
    dict(
        type='mmdet.FilterAnnotations',
        min_gt_bbox_wh=(1, 1),
        keep_empty=False),
    dict(type='mmdet.PackDetInputs')
]
train_dataloader = dict(
    batch_size=16,
    num_workers=4,
    persistent_workers=True,
    pin_memory=False,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type='YOLOv5CocoDataset',
        data_root='/home/yxy/mmdetection/data/coco/',
        ann_file=
        '/home/yxy/mmdetection/data/coco/annotations/instances_train2017.json',
        data_prefix=dict(img='train2017/'),
        filter_cfg=dict(filter_empty_gt=False, min_size=32),
        pipeline=[
            dict(
                type='LoadImageFromFile',
                file_client_args=dict(backend='disk')),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(
                type='Mosaic',
                img_scale=(320, 320),
                pad_val=114.0,
                pre_transform=[
                    dict(
                        type='LoadImageFromFile',
                        file_client_args=dict(backend='disk')),
                    dict(type='LoadAnnotations', with_bbox=True)
                ]),
            dict(
                type='mmdet.RandomAffine',
                scaling_ratio_range=(0.1, 2),
                border=(-160, -160)),
            dict(
                type='YOLOXMixUp',
                img_scale=(320, 320),
                ratio_range=(0.8, 1.6),
                pad_val=114.0,
                pre_transform=[
                    dict(
                        type='LoadImageFromFile',
                        file_client_args=dict(backend='disk')),
                    dict(type='LoadAnnotations', with_bbox=True)
                ]),
            dict(type='mmdet.YOLOXHSVRandomAug'),
            dict(type='mmdet.RandomFlip', prob=0.5),
            dict(
                type='mmdet.FilterAnnotations',
                min_gt_bbox_wh=(1, 1),
                keep_empty=False),
            dict(
                type='mmdet.PackDetInputs',
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                           'flip', 'flip_direction'))
        ]))
test_pipeline = [
    dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
    dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
    dict(
        type='mmdet.Pad',
        pad_to_square=True,
        pad_val=dict(img=(114.0, 114.0, 114.0))),
    dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
    dict(
        type='mmdet.PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor'))
]
val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    pin_memory=False,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='YOLOv5CocoDataset',
        data_root='/home/yxy/mmdetection/data/coco/',
        ann_file=
        '/home/yxy/mmdetection/data/coco/annotations/instances_val2017.json',
        data_prefix=dict(img='train2017/'),
        test_mode=True,
        pipeline=[
            dict(
                type='LoadImageFromFile',
                file_client_args=dict(backend='disk')),
            dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
            dict(
                type='mmdet.Pad',
                pad_to_square=True,
                pad_val=dict(img=(114.0, 114.0, 114.0))),
            dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
            dict(
                type='mmdet.PackDetInputs',
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                           'scale_factor'))
        ],
        metainfo=dict(
            CLASSES=('holothurian', 'echinus', 'scallop', 'starfish'),
            PALETTE=[(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)])))
test_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    pin_memory=False,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='YOLOv5CocoDataset',
        data_root='/home/yxy/mmdetection/data/coco/',
        ann_file=
        '/home/yxy/mmdetection/data/coco/annotations/instances_val2017.json',
        data_prefix=dict(img='train2017/'),
        test_mode=True,
        pipeline=[
            dict(
                type='LoadImageFromFile',
                file_client_args=dict(backend='disk')),
            dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
            dict(
                type='mmdet.Pad',
                pad_to_square=True,
                pad_val=dict(img=(114.0, 114.0, 114.0))),
            dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
            dict(
                type='mmdet.PackDetInputs',
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                           'scale_factor'))
        ],
        metainfo=dict(
            CLASSES=('holothurian', 'echinus', 'scallop', 'starfish'),
            PALETTE=[(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)])))
val_evaluator = dict(
    type='mmdet.CocoMetric',
    proposal_nums=(100, 1, 10),
    ann_file=
    '/home/yxy/mmdetection/data/coco/annotations/instances_val2017.json',
    metric='bbox')
test_evaluator = dict(
    type='mmdet.CocoMetric',
    proposal_nums=(100, 1, 10),
    ann_file=
    '/home/yxy/mmdetection/data/coco/annotations/instances_val2017.json',
    metric='bbox')
base_lr = 0.01
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(
        type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005, nesterov=True),
    paramwise_cfg=dict(norm_decay_mult=0.0, bias_decay_mult=0.0))
param_scheduler = [
    dict(
        type='mmdet.QuadraticWarmupLR',
        by_epoch=True,
        begin=0,
        end=5,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingLR',
        eta_min=0.0005,
        begin=5,
        T_max=285,
        end=285,
        by_epoch=True,
        convert_to_iter_based=True),
    dict(type='ConstantLR', by_epoch=True, factor=1, begin=285, end=300)
]
custom_hooks = [
    dict(
        type='YOLOXModeSwitchHook',
        num_last_epochs=15,
        new_train_pipeline=[
            dict(
                type='LoadImageFromFile',
                file_client_args=dict(backend='disk')),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
            dict(
                type='mmdet.Pad',
                pad_to_square=True,
                pad_val=dict(img=(114.0, 114.0, 114.0))),
            dict(type='mmdet.YOLOXHSVRandomAug'),
            dict(type='mmdet.RandomFlip', prob=0.5),
            dict(
                type='mmdet.FilterAnnotations',
                min_gt_bbox_wh=(1, 1),
                keep_empty=False),
            dict(type='mmdet.PackDetInputs')
        ],
        priority=48),
    dict(type='mmdet.SyncNormHook', priority=48),
    dict(
        type='EMAHook',
        ema_type='ExpMomentumEMA',
        momentum=0.0001,
        update_buffers=True,
        strict_load=False,
        priority=49)
]
train_cfg = dict(
    type='EpochBasedTrainLoop',
    max_epochs=300,
    val_interval=1,
    dynamic_intervals=[(285, 1)])
auto_scale_lr = dict(base_batch_size=64)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
work_dir = './work_dirs/my_right_yolox_s'
num_classes = 4
metainfo = dict(
    CLASSES=('holothurian', 'echinus', 'scallop', 'starfish'),
    PALETTE=[(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)])
launcher = 'none'

Hi @yangxiaoyany Please run python mmyolo/utils/collect_env.py to collect the necessary environment information and paste it here. You may also add anything else that may help locate the problem, such as:

How you installed PyTorch [e.g., pip, conda, source]
Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

The environment is an old one that I used to run mmdetection; I changed the mmdet and mmcv versions to match mmyolo.

sys.platform: linux
Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0,1: NVIDIA GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.2, V10.2.8
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.7.0
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.4 Product Build 20200917 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.3
  - Magma 2.5.2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.8.1
OpenCV: 4.5.1
MMEngine: 0.3.2
MMCV: 2.0.0rc3
MMDetection: 3.0.0rc5
MMYOLO: 0.2.0+

I also found that when I browse the cat dataset following the official instructions, the output shows the person category in the pictures; this may be the reason the loss is not 0 there. But when I browse my own dataset, no gt bboxes are drawn.
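An empty browse result like this is what one would expect if the category names in the annotation file do not match the class names the dataset is configured with. The check can be sketched as follows, assuming a standard COCO-format annotation dict with a 'categories' list. `unmatched_class_names` is a hypothetical helper and not part of MMYOLO or MMDetection, and the in-memory sample stands in for a real annotation file.

```python
import json


def unmatched_class_names(coco_ann, classes):
    """Hypothetical helper: compare configured class names with annotated ones.

    Returns a pair of sorted lists:
    (configured names absent from the annotations,
     annotated names absent from the configuration).
    """
    annotated = {cat['name'] for cat in coco_ann.get('categories', [])}
    configured = set(classes)
    return sorted(configured - annotated), sorted(annotated - configured)


# Tiny in-memory stand-in for an annotation file. A real file would be
# loaded with json.load() from the ann_file path in the config.
sample_ann = json.loads(
    '{"categories": ['
    '{"id": 1, "name": "holothurian"}, {"id": 2, "name": "echinus"}, '
    '{"id": 3, "name": "scallop"}, {"id": 4, "name": "starfish"}]}')

custom_classes = ('holothurian', 'echinus', 'scallop', 'starfish')
# Assumption for illustration: names a dataset might fall back to
# when no metainfo is given.
fallback_classes = ('person', 'bicycle')

print(unmatched_class_names(sample_ann, custom_classes))
print(unmatched_class_names(sample_ann, fallback_classes))
```

If the first list is non-empty, the dataset is asking for classes that the annotation file does not contain, which would leave the images without ground-truth boxes.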

Hi @yangxiaoyany OpenMMLab has recently been updating some details, so please use mmdet3.0.0rc4.

I changed the mmdet version to mmdet3.0.0rc4, but nothing changed.

you haven't set metainfo in train_dataloader...

Thank you very much. That solved the problem. I forgot to add metainfo to the train_dataloader.
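For readers with the same symptom, the fix discussed here can be sketched as a trimmed config fragment. It reuses the class names and the uppercase CLASSES/PALETTE keys from the config dumped earlier in this thread; other MMDetection versions may expect different key names. Unrelated keys are omitted, and `missing_metainfo` is a hypothetical helper added only to illustrate the check.

```python
# Class names and palette copied from the config posted in this thread.
metainfo = dict(
    CLASSES=('holothurian', 'echinus', 'scallop', 'starfish'),
    PALETTE=[(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)])
num_classes = len(metainfo['CLASSES'])

# Trimmed dataloaders: only the keys relevant to the fix are shown.
train_dataloader = dict(
    dataset=dict(
        type='YOLOv5CocoDataset',
        metainfo=metainfo))  # this key was absent from the dumped config
val_dataloader = dict(
    dataset=dict(type='YOLOv5CocoDataset', metainfo=metainfo))
test_dataloader = val_dataloader

cfg = dict(
    train_dataloader=train_dataloader,
    val_dataloader=val_dataloader,
    test_dataloader=test_dataloader)


def missing_metainfo(cfg, loaders=('train_dataloader', 'val_dataloader',
                                   'test_dataloader')):
    """Hypothetical helper: list dataloaders whose dataset lacks metainfo."""
    return [
        name for name in loaders
        if 'metainfo' not in cfg.get(name, {}).get('dataset', {})
    ]


print(missing_metainfo(cfg))
```

Running the same helper on a config shaped like the one dumped above, where only the val and test datasets carry metainfo, would report 'train_dataloader'.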

Okay, Thx for using MMYOLO 😄

I added metainfo to the train_dataloader, but my loss_cls and loss_bbox are still 0.0000.

Hello, when I train yolox with my custom coco dataset, I found the loss of cls and reg is always 0. Can you help me solve it? #405