[Bug] AssertionError: The `num_classes` (1) in SSDHead of MMDataParallel does not matches the length of `CLASSES` 80) in RepeatDataset

IECCLES4 commented 1 year ago

Prerequisite

[X] I have searched Issues and Discussions but cannot get the expected help.
[X] I have read the FAQ documentation but cannot get the expected help.
[X] The bug has not been fixed in the latest version (master) or latest version (3.x).

Task

I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmdetection

Environment

sys.platform: linux
Python: 3.8.13 (default, Oct 21 2022, 23:50:54) [GCC 11.2.0]
CUDA available: True
GPU 0: NVIDIA GeForce GTX 1070
CUDA_HOME: /home/dtl-admin/miniconda3/envs/mmdet
NVCC: Cuda compilation tools, release 11.6, V11.6.124
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 1.11.0
PyTorch compiling details: PyTorch built with:

GCC 7.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.3
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
CuDNN 8.2
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.12.0
OpenCV: 4.6.0
MMCV: 1.5.3
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.3
MMDetection: 2.25.3+2b5504e

Reproduces the problem - code sample

This is the custom config I am trying to train with. I have trained on these datasets before using faster_rcnn and it worked fine, but now that I am trying to train with SSD I keep getting this error.

_base_ = '../ssd/ssd300_coco.py'
model = dict(
    bbox_head=dict(num_classes=1))
dataset_type = 'COCODataset'
classes = ('pantograph',)
data = dict(
    train=dict(
        img_prefix='configs/pantograph/train/',
        classes=classes,
        ann_file='configs/pantograph/train/G-gauge_blackpool-trams_trainsim.json',
        dataset=dict(
            ann_file='configs/pantograph/train/G-gauge_blackpool-trams_trainsim.json',
            img_prefix='configs/pantograph/train')),
    val=dict(
        img_prefix='configs/pantograph/val/',
        classes=classes,
        ann_file='configs/pantograph/val/PantoVal.json'),
    test=dict(
        img_prefix='configs/pantograph/test/',
        classes=classes,
        ann_file='configs/pantograph/test/result.json'))

Reproduces the problem - command or script

python tools/train.py configs/pantograph/ssd_pantograph.py

Reproduces the problem - error message

Traceback (most recent call last):
  File "tools/train.py", line 244, in <module>
    main()
  File "tools/train.py", line 233, in main
    train_detector(
  File "/home/dtl-admin/dev/repos/bitbucket/railsight/mmdetection-2.25.3/mmdet/apis/train.py", line 244, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/dtl-admin/miniconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 130, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/dtl-admin/miniconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 45, in train
    self.call_hook('before_train_epoch')
  File "/home/dtl-admin/miniconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/dtl-admin/dev/repos/bitbucket/railsight/mmdetection-2.25.3/mmdet/datasets/utils.py", line 158, in before_train_epoch
    self._check_head(runner)
  File "/home/dtl-admin/dev/repos/bitbucket/railsight/mmdetection-2.25.3/mmdet/datasets/utils.py", line 144, in _check_head
    assert module.num_classes == len(dataset.CLASSES), \
AssertionError: The `num_classes` (1) in SSDHead of MMDataParallel does not matches the length of `CLASSES` 80) in RepeatDataset

Additional information

No response

hhaAndroid commented 1 year ago

@IECCLES4

_base_ = '../ssd/ssd300_coco.py'
model = dict(bbox_head=dict(num_classes=1))
dataset_type = 'COCODataset'

classes = ('pantograph',)
data = dict(
    train=dict(
        dataset=dict(
            classes=classes,  # NOTE: set classes on the inner dataset wrapped by RepeatDataset
            ann_file='configs/pantograph/train/G-gauge_blackpool-trams_trainsim.json',
            img_prefix='configs/pantograph/train')),

    val=dict(
        img_prefix='configs/pantograph/val/',
        classes=classes,
        ann_file='configs/pantograph/val/PantoVal.json'),

    test=dict(
        img_prefix='configs/pantograph/test/',
        classes=classes,
        ann_file='configs/pantograph/test/result.json'))
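
To illustrate why the position of `classes` matters, here is a minimal plain-Python sketch. It does not use mmdet or mmcv; `merge`, `effective_classes`, and the placeholder `COCO_CLASSES` tuple are hypothetical stand-ins. It assumes, as the error message suggests, that the SSD base config wraps the training set in a RepeatDataset, that keys placed on the wrapper are not forwarded to the wrapped dataset, and that the wrapped dataset falls back to the 80 COCO classes when it has no `classes` of its own.

```python
# Simplified stand-ins for illustration only; this is NOT mmdet/mmcv code.
COCO_CLASSES = tuple(f"coco_class_{i}" for i in range(80))  # placeholder names


def merge(base, override):
    """Recursively merge `override` into `base`, roughly how a child config extends _base_."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def effective_classes(train_cfg):
    """Return the classes seen by the innermost (wrapped) dataset config."""
    inner = train_cfg
    while isinstance(inner.get("dataset"), dict):
        inner = inner["dataset"]
    # Assumption: a dataset with no `classes` key falls back to the COCO defaults.
    return inner.get("classes", COCO_CLASSES)


# Assumed shape of the training entry in the SSD base config.
base_train = {"type": "RepeatDataset", "times": 5, "dataset": {"type": "CocoDataset"}}
classes = ("pantograph",)

wrong = merge(base_train, {"classes": classes})               # classes on the wrapper
right = merge(base_train, {"dataset": {"classes": classes}})  # classes on the inner dataset

print(len(effective_classes(wrong)))  # prints 80
print(len(effective_classes(right)))  # prints 1
```

With `classes` on the wrapper, the inner dataset still reports 80 classes, which matches the mismatch in the assertion above; moving it into the nested `dataset` dict brings the count down to 1.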

IECCLES4 commented 1 year ago

Thank you, but I fixed this issue a while ago and just forgot to close it. Thanks anyway.

LuYang-2023 commented 1 year ago

I had a similar problem. When I checked, I found that there was no metainfo in the val_dataloader.
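
For readers on the 3.x branch, a rough sketch of what setting `metainfo` on every dataloader could look like. This is an assumption-laden fragment, not a tested config: it assumes the 3.x SSD base config still wraps the training set in a RepeatDataset (hence the nested `dataset`), and it leaves out annotation files and image prefixes, which would still need to point at your own data.

```python
# Sketch of a 3.x-style override; the nesting under train_dataloader is assumed.
_base_ = '../ssd/ssd300_coco.py'

metainfo = dict(classes=('pantograph',))

model = dict(bbox_head=dict(num_classes=1))

train_dataloader = dict(
    dataset=dict(  # outer wrapper (assumed to be RepeatDataset in the base config)
        dataset=dict(metainfo=metainfo)))  # inner dataset carries the class names
val_dataloader = dict(dataset=dict(metainfo=metainfo))
test_dataloader = dict(dataset=dict(metainfo=metainfo))
```

The point is that the same `metainfo` goes into train, val, and test, so the number of class names matches `num_classes` everywhere.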
