open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

TypeError: expected sequence object with len >= 0 or a single integer #2737

Closed: deepaksinghcv closed this issue 4 years ago

deepaksinghcv commented 4 years ago

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. The bug has not been fixed in the latest version.

Describe the bug

I'm trying to train MS-RCNN on COCO with the latest version. Single-machine, multi-GPU training runs successfully, but validation fails with the error in the title.

Reproduction

  1. What command or script did you run?
    python tools/train.py configs/ms_rcnn/ms_rcnn_x101_32x4d_fpn_1x_coco.py --gpus=4 --work-dir /ssd_scratch/cvit/dksingh/mmdetection_logs/ms_rcnn_x101_32x4d_fpn_1x_coco/ --resume-from=/ssd_scratch/cvit/dksingh/mmdetection_logs/ms_rcnn_x101_32x4d_fpn_1x_coco/latest.pth
  2. Did you make any modifications on the code or config? Did you understand what you have modified? No
  3. What dataset did you use? COCO 2017 dataset

Environment

2020-05-16 09:34:55,031 - mmdet - INFO - Environment info:

------------------------------------------------------------
sys.platform: linux
Python: 3.7.7 (default, May  7 2020, 21:25:33) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.2, V10.2.89
GPU 0,1,2,3: GeForce GTX 1080 Ti
GCC: gcc (Ubuntu 5.5.0-12ubuntu1~16.04) 5.5.0 20171010
PyTorch: 1.5.0
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.1 Product Build 20200208 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.2
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.5
  - Magma 2.5.2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.6.0a0+82fd1c8
OpenCV: 4.2.0
MMCV: 0.5.1
MMDetection: 2.0.0+c802b17
MMDetection Compiler: GCC 5.5
MMDetection CUDA Compiler: 10.2

  1. You may add any additional information that may help locate the problem, such as
    • I installed everything by following the steps in install.md.

Error traceback

  File "tools/train.py", line 159, in <module>
    main()
  File "tools/train.py", line 155, in main
    meta=meta)
  File "/home/dksingh/sandbox/mmdetection/mmdet/apis/train.py", line 165, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/runner.py", line 383, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/runner.py", line 292, in train
    self.call_hook('after_train_epoch')
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/runner.py", line 245, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/dksingh/sandbox/mmdetection/mmdet/core/evaluation/eval_hooks.py", line 27, in after_train_epoch
    results = single_gpu_test(runner.model, self.dataloader, show=False)
  File "/home/dksingh/sandbox/mmdetection/mmdet/apis/test.py", line 48, in single_gpu_test
    result = model(return_loss=False, rescale=True, **data)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 156, in forward
    return self.gather(outputs, self.output_device)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 168, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
    res = gather_map(outputs)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
TypeError: expected sequence object with len >= 0 or a single integer

Extra info logged when training starts:

2020-05-16 09:34:55,031 - mmdet - INFO - Distributed training: False
2020-05-16 09:34:55,032 - mmdet - INFO - Config:
model=dict(
    type='MaskScoringRCNN',
    pretrained='open-mmlab://resnext101_32x4d',
    backbone=dict(
        type='ResNeXt',
        depth=101,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(
            type='BN',
            requires_grad=True),
        norm_eval=True,
        style='pytorch',
        groups=32,
        base_width=4),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss',
            use_sigmoid=True,
            loss_weight=1.0),
        loss_bbox=dict(
            type='L1Loss',
            loss_weight=1.0)),
    roi_head=dict(
        type='MaskScoringRoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(
                type='RoIAlign',
                out_size=7,
                sample_num=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=80,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0.0, 0.0, 0.0, 0.0],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss',
                use_sigmoid=False,
                loss_weight=1.0),
            loss_bbox=dict(
                type='L1Loss',
                loss_weight=1.0)),
        mask_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(
                type='RoIAlign',
                out_size=14,
                sample_num=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        mask_head=dict(
            type='FCNMaskHead',
            num_convs=4,
            in_channels=256,
            conv_out_channels=256,
            num_classes=80,
            loss_mask=dict(
                type='CrossEntropyLoss',
                use_mask=True,
                loss_weight=1.0)),
        mask_iou_head=dict(
            type='MaskIoUHead',
            num_convs=4,
            num_fcs=2,
            roi_feat_size=14,
            in_channels=256,
            conv_out_channels=256,
            fc_out_channels=1024,
            num_classes=80)))
train_cfg=dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            match_low_quality=True,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    rpn_proposal=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.5,
            min_pos_iou=0.5,
            match_low_quality=True,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=512,
            pos_fraction=0.25,
            neg_pos_ub=-1,
            add_gt_as_proposals=True),
        mask_size=28,
        pos_weight=-1,
        debug=False,
        mask_thr_binary=0.5))
test_cfg=dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.05,
        nms=dict(
            type='nms',
            iou_thr=0.5),
        max_per_img=100,
        mask_thr_binary=0.5))
dataset_type='CocoDataset'
data_root='data/coco/'
img_norm_cfg=dict(
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    to_rgb=True)
train_pipeline=[
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations',
        with_bbox=True,
        with_mask=True),
    dict(type='Resize',
        img_scale=(1333, 800),
        keep_ratio=True),
    dict(type='RandomFlip',
        flip_ratio=0.5),
    dict(type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad',
        size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect',
        keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])]
test_pipeline=[
    dict(type='LoadImageFromFile'),
    dict(type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize',
                keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad',
                size_divisor=32),
            dict(type='ImageToTensor',
                keys=['img']),
            dict(type='Collect',
                keys=['img'])])]
data=dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='CocoDataset',
        ann_file='data/coco/annotations/instances_train2017.json',
        img_prefix='data/coco/train2017/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations',
                with_bbox=True,
                with_mask=True),
            dict(type='Resize',
                img_scale=(1333, 800),
                keep_ratio=True),
            dict(type='RandomFlip',
                flip_ratio=0.5),
            dict(type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad',
                size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect',
                keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])]),
    val=dict(
        type='CocoDataset',
        ann_file='data/coco/annotations/instances_val2017.json',
        img_prefix='data/coco/val2017/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize',
                        keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad',
                        size_divisor=32),
                    dict(type='ImageToTensor',
                        keys=['img']),
                    dict(type='Collect',
                        keys=['img'])])]),
    test=dict(
        type='CocoDataset',
        ann_file='data/coco/annotations/instances_val2017.json',
        img_prefix='data/coco/val2017/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize',
                        keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad',
                        size_divisor=32),
                    dict(type='ImageToTensor',
                        keys=['img']),
                    dict(type='Collect',
                        keys=['img'])])]))
evaluation=dict(
    interval=1,
    metric=['bbox', 'segm'])
optimizer=dict(
    type='SGD',
    lr=0.02,
    momentum=0.9,
    weight_decay=0.0001)
optimizer_config=dict(
    grad_clip=None)
lr_config=dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11])
total_epochs=12
checkpoint_config=dict(
    interval=1)
log_config=dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook')])
dist_params=dict(
    backend='nccl')
log_level='INFO'
load_from=None
resume_from='/ssd_scratch/cvit/dksingh/mmdetection_logs/ms_rcnn_x101_32x4d_fpn_1x_coco/latest.pth'
workflow=[('train', 1)]
work_dir='/ssd_scratch/cvit/dksingh/mmdetection_logs/ms_rcnn_x101_32x4d_fpn_1x_coco/'
gpu_ids=range(0, 4)

hellock commented 4 years ago

To train with multiple GPUs, use the distributed launcher instead of passing --gpus to tools/train.py. Please refer to the documentation for details.

./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
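
For readers wondering why the non-distributed run fails only at validation: the traceback ends in the fallback branch of gather_map, which rebuilds each container with type(out)(map(...)). The following is a minimal pure-Python sketch of that mechanism, not the PyTorch or MMDetection code. It assumes (the report does not show the model outputs) that the per-GPU test results are a tuple of lists of NumPy-style arrays. ShapeOnlyArray is a hypothetical stand-in for such an array, and its error message is copied from the report so the sketch mirrors it.

```python
class ShapeOnlyArray:
    """Hypothetical stand-in for a NumPy-style array (not a real library class).

    Its constructor takes a shape (an int or a tuple/list of ints) rather
    than an iterable of values to copy, which is the property that matters here.
    """

    def __init__(self, shape):
        if isinstance(shape, int):
            shape = (shape,)
        if not isinstance(shape, (tuple, list)):
            raise TypeError(
                "expected sequence object with len >= 0 or a single integer")
        self.shape = tuple(shape)

    def __iter__(self):
        # One placeholder value per entry along the first axis.
        length = self.shape[0] if self.shape else 0
        return iter([0.0] * length)


def gather_map(outputs):
    """Simplified imitation of the recursive gather; tensor and dict handling omitted."""
    out = outputs[0]
    if out is None:
        return None
    if isinstance(out, (int, float)):
        # Sketch-only leaf rule: keep the first replica's scalar.
        return out
    # Same shape as the line shown in the traceback: rebuild the container
    # type from an iterator over the regrouped per-replica values.
    return type(out)(map(gather_map, zip(*outputs)))


# Plain tuples and lists can be rebuilt from an iterator, so this works.
ok = gather_map([([1.0, 2.0], [3.0]), ([4.0, 5.0], [6.0])])

# A tuple holding a list of array-like results fails at the third level:
# ShapeOnlyArray(<map object>) is treated as a shape, not as data.
try:
    gather_map([([ShapeOnlyArray(2)],), ([ShapeOnlyArray(2)],)])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

In this sketch the failure happens three levels deep (tuple, then list, then the array-like leaf), which matches the three repeated gather_map frames in the traceback. If this reading is right, the distributed launcher likely avoids the problem because each process evaluates on its own GPU, so test outputs are not gathered across replicas this way.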