open-mmlab / mmdetection3d

OpenMMLab's next-generation platform for general 3D object detection.
https://mmdetection3d.readthedocs.io/en/latest/
Apache License 2.0

Problem training PointPillars on SUN RGB-D #605

Closed: neverrop closed this 3 years ago

neverrop commented 3 years ago

I'm trying to train PointPillars on the SUN RGB-D dataset. I'm using the config file hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-3class.py and made the following modifications:

  1. use the base data config sunrgbd-3d-10class.py
  2. in the model config, set:
    point_cloud_range = [0, -2, -3, 4, 2, 1]
    num_classes=10
    anchor_generator=dict(
            type="Anchor3DRangeGenerator",
            ranges=[[0, -2, -0.6, 4, 2, -0.6]],
            sizes=[[0.59, 0.55, 0.83]],
            rotations=[0, 1.57],
            reshape_out=False,
        ),

    Then while running, in loss_single (Anchor3DHead) I get cls_score.shape[0] != labels.shape[0], which causes an error when calling the focal loss. Is there something wrong in my config settings, or what else might it be?
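
As a minimal illustration of the shape contract that breaks here (my own sketch, not mmdetection3d internals): the focal loss receives flattened per-anchor class scores and per-anchor labels, so both tensors must have the same number of rows.

import torch

num_anchors, num_classes = 8, 10
cls_score = torch.randn(num_anchors, num_classes)           # (N_anchors, num_classes)
labels = torch.randint(0, num_classes + 1, (num_anchors,))  # (N_anchors,) integer class labels

# The invariant that fails in loss_single:
assert cls_score.shape[0] == labels.shape[0], "anchor/label count mismatch"

If the two lengths differ, the anchors built for the head's output feature map and the targets assigned to them were most likely generated for inconsistent grid sizes.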

My whole model config is:

voxel_size = [0.025, 0.025, 4]
point_cloud_range = [0, -2, -3, 4, 2, 1]  # [x_min, y_min, z_min, x_max, y_max, z_max]

model = dict(
    type="VoxelNet",
    voxel_layer=dict(
        max_num_points=32,  # max_points_per_voxel
        point_cloud_range=point_cloud_range,
        voxel_size=voxel_size,
        max_voxels=(16000, 40000),  # (training, testing) max_voxels
    ),
    voxel_encoder=dict(
        type="PillarFeatureNet",
        in_channels=4,
        feat_channels=[64],
        with_distance=False,
        voxel_size=voxel_size,
        point_cloud_range=point_cloud_range,
    ),
    middle_encoder=dict(
        type="PointPillarsScatter", in_channels=64, output_shape=[496, 432]
    ),
    backbone=dict(
        type="SECOND",
        in_channels=64,
        layer_nums=[3, 5, 5],
        layer_strides=[2, 2, 2],
        out_channels=[64, 128, 256],
    ),
    neck=dict(
        type="SECONDFPN",
        in_channels=[64, 128, 256],
        upsample_strides=[1, 2, 4],
        out_channels=[128, 128, 128],
    ),
    bbox_head=dict(
        type="Anchor3DHead",
        num_classes=10,
        in_channels=384,
        feat_channels=384,
        use_direction_classifier=True,
        anchor_generator=dict(
            type="Anchor3DRangeGenerator",
            ranges=[[0, -2, -0.6, 4, 2, -0.6]],
            sizes=[[0.59, 0.55, 0.83]],
            rotations=[0, 1.57],
            reshape_out=False,
        ),
        diff_rad_by_sin=True,
        bbox_coder=dict(type="DeltaXYZWLHRBBoxCoder"),
        loss_cls=dict(
            type="FocalLoss", use_sigmoid=True, gamma=2.0, alpha=0.25, loss_weight=1.0
        ),
        loss_bbox=dict(type="SmoothL1Loss", beta=1.0 / 9.0, loss_weight=2.0),
        loss_dir=dict(type="CrossEntropyLoss", use_sigmoid=False, loss_weight=0.2),
    ),
    # model training and testing settings
    train_cfg=dict(
        assigner=[
            dict(  # single assigner shared by all classes (carried over from the Waymo Pedestrian settings)
                type="MaxIoUAssigner",
                iou_calculator=dict(type="BboxOverlapsNearest3D"),
                pos_iou_thr=0.5,
                neg_iou_thr=0.35,
                min_pos_iou=0.35,
                ignore_iof_thr=-1,
            )
        ],
        allowed_border=0,
        pos_weight=-1,
        debug=False,
    ),
    test_cfg=dict(
        use_rotate_nms=True,
        nms_across_levels=False,
        nms_thr=0.01,
        score_thr=0.1,
        min_bbox_size=0,
        nms_pre=100,
        max_num=50,
    ),
)
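
One thing worth double-checking in this config (a rough sanity check, in case it is related): the PointPillarsScatter output_shape is still the KITTI/Waymo default [496, 432], while the new point_cloud_range and voxel_size imply a much smaller BEV grid. Assuming output_shape is meant to be the BEV grid size [ny, nx] derived from the range and the voxel size, the numbers do not match:

point_cloud_range = [0, -2, -3, 4, 2, 1]  # [x_min, y_min, z_min, x_max, y_max, z_max]
voxel_size = [0.025, 0.025, 4]

nx = round((point_cloud_range[3] - point_cloud_range[0]) / voxel_size[0])  # (4 - 0) / 0.025 = 160
ny = round((point_cloud_range[4] - point_cloud_range[1]) / voxel_size[1])  # (2 - (-2)) / 0.025 = 160

print([ny, nx])  # [160, 160], but the config above keeps output_shape=[496, 432]

If the scatter grid and the anchor/target generation disagree on the feature-map size, the number of classification scores and the number of assigned labels can diverge, which would be consistent with the error above.
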
hi-zhangjie commented 3 years ago

Hello, have you solved the problem?

ZwwWayne commented 3 years ago

It is inappropriate to train PointPillars on indoor datasets because objects stack along the height axis, and PointPillars/SECOND cannot handle that.
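
Roughly illustrated with the voxel settings from the config above (a sketch, assuming the standard point-to-voxel quantization): each pillar spans the full z-range (voxel z size of 4 m over a range from -3 to 1), so points at very different heights land in the same pillar, and vertically stacked indoor objects get merged in the BEV representation that the anchors operate on.

point_cloud_range = [0, -2, -3, 4, 2, 1]
voxel_size = [0.025, 0.025, 4]

def voxel_index(point):
    # Quantize an (x, y, z) point to its integer voxel/pillar index.
    return tuple(int((point[i] - point_cloud_range[i]) / voxel_size[i]) for i in range(3))

print(voxel_index([1.0, 0.5, -2.5]))  # point near the floor  -> (40, 100, 0)
print(voxel_index([1.0, 0.5,  0.5]))  # point 3 m higher up   -> (40, 100, 0), same pillar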