open-mmlab / OpenPCDet

OpenPCDet Toolbox for LiDAR-based 3D Object Detection.
Apache License 2.0

PV-RCNN++ training custom dataset #790

Closed luoxiaoliaolan closed 2 years ago

luoxiaoliaolan commented 2 years ago

Hi! Thank you so much for releasing the latest PV-RCNN++ code. I am trying to train my own data with it. I organized my data in the KITTI dataset format and adapted "pv_rcnn_plusplus.yaml" from the waymo_models folder into a custom training model config. But when I start training, I get the following error (see screenshot).

Also, some information about my custom dataset: each point has the dimensions [x, y, z, intensity].

From the error message, I traced it to models/backbones_3d/pfe/voxel_set_abstraction.py line 375, in the call to self.aggregate_keypoint_features_from_one_source:
`xyz_features=raw_points[:, 4:].contiguous() if raw_points.shape[1] > 4 else None`, where xyz_features has shape (246186, 1).

This triggers the assertion in pointnet2_modules.py line 396 (`N, C = features.shape; assert C % self.num_reduced_channels == 0, f'the input channels ({C}) should be an integral multiple of num_reduced_channels({self.num_reduced_channels})'`), which reports the error (see screenshot).
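For context, a minimal sketch of why that assertion fires (the tensor below is a stand-in for xyz_features, not the actual pipeline output): with intensity as the only extra point feature there is a single channel, which is not an integral multiple of the Waymo default NUM_REDUCED_CHANNELS of 2.

```python
import torch

# Stand-in for xyz_features built from [x, y, z, intensity] points:
# only intensity survives as a feature channel, so C == 1.
features = torch.zeros(246186, 1)
num_reduced_channels = 2  # Waymo config value (intensity + elongation)

N, C = features.shape
try:
    assert C % num_reduced_channels == 0, (
        f'the input channels ({C}) should be an integral multiple of '
        f'num_reduced_channels({num_reduced_channels})'
    )
except AssertionError as err:
    print(err)  # 1 is not a multiple of 2, hence the training error above
```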

I suspect this is caused by the dimension mismatch between my own data and the Waymo data. Can you tell me how to modify the model configuration (pv_rcnn_plusplus.yaml) to fit a custom dataset? (I have read this issue: https://github.com/open-mmlab/OpenPCDet/issues/747.) I would also like to try adapting the models trained on the Waymo dataset to the KITTI data format. Can you help?

I look forward to discussing the questions above with you. Thanks a lot!
sshaoshuai commented 2 years ago

Hi,

Please try changing the line SA_LAYER.raw_points.NUM_REDUCED_CHANNELS: 2 to:

NUM_REDUCED_CHANNELS: 1

since your data only has points of shape (N, 3+1), instead of (N, 3+2) as in the Waymo dataset.

luoxiaoliaolan commented 2 years ago

Thanks for your reply. Following your suggestion, I set SA_LAYER.raw_points.NUM_REDUCED_CHANNELS to 1 (see screenshot). When I trained on the custom dataset, I got this error (see screenshot). Do you think I should keep modifying the model configuration (pv_rcnn_plusplus.yaml), or should I start with how I build my own dataset? Attached is the configuration of the custom dataset:

```yaml
DATASET: 'AsDataset'
DATA_PATH: '/mnt/NAS/X01_datasets/as_label_data/AS_annotation_20/113'

POINT_CLOUD_RANGE: [0.0, -40, -3, 128, 40, 1]

DATA_SPLIT: { 'train': train, 'test': val }

INFO_PATH: { 'train': [as_infos_train.pkl], 'test': [as_infos_val.pkl], }

BALANCED_RESAMPLING: True

GET_ITEM_LIST: ["points"]
FOV_POINTS_ONLY: True

DATA_AUGMENTOR:
    DISABLE_AUG_LIST: ['placeholder']
    AUG_CONFIG_LIST:

POINT_FEATURE_ENCODING: { encoding_type: absolute_coordinates_encoding, used_feature_list: ['x', 'y', 'z', 'intensity'], src_feature_list: ['x', 'y', 'z', 'intensity'], }

DATA_PROCESSOR:
```

luoxiaoliaolan commented 2 years ago

Attaching pv_rcnn_plusplus.yaml:

```yaml
CLASS_NAMES: ['car', 'pedestrian', 'bicycle', 'tricycle', 'cyclist', 'motorcyclist', 'tricyclist', 'van', 'bus', 'truck', 'mini_truck', 'special_vehicle', 'traffic_cone', 'small_movable', 'small_unmovable', 'crash_barrel', 'construction_sign', 'noise', 'water_horse', 'other']

DATA_CONFIG:
    _BASE_CONFIG_: cfgs/dataset_configs/as_dataset.yaml

MODEL:
    NAME: PVRCNNPlusPlus

VFE:
    NAME: MeanVFE

BACKBONE_3D:
    NAME: VoxelBackBone8x

MAP_TO_BEV:
    NAME: HeightCompression
    NUM_BEV_FEATURES: 256

BACKBONE_2D:
    NAME: BaseBEVBackbone

    LAYER_NUMS: [5, 5]
    LAYER_STRIDES: [1, 2]
    NUM_FILTERS: [128, 256]
    UPSAMPLE_STRIDES: [1, 2]
    NUM_UPSAMPLE_FILTERS: [256, 256]

DENSE_HEAD:
    NAME: CenterHead
    CLASS_AGNOSTIC: False

    CLASS_NAMES_EACH_HEAD: [
        [ 'car', 'pedestrian', 'bicycle', 'tricycle', 'cyclist', 'motorcyclist', 'tricyclist',
               'van', 'bus', 'truck', 'mini_truck', 'special_vehicle', 'traffic_cone', 'small_movable',
               'small_unmovable', 'crash_barrel', 'construction_sign', 'noise', 'water_horse', 'other' ]
    ]

    SHARED_CONV_CHANNEL: 64
    USE_BIAS_BEFORE_NORM: True
    NUM_HM_CONV: 2
    SEPARATE_HEAD_CFG:
        HEAD_ORDER: [ 'center', 'center_z', 'dim', 'rot' ]
        HEAD_DICT: {
            'center': { 'out_channels': 2, 'num_conv': 2 },
            'center_z': { 'out_channels': 1, 'num_conv': 2 },
            'dim': { 'out_channels': 3, 'num_conv': 2 },
            'rot': { 'out_channels': 2, 'num_conv': 2 },
        }

    TARGET_ASSIGNER_CONFIG:
        FEATURE_MAP_STRIDE: 8
        NUM_MAX_OBJS: 500
        GAUSSIAN_OVERLAP: 0.1
        MIN_RADIUS: 2

    LOSS_CONFIG:
        LOSS_WEIGHTS: {
            'cls_weight': 1.0,
            'loc_weight': 2.0,
            'code_weights': [ 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0 ]
        }

    POST_PROCESSING:
        SCORE_THRESH: 0.5
        POST_CENTER_LIMIT_RANGE: [ -75.2, -75.2, -2, 75.2, 75.2, 4 ]
        MAX_OBJ_PER_SAMPLE: 500
        NMS_CONFIG:
            NMS_TYPE: nms_gpu
            NMS_THRESH: 0.7
            NMS_PRE_MAXSIZE: 4096
            NMS_POST_MAXSIZE: 500

PFE:
    NAME: VoxelSetAbstraction
    POINT_SOURCE: raw_points
    NUM_KEYPOINTS: 4096
    NUM_OUTPUT_FEATURES: 90
    SAMPLE_METHOD: SPC
    SPC_SAMPLING:
        NUM_SECTORS: 6
        SAMPLE_RADIUS_WITH_ROI: 1.6

    FEATURES_SOURCE: ['bev', 'x_conv3', 'x_conv4', 'raw_points']
    SA_LAYER:
        raw_points:
            NAME: VectorPoolAggregationModuleMSG
            NUM_GROUPS: 2
            LOCAL_AGGREGATION_TYPE: local_interpolation
            NUM_REDUCED_CHANNELS: 1
            NUM_CHANNELS_OF_LOCAL_AGGREGATION: 32
            MSG_POST_MLPS: [ 32 ]
            FILTER_NEIGHBOR_WITH_ROI: True
            RADIUS_OF_NEIGHBOR_WITH_ROI: 2.4

            GROUP_CFG_0:
                NUM_LOCAL_VOXEL: [ 2, 2, 2 ]
                MAX_NEIGHBOR_DISTANCE: 0.2
                NEIGHBOR_NSAMPLE: -1
                POST_MLPS: [ 32, 32 ]
            GROUP_CFG_1:
                NUM_LOCAL_VOXEL: [ 3, 3, 3 ]
                MAX_NEIGHBOR_DISTANCE: 0.4
                NEIGHBOR_NSAMPLE: -1
                POST_MLPS: [ 32, 32 ]

        x_conv3:
            DOWNSAMPLE_FACTOR: 4
            INPUT_CHANNELS: 64

            NAME: VectorPoolAggregationModuleMSG
            NUM_GROUPS: 2
            LOCAL_AGGREGATION_TYPE: local_interpolation
            NUM_REDUCED_CHANNELS: 32
            NUM_CHANNELS_OF_LOCAL_AGGREGATION: 32
            MSG_POST_MLPS: [128]
            FILTER_NEIGHBOR_WITH_ROI: True
            RADIUS_OF_NEIGHBOR_WITH_ROI: 4.0

            GROUP_CFG_0:
                NUM_LOCAL_VOXEL: [3, 3, 3]
                MAX_NEIGHBOR_DISTANCE: 1.2
                NEIGHBOR_NSAMPLE: -1
                POST_MLPS: [64, 64]
            GROUP_CFG_1:
                NUM_LOCAL_VOXEL: [ 3, 3, 3 ]
                MAX_NEIGHBOR_DISTANCE: 2.4
                NEIGHBOR_NSAMPLE: -1
                POST_MLPS: [ 64, 64 ]

        x_conv4:
            DOWNSAMPLE_FACTOR: 8
            INPUT_CHANNELS: 64

            NAME: VectorPoolAggregationModuleMSG
            NUM_GROUPS: 2
            LOCAL_AGGREGATION_TYPE: local_interpolation
            NUM_REDUCED_CHANNELS: 32
            NUM_CHANNELS_OF_LOCAL_AGGREGATION: 32
            MSG_POST_MLPS: [ 128 ]
            FILTER_NEIGHBOR_WITH_ROI: True
            RADIUS_OF_NEIGHBOR_WITH_ROI: 6.4

            GROUP_CFG_0:
                NUM_LOCAL_VOXEL: [ 3, 3, 3 ]
                MAX_NEIGHBOR_DISTANCE: 2.4
                NEIGHBOR_NSAMPLE: -1
                POST_MLPS: [ 64, 64 ]
            GROUP_CFG_1:
                NUM_LOCAL_VOXEL: [ 3, 3, 3 ]
                MAX_NEIGHBOR_DISTANCE: 4.8
                NEIGHBOR_NSAMPLE: -1
                POST_MLPS: [ 64, 64 ]

POINT_HEAD:
    NAME: PointHeadSimple
    CLS_FC: [256, 256]
    CLASS_AGNOSTIC: True
    USE_POINT_FEATURES_BEFORE_FUSION: True
    TARGET_CONFIG:
        GT_EXTRA_WIDTH: [0.2, 0.2, 0.2]
    LOSS_CONFIG:
        LOSS_REG: smooth-l1
        LOSS_WEIGHTS: {
            'point_cls_weight': 1.0,
        }

ROI_HEAD:
    NAME: PVRCNNHead
    CLASS_AGNOSTIC: True

    SHARED_FC: [256, 256]
    CLS_FC: [256, 256]
    REG_FC: [256, 256]
    DP_RATIO: 0.3

    NMS_CONFIG:
        TRAIN:
            NMS_TYPE: nms_gpu
            MULTI_CLASSES_NMS: False
            NMS_PRE_MAXSIZE: 9000
            NMS_POST_MAXSIZE: 512
            NMS_THRESH: 0.8
        TEST:
            NMS_TYPE: nms_gpu
            MULTI_CLASSES_NMS: False
            NMS_PRE_MAXSIZE: 1024
            NMS_POST_MAXSIZE: 100
            NMS_THRESH: 0.7
            SCORE_THRESH: 0.5

#            NMS_PRE_MAXSIZE: 4096
#            NMS_POST_MAXSIZE: 500
#            NMS_THRESH: 0.85

    ROI_GRID_POOL:
        GRID_SIZE: 6

        NAME: VectorPoolAggregationModuleMSG
        NUM_GROUPS: 2
        LOCAL_AGGREGATION_TYPE: voxel_random_choice
        NUM_REDUCED_CHANNELS: 30
        NUM_CHANNELS_OF_LOCAL_AGGREGATION: 32
        MSG_POST_MLPS: [ 128 ]

        GROUP_CFG_0:
            NUM_LOCAL_VOXEL: [ 3, 3, 3 ]
            MAX_NEIGHBOR_DISTANCE: 0.8
            NEIGHBOR_NSAMPLE: 32
            POST_MLPS: [ 64, 64 ]
        GROUP_CFG_1:
            NUM_LOCAL_VOXEL: [ 3, 3, 3 ]
            MAX_NEIGHBOR_DISTANCE: 1.6
            NEIGHBOR_NSAMPLE: 32
            POST_MLPS: [ 64, 64 ]

    TARGET_CONFIG:
        BOX_CODER: ResidualCoder
        ROI_PER_IMAGE: 128
        FG_RATIO: 0.5

        SAMPLE_ROI_BY_EACH_CLASS: True
        CLS_SCORE_TYPE: roi_iou

        CLS_FG_THRESH: 0.75
        CLS_BG_THRESH: 0.25
        CLS_BG_THRESH_LO: 0.1
        HARD_BG_RATIO: 0.8

        REG_FG_THRESH: 0.55

    LOSS_CONFIG:
        CLS_LOSS: BinaryCrossEntropy
        REG_LOSS: smooth-l1
        CORNER_LOSS_REGULARIZATION: True
        LOSS_WEIGHTS: {
            'rcnn_cls_weight': 1.0,
            'rcnn_reg_weight': 1.0,
            'rcnn_corner_weight': 1.0,
            'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
        }

POST_PROCESSING:
    RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
    SCORE_THRESH: 0.5
    OUTPUT_RAW_SCORE: False

    EVAL_METRIC: kitti

    NMS_CONFIG:
        MULTI_CLASSES_NMS: False
        NMS_TYPE: nms_gpu
        NMS_THRESH: 0.7
        NMS_PRE_MAXSIZE: 4096
        NMS_POST_MAXSIZE: 500

OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 1
    NUM_EPOCHS: 80

    OPTIMIZER: adam_onecycle
    LR: 0.01
    WEIGHT_DECAY: 0.001
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001

    LR_WARMUP: False
    WARMUP_EPOCH: 1

    GRAD_NORM_CLIP: 10
```
sshaoshuai commented 2 years ago

This error is probably caused by the data.

The error indicates that there are some training cases with only one keypoint at the beginning of your training (the RoIs are far away from any points). You can try to modify this line https://github.com/open-mmlab/OpenPCDet/blob/master/pcdet/models/backbones_3d/pfe/voxel_set_abstraction.py#L73 to

```python
sampled_points = points[:2] if point_mask.sum() == 0 else points[point_mask, :]
```

which skips this error at the beginning of training.
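For reference, a minimal sketch of the guard this change introduces (the helper name below is illustrative, not the actual OpenPCDet function): when the RoI mask selects no points, keep the first two points as placeholders so the later view(N, -1, C) never sees an empty tensor.

```python
import torch

def keep_points_near_roi(points: torch.Tensor, point_mask: torch.Tensor) -> torch.Tensor:
    """Illustrative version of the suggested fix: if the mask selects nothing,
    keep the first two points as placeholders instead of an empty tensor."""
    sampled_points = points[:2] if point_mask.sum() == 0 else points[point_mask, :]
    return sampled_points

points = torch.randn(100, 4)                      # e.g. (N, 4) points with x, y, z, intensity
empty_mask = torch.zeros(100, dtype=torch.bool)   # RoI far away from every point
print(keep_points_near_roi(points, empty_mask).shape)  # torch.Size([2, 4])
```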

luoxiaoliaolan commented 2 years ago

Thanks for your patience! But I still have this problem while training:

```
&&&&&&&&&&&&&&&&&&&&&&&&&&&info_path:__ /mnt/pfs/jarvis/at_dataset/at_infos_train.pkl

epochs: 0%| | 0/80 [00:00<?, ?it/s]
/output/train-result/f68eb42e-29f5-4a20-8465-f29bb3b8fac8/code_2/pcdet/models/model_utils/centernet_utils.py:142: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  topk_ys = (topk_inds // width).float()
/output/train-result/f68eb42e-29f5-4a20-8465-f29bb3b8fac8/code_2/pcdet/models/model_utils/centernet_utils.py:146: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  topk_classes = (topk_ind // K).int()
/opt/conda/lib/python3.7/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1634272178570/work/aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]

Traceback (most recent call last):
  File "tools/train.py", line 204, in <module>
    main()
  File "tools/train.py", line 173, in main
    merge_all_iters_to_one_epoch=args.merge_all_iters_to_one_epoch
  File "/output/train-result/f68eb42e-29f5-4a20-8465-f29bb3b8fac8/code_2/tools/train_utils/train_utils.py", line 118, in train_model
    dataloader_iter=dataloader_iter
  File "/output/train-result/f68eb42e-29f5-4a20-8465-f29bb3b8fac8/code_2/tools/train_utils/train_utils.py", line 47, in train_one_epoch
    loss, tb_dict, disp_dict = model_func(model, batch)
  File "/output/train-result/f68eb42e-29f5-4a20-8465-f29bb3b8fac8/code_2/pcdet/models/__init__.py", line 42, in model_func
    ret_dict, tb_dict, disp_dict = model(batch_dict)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 886, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/output/train-result/f68eb42e-29f5-4a20-8465-f29bb3b8fac8/code_2/pcdet/models/detectors/pv_rcnn_plusplus.py", line 28, in forward
    batch_dict = self.pfe(batch_dict)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/output/train-result/f68eb42e-29f5-4a20-8465-f29bb3b8fac8/code_2/pcdet/models/backbones_3d/pfe/voxel_set_abstraction.py", line 380, in forward
    rois=batch_dict.get('rois', None)
  File "/output/train-result/f68eb42e-29f5-4a20-8465-f29bb3b8fac8/code_2/pcdet/models/backbones_3d/pfe/voxel_set_abstraction.py", line 330, in aggregate_keypoint_features_from_one_source
    features=xyz_features.contiguous(),
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/output/train-result/f68eb42e-29f5-4a20-8465-f29bb3b8fac8/code_2/pcdet/ops/pointnet2/pointnet2_stack/pointnet2_modules.py", line 461, in forward
    cur_xyz, cur_features = self.__getattr__(f'layer_{k}')(**kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/output/train-result/f68eb42e-29f5-4a20-8465-f29bb3b8fac8/code_2/pcdet/ops/pointnet2/pointnet2_stack/pointnet2_modules.py", line 399, in forward
    features = features.view(N, -1, self.num_reduced_channels).sum(dim=1)
RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1, 1] because the unspecified dimension size -1 can be any value and is ambiguous
========trainExitCode======== 1
```
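The RuntimeError at the end can be reproduced in isolation (a minimal sketch, independent of the dataset): reshaping a zero-element feature tensor with an inferred dimension is ambiguous, which is exactly what happens when an RoI ends up with no points.

```python
import torch

# Zero surviving points after RoI filtering -> a (0, C) feature tensor.
features = torch.empty(0, 1)
num_reduced_channels = 1

N, C = features.shape
try:
    features.view(N, -1, num_reduced_channels).sum(dim=1)
except RuntimeError as err:
    print(err)  # cannot reshape tensor of 0 elements into shape [0, -1, 1] ...
```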

Code (see screenshot).

pv_rcnn_plusplus.yaml:

```yaml
CLASS_NAMES: ['car', 'pedestrian', 'bicycle', 'tricycle', 'cyclist', 'motorcyclist', 'tricyclist', 'van', 'bus', 'truck', 'mini_truck', 'special_vehicle', 'traffic_cone', 'small_movable', 'small_unmovable', 'crash_barrel', 'construction_sign', 'noise', 'water_horse', 'other']

DATA_CONFIG:
    _BASE_CONFIG_: tools/cfgs/dataset_configs/as_dataset.yaml

MODEL:
    NAME: PVRCNNPlusPlus

VFE:
    NAME: MeanVFE

BACKBONE_3D:
    NAME: VoxelBackBone8x

MAP_TO_BEV:
    NAME: HeightCompression
    NUM_BEV_FEATURES: 256

BACKBONE_2D:
    NAME: BaseBEVBackbone

    LAYER_NUMS: [5, 5]
    LAYER_STRIDES: [1, 2]
    NUM_FILTERS: [128, 256]
    UPSAMPLE_STRIDES: [1, 2]
    NUM_UPSAMPLE_FILTERS: [256, 256]

DENSE_HEAD:
    NAME: CenterHead
    CLASS_AGNOSTIC: False

    CLASS_NAMES_EACH_HEAD: [
        [ 'car', 'pedestrian', 'bicycle', 'tricycle', 'cyclist', 'motorcyclist', 'tricyclist',
               'van', 'bus', 'truck', 'mini_truck', 'special_vehicle', 'traffic_cone', 'small_movable',
               'small_unmovable', 'crash_barrel', 'construction_sign', 'noise', 'water_horse', 'other' ]
    ]

    SHARED_CONV_CHANNEL: 64
    USE_BIAS_BEFORE_NORM: True
    NUM_HM_CONV: 2
    SEPARATE_HEAD_CFG:
        HEAD_ORDER: [ 'center', 'center_z', 'dim', 'rot' ]
        HEAD_DICT: {
            'center': { 'out_channels': 2, 'num_conv': 2 },
            'center_z': { 'out_channels': 1, 'num_conv': 2 },
            'dim': { 'out_channels': 3, 'num_conv': 2 },
            'rot': { 'out_channels': 2, 'num_conv': 2 },
        }

    TARGET_ASSIGNER_CONFIG:
        FEATURE_MAP_STRIDE: 8
        NUM_MAX_OBJS: 500
        GAUSSIAN_OVERLAP: 0.1
        MIN_RADIUS: 2

    LOSS_CONFIG:
        LOSS_WEIGHTS: {
            'cls_weight': 1.0,
            'loc_weight': 2.0,
            'code_weights': [ 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0 ]
        }

    POST_PROCESSING:
        SCORE_THRESH: 0.5
        POST_CENTER_LIMIT_RANGE: [ 0.0, -40.0, -2.0, 128.0, 40.0, 2.0 ]
        MAX_OBJ_PER_SAMPLE: 500
        NMS_CONFIG:
            NMS_TYPE: nms_gpu
            NMS_THRESH: 0.7
            NMS_PRE_MAXSIZE: 4096
            NMS_POST_MAXSIZE: 500

PFE:
    NAME: VoxelSetAbstraction
    POINT_SOURCE: raw_points
    NUM_KEYPOINTS: 4096
    NUM_OUTPUT_FEATURES: 90
    SAMPLE_METHOD: SPC
    SPC_SAMPLING:
        NUM_SECTORS: 6
        SAMPLE_RADIUS_WITH_ROI: 1.6

    FEATURES_SOURCE: ['bev', 'x_conv3', 'x_conv4', 'raw_points']
    SA_LAYER:
        raw_points:
            NAME: VectorPoolAggregationModuleMSG
            NUM_GROUPS: 2
            LOCAL_AGGREGATION_TYPE: local_interpolation
            NUM_REDUCED_CHANNELS: 1
            NUM_CHANNELS_OF_LOCAL_AGGREGATION: 32
            MSG_POST_MLPS: [ 32 ]
            FILTER_NEIGHBOR_WITH_ROI: True
            RADIUS_OF_NEIGHBOR_WITH_ROI: 2.4

            GROUP_CFG_0:
                NUM_LOCAL_VOXEL: [ 2, 2, 2 ]
                MAX_NEIGHBOR_DISTANCE: 0.2
                NEIGHBOR_NSAMPLE: -1
                POST_MLPS: [ 32, 32 ]
            GROUP_CFG_1:
                NUM_LOCAL_VOXEL: [ 3, 3, 3 ]
                MAX_NEIGHBOR_DISTANCE: 0.4
                NEIGHBOR_NSAMPLE: -1
                POST_MLPS: [ 32, 32 ]

        x_conv3:
            DOWNSAMPLE_FACTOR: 4
            INPUT_CHANNELS: 64

            NAME: VectorPoolAggregationModuleMSG
            NUM_GROUPS: 2
            LOCAL_AGGREGATION_TYPE: local_interpolation
            NUM_REDUCED_CHANNELS: 32
            NUM_CHANNELS_OF_LOCAL_AGGREGATION: 32
            MSG_POST_MLPS: [128]
            FILTER_NEIGHBOR_WITH_ROI: True
            RADIUS_OF_NEIGHBOR_WITH_ROI: 4.0

            GROUP_CFG_0:
                NUM_LOCAL_VOXEL: [3, 3, 3]
                MAX_NEIGHBOR_DISTANCE: 1.2
                NEIGHBOR_NSAMPLE: -1
                POST_MLPS: [64, 64]
            GROUP_CFG_1:
                NUM_LOCAL_VOXEL: [ 3, 3, 3 ]
                MAX_NEIGHBOR_DISTANCE: 2.4
                NEIGHBOR_NSAMPLE: -1
                POST_MLPS: [ 64, 64 ]

        x_conv4:
            DOWNSAMPLE_FACTOR: 8
            INPUT_CHANNELS: 64

            NAME: VectorPoolAggregationModuleMSG
            NUM_GROUPS: 2
            LOCAL_AGGREGATION_TYPE: local_interpolation
            NUM_REDUCED_CHANNELS: 32
            NUM_CHANNELS_OF_LOCAL_AGGREGATION: 32
            MSG_POST_MLPS: [ 128 ]
            FILTER_NEIGHBOR_WITH_ROI: True
            RADIUS_OF_NEIGHBOR_WITH_ROI: 6.4

            GROUP_CFG_0:
                NUM_LOCAL_VOXEL: [ 3, 3, 3 ]
                MAX_NEIGHBOR_DISTANCE: 2.4
                NEIGHBOR_NSAMPLE: -1
                POST_MLPS: [ 64, 64 ]
            GROUP_CFG_1:
                NUM_LOCAL_VOXEL: [ 3, 3, 3 ]
                MAX_NEIGHBOR_DISTANCE: 4.8
                NEIGHBOR_NSAMPLE: -1
                POST_MLPS: [ 64, 64 ]

POINT_HEAD:
    NAME: PointHeadSimple
    CLS_FC: [256, 256]
    CLASS_AGNOSTIC: True
    USE_POINT_FEATURES_BEFORE_FUSION: True
    TARGET_CONFIG:
        GT_EXTRA_WIDTH: [0.2, 0.2, 0.2]
    LOSS_CONFIG:
        LOSS_REG: smooth-l1
        LOSS_WEIGHTS: {
            'point_cls_weight': 1.0,
        }

ROI_HEAD:
    NAME: PVRCNNHead
    CLASS_AGNOSTIC: True

    SHARED_FC: [256, 256]
    CLS_FC: [256, 256]
    REG_FC: [256, 256]
    DP_RATIO: 0.3

    NMS_CONFIG:
        TRAIN:
            NMS_TYPE: nms_gpu
            MULTI_CLASSES_NMS: False
            NMS_PRE_MAXSIZE: 9000
            NMS_POST_MAXSIZE: 512
            NMS_THRESH: 0.8
        TEST:
            NMS_TYPE: nms_gpu
            MULTI_CLASSES_NMS: False
            NMS_PRE_MAXSIZE: 1024
            NMS_POST_MAXSIZE: 100
            NMS_THRESH: 0.7
            SCORE_THRESH: 0.5

#            NMS_PRE_MAXSIZE: 4096
#            NMS_POST_MAXSIZE: 500
#            NMS_THRESH: 0.85

    ROI_GRID_POOL:
        GRID_SIZE: 6

        NAME: VectorPoolAggregationModuleMSG
        NUM_GROUPS: 2
        LOCAL_AGGREGATION_TYPE: voxel_random_choice
        NUM_REDUCED_CHANNELS: 30
        NUM_CHANNELS_OF_LOCAL_AGGREGATION: 32
        MSG_POST_MLPS: [ 128 ]

        GROUP_CFG_0:
            NUM_LOCAL_VOXEL: [ 3, 3, 3 ]
            MAX_NEIGHBOR_DISTANCE: 0.8
            NEIGHBOR_NSAMPLE: 32
            POST_MLPS: [ 64, 64 ]
        GROUP_CFG_1:
            NUM_LOCAL_VOXEL: [ 3, 3, 3 ]
            MAX_NEIGHBOR_DISTANCE: 1.6
            NEIGHBOR_NSAMPLE: 32
            POST_MLPS: [ 64, 64 ]

    TARGET_CONFIG:
        BOX_CODER: ResidualCoder
        ROI_PER_IMAGE: 128
        FG_RATIO: 0.5

        SAMPLE_ROI_BY_EACH_CLASS: True
        CLS_SCORE_TYPE: roi_iou

        CLS_FG_THRESH: 0.75
        CLS_BG_THRESH: 0.25
        CLS_BG_THRESH_LO: 0.1
        HARD_BG_RATIO: 0.8

        REG_FG_THRESH: 0.55

    LOSS_CONFIG:
        CLS_LOSS: BinaryCrossEntropy
        REG_LOSS: smooth-l1
        CORNER_LOSS_REGULARIZATION: True
        LOSS_WEIGHTS: {
            'rcnn_cls_weight': 1.0,
            'rcnn_reg_weight': 1.0,
            'rcnn_corner_weight': 1.0,
            'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
        }

POST_PROCESSING:
    RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
    SCORE_THRESH: 0.5
    OUTPUT_RAW_SCORE: False

    EVAL_METRIC: kitti

    NMS_CONFIG:
        MULTI_CLASSES_NMS: False
        NMS_TYPE: nms_gpu
        NMS_THRESH: 0.7
        NMS_PRE_MAXSIZE: 4096
        NMS_POST_MAXSIZE: 500

OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 2
    NUM_EPOCHS: 80

    OPTIMIZER: adam_onecycle
    LR: 0.01
    WEIGHT_DECAY: 0.001
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001

    LR_WARMUP: False
    WARMUP_EPOCH: 1

    GRAD_NORM_CLIP: 10
```

Custom dataset cfg:

```yaml
DATASET: 'AsDataset'
DATA_PATH: '/root/NAS/X01_datasets/as_label_data'

# POINT_CLOUD_RANGE: [-100.0, -60.0, -2, 120, 60.0, 4]
# POINT_CLOUD_RANGE: [0.0, -40.0, -2, 100, 40.0, 4]
POINT_CLOUD_RANGE: [0.0, -40.0, -2.0, 128.0, 40.0, 2.0]  # xmin, ymin, zmin, xmax, ymax, zmax

DATA_SPLIT: { 'train': train, 'test': val }

INFO_PATH: { 'train': [at_infos_train.pkl], 'test': [at_infos_val.pkl], }

BALANCED_RESAMPLING: True

GET_ITEM_LIST: ["points"]
FOV_POINTS_ONLY: True

DATA_AUGMENTOR:
    DISABLE_AUG_LIST: ['placeholder']
    AUG_CONFIG_LIST:

POINT_FEATURE_ENCODING: { encoding_type: absolute_coordinates_encoding, used_feature_list: ['x', 'y', 'z', 'intensity'], src_feature_list: ['x', 'y', 'z', 'intensity'], }

DATA_PROCESSOR:
```

Should I modify that module to solve this training problem?

luoxiaoliaolan commented 2 years ago

Requesting to reopen this issue. @sshaoshuai

sshaoshuai commented 2 years ago

Sorry, I just noticed this issue.

I haven't tried PV-RCNN++ with this many classes. I suggest you check the error step by step; my guess is that it is still the problem of empty RoIs (RoIs without any points).
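One way to verify that guess (a debugging sketch with illustrative names, not an OpenPCDet API) is to count, for each RoI center, how many raw points fall within the sampling radius before voxel set abstraction runs:

```python
import torch

def count_points_per_roi(points_xyz: torch.Tensor, roi_centers: torch.Tensor,
                         radius: float) -> torch.Tensor:
    """Return, for each RoI center, the number of points within `radius`.
    RoIs with a count of 0 are the suspected empty-RoI cases."""
    dists = torch.cdist(roi_centers, points_xyz)   # (num_rois, num_points)
    return (dists < radius).sum(dim=1)

# Toy example with random data; in practice use a batch's raw point xyz and RoI centers.
points = torch.randn(2048, 3) * 20
rois = torch.randn(16, 3) * 60
counts = count_points_per_roi(points, rois, radius=1.6)
print('empty RoIs:', (counts == 0).sum().item(), 'of', rois.shape[0])
```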

eriche2016 commented 2 years ago

So if the RoI is empty, should we also change point_mask (this line), since it is used to get the point features within the RoI?

BOBrown commented 2 years ago

@luoxiaoliaolan I ran into the same issue and solved it by setting NUM_POINTS_OF_EACH_SAMPLE_PART in pv_rcnn_plusplus.yaml:

```yaml
PFE:
    NAME: VoxelSetAbstraction
    POINT_SOURCE: raw_points
    NUM_KEYPOINTS: 4096
    NUM_OUTPUT_FEATURES: 90
    SAMPLE_METHOD: SPC
    SPC_SAMPLING:
        NUM_POINTS_OF_EACH_SAMPLE_PART: 4000000
        NUM_SECTORS: 6
        SAMPLE_RADIUS_WITH_ROI: 1.6
```

luoxiaoliaolan commented 1 year ago

> @luoxiaoliaolan I have met the same issue, and I solved this issue by setting the NUM_POINTS_OF_EACH_SAMPLE_PART in pv_rcnn_plusplus.yaml:
>
> PFE: NAME: VoxelSetAbstraction POINT_SOURCE: raw_points NUM_KEYPOINTS: 4096 NUM_OUTPUT_FEATURES: 90 SAMPLE_METHOD: SPC SPC_SAMPLING: NUM_POINTS_OF_EACH_SAMPLE_PART: 4000000 NUM_SECTORS: 6 SAMPLE_RADIUS_WITH_ROI: 1.6

@BOBrown Thank you for your suggestion. Recently I ran into other problems while training this model; could you leave contact information so we can discuss them?

frothmoon commented 9 months ago

@luoxiaoliaolan Hello, did you solve this problem? I ran into the same issue; looking forward to your reply.