dinvincible98 closed this issue 12 months ago.
Did you train successfully on KITTI? How did it turn out?
No, I always get a NaN or Inf error during training. I guess there are some hyperparameter issues.
I will try to run this code on KITTI in the future when I have some free time.
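(Side note, not from the DSVT authors: a generic way to localize where a NaN/Inf first appears is PyTorch's anomaly detection, combined with watching the pre-clip gradient norm. A minimal, self-contained sketch with a stand-in model:)

```python
import torch
import torch.nn as nn

# Debug-only switch: backward() raises at the first op that produces a
# NaN/Inf gradient, pointing at the layer where training blows up.
torch.autograd.set_detect_anomaly(True)

model = nn.Linear(10, 3)                      # stand-in for the detector
loss = model(torch.randn(4, 10)).sum()
loss.backward()

# clip_grad_norm_ (the configs below use GRAD_NORM_CLIP: 10) returns the
# pre-clip norm, handy for spotting gradient blow-ups a few steps early.
total_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
if not torch.isfinite(total_norm):
    print("non-finite gradient norm at this step")
```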
Very sorry for the late reply, I'm rushing some deadlines. We haven't tried the KITTI dataset. You can check whether issue #59 is helpful.
There are some data augmentor issues. Here's the modified config; you can try it and see whether you still get the NaN or Inf error:
CLASS_NAMES: ['Car', 'Pedestrian', 'Cyclist']

DATA_CONFIG:
    _BASE_CONFIG_: cfgs/dataset_configs/kitti_dataset.yaml
    POINT_CLOUD_RANGE: [0, -39.68, -3, 69.12, 39.68, 1]
    DATA_PROCESSOR:
        - NAME: mask_points_and_boxes_outside_range
          REMOVE_OUTSIDE_BOXES: True

        - NAME: shuffle_points
          SHUFFLE_ENABLED: {
            'train': True,
            'test': False
          }

        - NAME: transform_points_to_voxels
          VOXEL_SIZE: [0.1477, 0.1696, 4]
          MAX_POINTS_PER_VOXEL: 32
          MAX_NUMBER_OF_VOXELS: {
            'train': 16000,
            'test': 40000
          }

    DATA_AUGMENTOR:
        DISABLE_AUG_LIST: ['placeholder']
        AUG_CONFIG_LIST:
            - NAME: gt_sampling
              USE_ROAD_PLANE: True
              DB_INFO_PATH:
                  - kitti_dbinfos_train.pkl
              PREPARE: {
                 filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],
                 filter_by_difficulty: [-1],
              }
              SAMPLE_GROUPS: ['Car:15', 'Pedestrian:15', 'Cyclist:15']
              NUM_POINT_FEATURES: 4
              DATABASE_WITH_FAKELIDAR: False
              REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
              LIMIT_WHOLE_SCENE: False

            - NAME: random_world_flip
              ALONG_AXIS_LIST: ['x']

            - NAME: random_world_rotation
              WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]

            - NAME: random_world_scaling
              WORLD_SCALE_RANGE: [0.95, 1.05]

MODEL:
    NAME: CenterPoint

    VFE:
        NAME: DynPillarVFE3D
        WITH_DISTANCE: False
        USE_ABSLOTE_XYZ: True
        USE_NORM: True
        NUM_FILTERS: [192, 192]

    BACKBONE_3D:
        NAME: DSVT
        INPUT_LAYER:
            sparse_shape: [468, 468, 1]
            downsample_stride: []
            d_model: [192]
            set_info: [[36, 4]]
            window_shape: [[12, 12, 1]]
            hybrid_factor: [2, 2, 1] # x, y, z
            shifts_list: [[[0, 0, 0], [6, 6, 0]]]
            normalize_pos: False

        block_name: ['DSVTBlock']
        set_info: [[36, 4]]
        d_model: [192]
        nhead: [8]
        dim_feedforward: [384]
        dropout: 0.0
        activation: gelu
        output_shape: [468, 468]
        conv_out_channel: 192
        # ues_checkpoint: True

    MAP_TO_BEV:
        NAME: PointPillarScatter3d
        INPUT_SHAPE: [468, 468, 1]
        NUM_BEV_FEATURES: 192

    BACKBONE_2D:
        NAME: BaseBEVResBackbone
        LAYER_NUMS: [1, 2, 2]
        LAYER_STRIDES: [1, 2, 2]
        NUM_FILTERS: [128, 128, 256]
        UPSAMPLE_STRIDES: [1, 2, 4]
        NUM_UPSAMPLE_FILTERS: [128, 128, 128]

    DENSE_HEAD:
        NAME: CenterHead
        CLASS_AGNOSTIC: False

        CLASS_NAMES_EACH_HEAD: [
            ['Car', 'Pedestrian', 'Cyclist']
        ]

        SHARED_CONV_CHANNEL: 64
        USE_BIAS_BEFORE_NORM: True
        NUM_HM_CONV: 2

        BN_EPS: 0.001
        BN_MOM: 0.01

        SEPARATE_HEAD_CFG:
            HEAD_ORDER: ['center', 'center_z', 'dim', 'rot']
            HEAD_DICT: {
                'center': {'out_channels': 2, 'num_conv': 2},
                'center_z': {'out_channels': 1, 'num_conv': 2},
                'dim': {'out_channels': 3, 'num_conv': 2},
                'rot': {'out_channels': 2, 'num_conv': 2},
                'iou': {'out_channels': 1, 'num_conv': 2},
            }

        TARGET_ASSIGNER_CONFIG:
            FEATURE_MAP_STRIDE: 1
            NUM_MAX_OBJS: 500
            GAUSSIAN_OVERLAP: 0.1
            MIN_RADIUS: 2

        IOU_REG_LOSS: True

        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }

    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        SCORE_THRESH: 0.1
        OUTPUT_RAW_SCORE: False
        POST_CENTER_LIMIT_RANGE: [0, -40, -3, 75, 40, 1]
        MAX_OBJ_PER_SAMPLE: 500
        EVAL_METRIC: kitti
        NMS_CONFIG:
            MULTI_CLASSES_NMS: False
            NMS_TYPE: nms_gpu
            NMS_THRESH: 0.01
            NMS_PRE_MAXSIZE: 4096
            NMS_POST_MAXSIZE: 500

OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 1
    NUM_EPOCHS: 40

    OPTIMIZER: adam_onecycle
    LR: 0.001
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001

    LR_WARMUP: False
    WARMUP_EPOCH: 1

    GRAD_NORM_CLIP: 10
Yes, I checked issue #59 and recalculated the voxel size accordingly. The sparse shape matches the pillar settings, but training still throws a NaN or Inf error after multiple epochs.
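(For reference, the recalculated voxel size is just the KITTI point cloud range divided by the target sparse shape; a minimal sketch, not from the repo:)

```python
import numpy as np

# [x_min, y_min, z_min, x_max, y_max, z_max] from the KITTI configs above
pc_range = np.array([0.0, -39.68, -3.0, 69.12, 39.68, 1.0])
sparse_shape = np.array([468, 468, 1])        # DSVT grid size (x, y, z)

voxel_size = (pc_range[3:] - pc_range[:3]) / sparse_shape
print(voxel_size)  # -> [~0.1477, ~0.1696, 4.0], the VOXEL_SIZE used in the configs
```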
May I ask whether you can adapt to the KITTI dataset by simply modifying the config file, without modifying the network? Besides, why doesn't your BACKBONE_3D need downsampling?
If anyone has succeeded on KITTI by modifying the config, please share the corresponding config and experimental results in this issue. We will be very grateful for your contribution to the community. :)
After I finish the CVPR deadline, I'll take a look when I have time. I guess this shouldn't be a very difficult problem.
I guess he used the DSVT-Pillar version, which is why the 3D backbone doesn't need downsampling.
Did you complete the KITTI config? I really need this, thanks!
Any update?
I have a functional config for training on the KITTI dataset:
CLASS_NAMES: ['Car', 'Pedestrian', 'Cyclist']

DATA_CONFIG:
    _BASE_CONFIG_: cfgs/dataset_configs/kitti_dataset.yaml
    POINT_CLOUD_RANGE: [0, -39.68, -3, 69.12, 39.68, 1]
    DATA_PROCESSOR:
        - NAME: mask_points_and_boxes_outside_range
          REMOVE_OUTSIDE_BOXES: True

        - NAME: shuffle_points
          SHUFFLE_ENABLED: {
            'train': True,
            'test': False
          }

        - NAME: transform_points_to_voxels_placeholder
          VOXEL_SIZE: [0.1477, 0.1696, 4]
          MAX_POINTS_PER_VOXEL: 32
          MAX_NUMBER_OF_VOXELS: {
            'train': 16000,
            'test': 40000
          }

    DATA_AUGMENTOR:
        DISABLE_AUG_LIST: ['placeholder']
        AUG_CONFIG_LIST:
            - NAME: gt_sampling
              USE_ROAD_PLANE: True
              DB_INFO_PATH:
                  - kitti_dbinfos_train.pkl
              PREPARE: {
                 filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],
                 filter_by_difficulty: [-1],
              }
              SAMPLE_GROUPS: ['Car:15', 'Pedestrian:15', 'Cyclist:15']
              NUM_POINT_FEATURES: 4
              DATABASE_WITH_FAKELIDAR: False
              REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
              LIMIT_WHOLE_SCENE: False

            - NAME: random_world_flip
              ALONG_AXIS_LIST: ['x']

            - NAME: random_world_rotation
              WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]

            - NAME: random_world_scaling
              WORLD_SCALE_RANGE: [0.95, 1.05]

            - NAME: random_local_pyramid_aug
              DROP_PROB: 0.25
              SPARSIFY_PROB: 0.05
              SPARSIFY_MAX_NUM: 50
              SWAP_PROB: 0.1
              SWAP_MAX_NUM: 50

MODEL:
    NAME: CenterPoint

    VFE:
        NAME: DynPillarVFE3D
        WITH_DISTANCE: False
        USE_ABSLOTE_XYZ: True
        USE_NORM: True
        NUM_FILTERS: [192, 192]

    BACKBONE_3D:
        NAME: DSVT
        INPUT_LAYER:
            sparse_shape: [468, 468, 1]
            downsample_stride: []
            d_model: [192]
            set_info: [[36, 4]]
            window_shape: [[12, 12, 1]]
            hybrid_factor: [2, 2, 1] # x, y, z
            shifts_list: [[[0, 0, 0], [6, 6, 0]]]
            normalize_pos: False

        block_name: ['DSVTBlock']
        set_info: [[36, 4]]
        d_model: [192]
        nhead: [8]
        dim_feedforward: [384]
        dropout: 0.0
        activation: gelu
        output_shape: [468, 468]
        conv_out_channel: 192
        ues_checkpoint: True

    MAP_TO_BEV:
        NAME: PointPillarScatter3d
        INPUT_SHAPE: [468, 468, 1]
        NUM_BEV_FEATURES: 192

    BACKBONE_2D:
        NAME: BaseBEVResBackbone
        LAYER_NUMS: [1, 2, 2]
        LAYER_STRIDES: [1, 2, 2]
        NUM_FILTERS: [128, 128, 256]
        UPSAMPLE_STRIDES: [1, 2, 4]
        NUM_UPSAMPLE_FILTERS: [128, 128, 128]

    DENSE_HEAD:
        NAME: CenterHead
        CLASS_AGNOSTIC: False

        CLASS_NAMES_EACH_HEAD: [
            ['Car', 'Pedestrian', 'Cyclist']
        ]

        SHARED_CONV_CHANNEL: 64
        USE_BIAS_BEFORE_NORM: False
        NUM_HM_CONV: 2

        BN_EPS: 0.001
        BN_MOM: 0.01

        SEPARATE_HEAD_CFG:
            HEAD_ORDER: ['center', 'center_z', 'dim', 'rot']
            HEAD_DICT: {
                'center': {'out_channels': 2, 'num_conv': 2},
                'center_z': {'out_channels': 1, 'num_conv': 2},
                'dim': {'out_channels': 3, 'num_conv': 2},
                'rot': {'out_channels': 2, 'num_conv': 2},
                'iou': {'out_channels': 1, 'num_conv': 2},
            }

        TARGET_ASSIGNER_CONFIG:
            FEATURE_MAP_STRIDE: 1
            NUM_MAX_OBJS: 500
            GAUSSIAN_OVERLAP: 0.1
            MIN_RADIUS: 2
            # BOX_CODER: ResidualCoder

        IOU_REG_LOSS: True

        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }

        POST_PROCESSING:
            # RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
            SCORE_THRESH: 0.1
            OUTPUT_RAW_SCORE: False
            POST_CENTER_LIMIT_RANGE: [0, -40, -3, 80, 40, 1]
            MAX_OBJ_PER_SAMPLE: 500
            # USE_IOU_TO_RECTIFY_SCORE: True
            # IOU_RECTIFIER: [0.5, 0.71, 0.65]
            NMS_CONFIG:
                MULTI_CLASSES_NMS: False
                NMS_TYPE: nms_gpu
                NMS_THRESH: 0.01
                NMS_PRE_MAXSIZE: 4096
                NMS_POST_MAXSIZE: 500

    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        EVAL_METRIC: kitti

OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 2
    NUM_EPOCHS: 80

    OPTIMIZER: adam_onecycle
    LR: 0.001
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001

    LR_WARMUP: False
    WARMUP_EPOCH: 1

    GRAD_NORM_CLIP: 10
    LOSS_SCALE_FP16: 32.0
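(Before training, it can help to sanity-check that the range/voxel-size/sparse-shape triplet in a config like the one above is self-consistent; a sketch using OpenPCDet's config loader, where the YAML path is an assumption:)

```python
import numpy as np
from pcdet.config import cfg, cfg_from_yaml_file

# The path is an assumption; point it at wherever the config above is saved.
cfg_from_yaml_file('tools/cfgs/kitti_models/dsvt_pillar_kitti.yaml', cfg)

pc_range = np.array(cfg.DATA_CONFIG.POINT_CLOUD_RANGE)
voxel_proc = next(p for p in cfg.DATA_CONFIG.DATA_PROCESSOR
                  if p.NAME.startswith('transform_points_to_voxels'))
grid = np.round((pc_range[3:] - pc_range[:3]) / np.array(voxel_proc.VOXEL_SIZE)).astype(int)

# grid should match BACKBONE_3D.INPUT_LAYER.sparse_shape (and MAP_TO_BEV
# INPUT_SHAPE); a mismatch is a common source of shape errors and NaN losses.
print('grid from range / voxel size:', grid)
print('sparse_shape in config      :', cfg.MODEL.BACKBONE_3D.INPUT_LAYER.sparse_shape)
```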
And I got the results below:
Generate label finished(sec_per_example: 0.0629 second).
recall_roi_0.3: 0.000000
recall_rcnn_0.3: 0.939800
recall_roi_0.5: 0.000000
recall_rcnn_0.5: 0.888598
recall_roi_0.7: 0.000000
recall_rcnn_0.7: 0.669268
Average predicted number of objects(3769 samples): 12.283
Car AP@0.70, 0.70, 0.70:
bbox AP:95.2198, 89.4526, 88.8744
bev AP:89.3524, 87.4650, 86.4434
3d AP:87.1649, 77.6970, 76.9884
aos AP:95.20, 89.33, 88.69
Car AP_R40@0.70, 0.70, 0.70:
bbox AP:97.0173, 94.1119, 91.8572
bev AP:92.0481, 88.3356, 87.8141
3d AP:87.8954, 80.9305, 78.6656
aos AP:97.00, 93.96, 91.65
Car AP@0.70, 0.50, 0.50:
bbox AP:95.2198, 89.4526, 88.8744
bev AP:95.2554, 89.6240, 89.1787
3d AP:95.1996, 89.5775, 89.0910
aos AP:95.20, 89.33, 88.69
Car AP_R40@0.70, 0.50, 0.50:
bbox AP:97.0173, 94.1119, 91.8572
bev AP:97.2790, 94.6162, 94.2047
3d AP:97.2439, 94.5121, 94.0162
aos AP:97.00, 93.96, 91.65
Pedestrian AP@0.50, 0.50, 0.50:
bbox AP:68.9796, 66.8907, 64.9734
bev AP:58.0059, 55.0643, 52.5829
3d AP:52.8691, 51.4204, 47.9159
aos AP:64.75, 62.16, 59.99
Pedestrian AP_R40@0.50, 0.50, 0.50:
bbox AP:69.8867, 67.2526, 64.8893
bev AP:56.3069, 53.6334, 50.8015
3d AP:51.9532, 49.1637, 45.9747
aos AP:65.05, 62.00, 59.43
Pedestrian AP@0.50, 0.25, 0.25:
bbox AP:68.9796, 66.8907, 64.9734
bev AP:75.4927, 73.8959, 71.9835
3d AP:74.6756, 72.9769, 71.1328
aos AP:64.75, 62.16, 59.99
Pedestrian AP_R40@0.50, 0.25, 0.25:
bbox AP:69.8867, 67.2526, 64.8893
bev AP:76.3137, 74.6665, 72.3919
3d AP:75.3678, 73.5785, 71.5206
aos AP:65.05, 62.00, 59.43
Cyclist AP@0.50, 0.50, 0.50:
bbox AP:88.9667, 77.5716, 74.3384
bev AP:86.9765, 71.5262, 67.4459
3d AP:85.9338, 69.3215, 66.2503
aos AP:88.85, 77.08, 73.75
Cyclist AP_R40@0.50, 0.50, 0.50:
bbox AP:93.4071, 78.7487, 75.0949
bev AP:91.3305, 71.9404, 67.9715
3d AP:88.3232, 69.5222, 66.1419
aos AP:93.27, 78.19, 74.49
Cyclist AP@0.50, 0.25, 0.25:
bbox AP:88.9667, 77.5716, 74.3384
bev AP:87.2510, 74.5043, 70.9506
3d AP:87.2510, 74.5037, 70.9506
aos AP:88.85, 77.08, 73.75
Cyclist AP_R40@0.50, 0.25, 0.25:
bbox AP:93.4071, 78.7487, 75.0949
bev AP:91.5098, 75.4892, 71.6936
3d AP:91.5098, 75.4891, 71.6934
aos AP:93.27, 78.19, 74.49
Thanks for your contribution! Very nice!
But I am not familiar with KITTI; may I ask if this performance is acceptable? Thanks! Looking forward to your reply.
If this result turns out to be good, I will tag this issue to make it more accessible for those interested in running DSVT on KITTI. Many thanks!
I adopted the PointPillars settings, and it performs slightly better than PointPillars. I trained the model on a single GPU, so I guess the performance might be further improved with multi-GPU training.
Perhaps some further adjustments can be made; DSVT performs much better than PointPillars on Waymo and nuScenes. At the least, its performance on KITTI should be close to that of MsSVT.
Thanks @dinvincible98, it seems this issue has been resolved to some extent. The issue will be closed.
Thank you all for your contributions and discussions. :)
Hi author, I successfully set up the environment and trained on the KITTI data, but I ran into a problem when converting to an ONNX model. For the path in deploy.py, am I supposed to fill in the pkl file I generated for my dataset? The file I generated is a pkl.
####### read input #######
batch_dict = torch.load("path to batch_dict.pth", map_location="cuda")
inputs = batch_dict
I found the download link for inputdict.pth in the README and loaded the weights I trained on KITTI, but I get the following error:
File "deploy.py", line 134, in
    inputs = model.vfe(inputs)
  File "/home/user/anaconda3/envs/dsvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/cjg/DSVT/pcdet/models/backbones_3d/vfe/dynamic_pillar_vfe.py", line 219, in forward
    features = pfn(features, unq_inv)
  File "/home/user/anaconda3/envs/dsvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/cjg/DSVT/pcdet/models/backbones_3d/vfe/dynamic_pillar_vfe.py", line 37, in forward
    x = self.linear(inputs)
  File "/home/user/anaconda3/envs/dsvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/anaconda3/envs/dsvt/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (61800x11 and 10x96)
I found the problem on my side: the provided point cloud has 6 features per point, while the KITTI data only has 5, so it works after removing the last dimension.
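(A minimal sketch of that workaround, under the assumption that the downloaded batch_dict stores points as [batch_idx, x, y, z, intensity, elongation] while a KITTI-trained model expects one fewer feature column:)

```python
import torch

# Load the README's Waymo-style input dict, then drop the last point feature
# column so the 5-column layout matches what the KITTI-trained VFE expects.
batch_dict = torch.load("path to batch_dict.pth", map_location="cuda")
batch_dict["points"] = batch_dict["points"][:, :5]
inputs = batch_dict  # then continue with deploy.py as before
```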
Hi,
I tried to train a DSVT-Pillar model on the KITTI dataset; below is my config:
I only modified the point cloud range to match the KITTI settings and the voxel size to match the default sparse shape [468, 468, 1], but I constantly get an error:
I traced the error to the DynamicPillarVFE3D module, where batch_dict['points'] often returns an empty tensor. However, when I used the default point cloud range from the Waymo settings, [-74.88, -74.88, -2, 74.88, 74.88, 4.0], the error disappeared. Can you give me some guidance?
Thank you!
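(One way to check whether those empty tensors come from the data pipeline rather than the model is to apply the KITTI range mask offline and count the surviving points per frame; a small sketch, with a placeholder velodyne path:)

```python
import numpy as np

# Offline check: does the KITTI range itself leave frames (nearly) empty?
pc_range = np.array([0.0, -39.68, -3.0, 69.12, 39.68, 1.0])
points = np.fromfile("kitti/training/velodyne/000000.bin",
                     dtype=np.float32).reshape(-1, 4)      # x, y, z, intensity

inside = np.all((points[:, :3] >= pc_range[:3]) &
                (points[:, :3] <= pc_range[3:]), axis=1)
print(f"{inside.sum()} / {len(points)} points inside the KITTI range")
# If most frames survive fine here, the empty tensors reaching the VFE are more
# likely produced by the augmentation/config combination than by the range itself.
```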