Open abadithela opened 7 months ago
Running the following command from the command line does not produce the expected result either:
python tools/test.py projects/BEVFusion/configs/bevfusion_lidar-cam_voxel0075_second_secfpn_8xb4-cyclic-20e_nus-3d.py checkpoints/bevfusion_converted.pth --cfg-options "test_evaluator.pklfile_prefix=/home/apurvabadithela/nuscenes_dataset/inference_results/bevfusion_model/results.pkl" --task 'multi-modality_det'
Instead, I get the following error:
04/22 12:07:56 - mmengine - INFO -
------------------------------------------------------------
System environment:
sys.platform: linux
Python: 3.8.16 (default, Jan 17 2023, 23:13:24) [GCC 11.2.0]
CUDA available: True
numpy_random_seed: 2089837194
GPU 0,1: NVIDIA RTX A6000
CUDA_HOME: /home/apurvabadithela/miniconda3/envs/detection
NVCC: Cuda compilation tools, release 11.7, V11.7.99
GCC: gcc (Ubuntu 10.5.0-1ubuntu1~22.04) 10.5.0
PyTorch: 1.13.1
PyTorch compiling details: PyTorch built with:
- GCC 9.3
- C++ Version: 201402
- Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 11.7
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
- CuDNN 8.5
- Magma 2.6.1
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
TorchVision: 0.14.1
OpenCV: 4.7.0
MMEngine: 0.9.1
Runtime environment:
cudnn_benchmark: False
mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
dist_cfg: {'backend': 'nccl'}
seed: 2089837194
Distributed launcher: none
Distributed training: False
GPU number: 1
------------------------------------------------------------
04/22 12:07:56 - mmengine - INFO - Config:
auto_scale_lr = dict(base_batch_size=32, enable=False)
backend_args = None
class_names = [
'car',
'truck',
'construction_vehicle',
'bus',
'trailer',
'barrier',
'motorcycle',
'bicycle',
'pedestrian',
'traffic_cone',
]
custom_imports = dict(
allow_failed_imports=False, imports=[
'projects.BEVFusion.bevfusion',
])
data_prefix = dict(
CAM_BACK='samples/CAM_BACK',
CAM_BACK_LEFT='samples/CAM_BACK_LEFT',
CAM_BACK_RIGHT='samples/CAM_BACK_RIGHT',
CAM_FRONT='samples/CAM_FRONT',
CAM_FRONT_LEFT='samples/CAM_FRONT_LEFT',
CAM_FRONT_RIGHT='samples/CAM_FRONT_RIGHT',
pts='samples/LIDAR_TOP',
sweeps='sweeps/LIDAR_TOP')
data_root = 'data/nuscenes/'
dataset_type = 'NuScenesDataset'
db_sampler = dict(
classes=[
'car',
'truck',
'construction_vehicle',
'bus',
'trailer',
'barrier',
'motorcycle',
'bicycle',
'pedestrian',
'traffic_cone',
],
data_root='data/nuscenes/',
info_path='data/nuscenes/nuscenes_dbinfos_train.pkl',
points_loader=dict(
backend_args=None,
coord_type='LIDAR',
load_dim=5,
type='LoadPointsFromFile',
use_dim=[
0,
1,
2,
3,
4,
]),
prepare=dict(
filter_by_difficulty=[
-1,
],
filter_by_min_points=dict(
barrier=5,
bicycle=5,
bus=5,
car=5,
construction_vehicle=5,
motorcycle=5,
pedestrian=5,
traffic_cone=5,
trailer=5,
truck=5)),
rate=1.0,
sample_groups=dict(
barrier=2,
bicycle=6,
bus=4,
car=2,
construction_vehicle=7,
motorcycle=6,
pedestrian=2,
traffic_cone=2,
trailer=6,
truck=3))
default_hooks = dict(
checkpoint=dict(interval=1, type='CheckpointHook'),
logger=dict(interval=50, type='LoggerHook'),
param_scheduler=dict(type='ParamSchedulerHook'),
sampler_seed=dict(type='DistSamplerSeedHook'),
timer=dict(type='IterTimerHook'),
visualization=dict(type='Det3DVisualizationHook'))
default_scope = 'mmdet3d'
env_cfg = dict(
cudnn_benchmark=False,
dist_cfg=dict(backend='nccl'),
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
input_modality = dict(use_camera=True, use_lidar=True)
launcher = 'none'
load_from = 'checkpoints/bevfusion_converted.pth'
log_level = 'INFO'
log_processor = dict(by_epoch=True, type='LogProcessor', window_size=50)
lr = 0.0001
metainfo = dict(classes=[
'car',
'truck',
'construction_vehicle',
'bus',
'trailer',
'barrier',
'motorcycle',
'bicycle',
'pedestrian',
'traffic_cone',
])
model = dict(
bbox_head=dict(
auxiliary=True,
bbox_coder=dict(
code_size=10,
out_size_factor=8,
pc_range=[
-54.0,
-54.0,
],
post_center_range=[
-61.2,
-61.2,
-10.0,
61.2,
61.2,
10.0,
],
score_threshold=0.0,
type='TransFusionBBoxCoder',
voxel_size=[
0.075,
0.075,
]),
bn_momentum=0.1,
common_heads=dict(
center=[
2,
2,
],
dim=[
3,
2,
],
height=[
1,
2,
],
rot=[
2,
2,
],
vel=[
2,
2,
]),
decoder_layer=dict(
cross_attn_cfg=dict(dropout=0.1, embed_dims=128, num_heads=8),
ffn_cfg=dict(
act_cfg=dict(inplace=True, type='ReLU'),
embed_dims=128,
feedforward_channels=256,
ffn_drop=0.1,
num_fcs=2),
norm_cfg=dict(type='LN'),
pos_encoding_cfg=dict(input_channel=2, num_pos_feats=128),
self_attn_cfg=dict(dropout=0.1, embed_dims=128, num_heads=8),
type='TransformerDecoderLayer'),
hidden_channel=128,
in_channels=512,
loss_bbox=dict(
loss_weight=0.25, reduction='mean', type='mmdet.L1Loss'),
loss_cls=dict(
alpha=0.25,
gamma=2.0,
loss_weight=1.0,
reduction='mean',
type='mmdet.FocalLoss',
use_sigmoid=True),
loss_heatmap=dict(
loss_weight=1.0, reduction='mean', type='mmdet.GaussianFocalLoss'),
nms_kernel_size=3,
num_classes=10,
num_decoder_layers=1,
num_proposals=200,
test_cfg=dict(
dataset='nuScenes',
grid_size=[
1440,
1440,
41,
],
nms_type=None,
out_size_factor=8,
pc_range=[
-54.0,
-54.0,
],
voxel_size=[
0.075,
0.075,
]),
train_cfg=dict(
assigner=dict(
cls_cost=dict(
alpha=0.25,
gamma=2.0,
type='mmdet.FocalLossCost',
weight=0.15),
iou_calculator=dict(coordinate='lidar', type='BboxOverlaps3D'),
iou_cost=dict(type='IoU3DCost', weight=0.25),
reg_cost=dict(type='BBoxBEVL1Cost', weight=0.25),
type='HungarianAssigner3D'),
code_weights=[
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
0.2,
0.2,
],
dataset='nuScenes',
gaussian_overlap=0.1,
grid_size=[
1440,
1440,
41,
],
min_radius=2,
out_size_factor=8,
point_cloud_range=[
-54.0,
-54.0,
-5.0,
54.0,
54.0,
3.0,
],
pos_weight=-1,
voxel_size=[
0.075,
0.075,
0.2,
]),
type='TransFusionHead'),
data_preprocessor=dict(
bgr_to_rgb=False,
mean=[
123.675,
116.28,
103.53,
],
pad_size_divisor=32,
std=[
58.395,
57.12,
57.375,
],
type='Det3DDataPreprocessor',
voxelize_cfg=dict(
max_num_points=10,
max_voxels=[
120000,
160000,
],
point_cloud_range=[
-54.0,
-54.0,
-5.0,
54.0,
54.0,
3.0,
],
voxel_size=[
0.075,
0.075,
0.2,
],
voxelize_reduce=True)),
fusion_layer=dict(
in_channels=[
80,
256,
], out_channels=256, type='ConvFuser'),
img_backbone=dict(
attn_drop_rate=0.0,
convert_weights=True,
depths=[
2,
2,
6,
2,
],
drop_path_rate=0.2,
drop_rate=0.0,
embed_dims=96,
init_cfg=dict(
checkpoint=
'https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth',
type='Pretrained'),
mlp_ratio=4,
num_heads=[
3,
6,
12,
24,
],
out_indices=[
1,
2,
3,
],
patch_norm=True,
qk_scale=None,
qkv_bias=True,
type='mmdet.SwinTransformer',
window_size=7,
with_cp=False),
img_neck=dict(
act_cfg=dict(inplace=True, type='ReLU'),
in_channels=[
192,
384,
768,
],
norm_cfg=dict(requires_grad=True, type='BN2d'),
num_outs=3,
out_channels=256,
start_level=0,
type='GeneralizedLSSFPN',
upsample_cfg=dict(align_corners=False, mode='bilinear')),
pts_backbone=dict(
conv_cfg=dict(bias=False, type='Conv2d'),
in_channels=256,
layer_nums=[
5,
5,
],
layer_strides=[
1,
2,
],
norm_cfg=dict(eps=0.001, momentum=0.01, type='BN'),
out_channels=[
128,
256,
],
type='SECOND'),
pts_middle_encoder=dict(
block_type='basicblock',
encoder_channels=(
(
16,
16,
32,
),
(
32,
32,
64,
),
(
64,
64,
128,
),
(
128,
128,
),
),
encoder_paddings=(
(
0,
0,
1,
),
(
0,
0,
1,
),
(
0,
0,
(
1,
1,
0,
),
),
(
0,
0,
),
),
in_channels=5,
norm_cfg=dict(eps=0.001, momentum=0.01, type='BN1d'),
order=(
'conv',
'norm',
'act',
),
sparse_shape=[
1440,
1440,
41,
],
type='BEVFusionSparseEncoder'),
pts_neck=dict(
in_channels=[
128,
256,
],
norm_cfg=dict(eps=0.001, momentum=0.01, type='BN'),
out_channels=[
256,
256,
],
type='SECONDFPN',
upsample_cfg=dict(bias=False, type='deconv'),
upsample_strides=[
1,
2,
],
use_conv_for_no_stride=True),
pts_voxel_encoder=dict(num_features=5, type='HardSimpleVFE'),
type='BEVFusion',
view_transform=dict(
dbound=[
1.0,
60.0,
0.5,
],
downsample=2,
feature_size=[
32,
88,
],
image_size=[
256,
704,
],
in_channels=256,
out_channels=80,
type='DepthLSSTransform',
xbound=[
-54.0,
54.0,
0.3,
],
ybound=[
-54.0,
54.0,
0.3,
],
zbound=[
-10.0,
10.0,
20.0,
]))
optim_wrapper = dict(
clip_grad=dict(max_norm=35, norm_type=2),
optimizer=dict(lr=0.0002, type='AdamW', weight_decay=0.01),
type='OptimWrapper')
param_scheduler = [
dict(
begin=0,
by_epoch=False,
end=500,
start_factor=0.33333333,
type='LinearLR'),
dict(
T_max=6,
begin=0,
by_epoch=True,
convert_to_iter_based=True,
end=6,
eta_min_ratio=0.0001,
type='CosineAnnealingLR'),
dict(
begin=0,
by_epoch=True,
convert_to_iter_based=True,
end=2.4,
eta_min=0.8947368421052632,
type='CosineAnnealingMomentum'),
dict(
begin=2.4,
by_epoch=True,
convert_to_iter_based=True,
end=6,
eta_min=1,
type='CosineAnnealingMomentum'),
]
point_cloud_range = [
-54.0,
-54.0,
-5.0,
54.0,
54.0,
3.0,
]
resume = False
test_cfg = dict()
test_dataloader = dict(
batch_size=1,
dataset=dict(
ann_file='nuscenes_infos_val.pkl',
backend_args=None,
box_type_3d='LiDAR',
data_prefix=dict(
CAM_BACK='samples/CAM_BACK',
CAM_BACK_LEFT='samples/CAM_BACK_LEFT',
CAM_BACK_RIGHT='samples/CAM_BACK_RIGHT',
CAM_FRONT='samples/CAM_FRONT',
CAM_FRONT_LEFT='samples/CAM_FRONT_LEFT',
CAM_FRONT_RIGHT='samples/CAM_FRONT_RIGHT',
pts='samples/LIDAR_TOP',
sweeps='sweeps/LIDAR_TOP'),
data_root='data/nuscenes/',
metainfo=dict(classes=[
'car',
'truck',
'construction_vehicle',
'bus',
'trailer',
'barrier',
'motorcycle',
'bicycle',
'pedestrian',
'traffic_cone',
]),
modality=dict(use_camera=True, use_lidar=True),
pipeline=[
dict(
backend_args=None,
color_type='color',
to_float32=True,
type='BEVLoadMultiViewImageFromFiles'),
dict(
backend_args=None,
coord_type='LIDAR',
load_dim=5,
type='LoadPointsFromFile',
use_dim=5),
dict(
backend_args=None,
load_dim=5,
pad_empty_sweeps=True,
remove_close=True,
sweeps_num=9,
type='LoadPointsFromMultiSweeps',
use_dim=5),
dict(
bot_pct_lim=[
0.0,
0.0,
],
final_dim=[
256,
704,
],
is_train=False,
rand_flip=False,
resize_lim=[
0.48,
0.48,
],
rot_lim=[
0.0,
0.0,
],
type='ImageAug3D'),
dict(
point_cloud_range=[
-54.0,
-54.0,
-5.0,
54.0,
54.0,
3.0,
],
type='PointsRangeFilter'),
dict(
keys=[
'img',
'points',
'gt_bboxes_3d',
'gt_labels_3d',
],
meta_keys=[
'cam2img',
'ori_cam2img',
'lidar2cam',
'lidar2img',
'cam2lidar',
'ori_lidar2img',
'img_aug_matrix',
'box_type_3d',
'sample_idx',
'lidar_path',
'img_path',
'num_pts_feats',
],
type='Pack3DDetInputs'),
],
test_mode=True,
type='NuScenesDataset'),
drop_last=False,
num_workers=4,
persistent_workers=True,
sampler=dict(shuffle=False, type='DefaultSampler'))
test_evaluator = dict(
ann_file='data/nuscenes/nuscenes_infos_val.pkl',
backend_args=None,
data_root='data/nuscenes/',
metric='bbox',
pklfile_prefix=
'/home/apurvabadithela/nuscenes_dataset/inference_results/bevfusion_model/results.pkl',
type='NuScenesMetric')
test_pipeline = [
dict(
backend_args=None,
color_type='color',
to_float32=True,
type='BEVLoadMultiViewImageFromFiles'),
dict(
backend_args=None,
coord_type='LIDAR',
load_dim=5,
type='LoadPointsFromFile',
use_dim=5),
dict(
backend_args=None,
load_dim=5,
pad_empty_sweeps=True,
remove_close=True,
sweeps_num=9,
type='LoadPointsFromMultiSweeps',
use_dim=5),
dict(
bot_pct_lim=[
0.0,
0.0,
],
final_dim=[
256,
704,
],
is_train=False,
rand_flip=False,
resize_lim=[
0.48,
0.48,
],
rot_lim=[
0.0,
0.0,
],
type='ImageAug3D'),
dict(
point_cloud_range=[
-54.0,
-54.0,
-5.0,
54.0,
54.0,
3.0,
],
type='PointsRangeFilter'),
dict(
keys=[
'img',
'points',
'gt_bboxes_3d',
'gt_labels_3d',
],
meta_keys=[
'cam2img',
'ori_cam2img',
'lidar2cam',
'lidar2img',
'cam2lidar',
'ori_lidar2img',
'img_aug_matrix',
'box_type_3d',
'sample_idx',
'lidar_path',
'img_path',
'num_pts_feats',
],
type='Pack3DDetInputs'),
]
train_cfg = dict(by_epoch=True, max_epochs=6, val_interval=1)
train_dataloader = dict(
batch_size=4,
dataset=dict(
dataset=dict(
ann_file='nuscenes_infos_train.pkl',
box_type_3d='LiDAR',
data_prefix=dict(
CAM_BACK='samples/CAM_BACK',
CAM_BACK_LEFT='samples/CAM_BACK_LEFT',
CAM_BACK_RIGHT='samples/CAM_BACK_RIGHT',
CAM_FRONT='samples/CAM_FRONT',
CAM_FRONT_LEFT='samples/CAM_FRONT_LEFT',
CAM_FRONT_RIGHT='samples/CAM_FRONT_RIGHT',
pts='samples/LIDAR_TOP',
sweeps='sweeps/LIDAR_TOP'),
data_root='data/nuscenes/',
metainfo=dict(classes=[
'car',
'truck',
'construction_vehicle',
'bus',
'trailer',
'barrier',
'motorcycle',
'bicycle',
'pedestrian',
'traffic_cone',
]),
modality=dict(use_camera=True, use_lidar=True),
pipeline=[
dict(
backend_args=None,
color_type='color',
to_float32=True,
type='BEVLoadMultiViewImageFromFiles'),
dict(
backend_args=None,
coord_type='LIDAR',
load_dim=5,
type='LoadPointsFromFile',
use_dim=5),
dict(
backend_args=None,
load_dim=5,
pad_empty_sweeps=True,
remove_close=True,
sweeps_num=9,
type='LoadPointsFromMultiSweeps',
use_dim=5),
dict(
type='LoadAnnotations3D',
with_attr_label=False,
with_bbox_3d=True,
with_label_3d=True),
dict(
bot_pct_lim=[
0.0,
0.0,
],
final_dim=[
256,
704,
],
is_train=True,
rand_flip=True,
resize_lim=[
0.38,
0.55,
],
rot_lim=[
-5.4,
5.4,
],
type='ImageAug3D'),
dict(
rot_range=[
-0.78539816,
0.78539816,
],
scale_ratio_range=[
0.9,
1.1,
],
translation_std=0.5,
type='BEVFusionGlobalRotScaleTrans'),
dict(type='BEVFusionRandomFlip3D'),
dict(
point_cloud_range=[
-54.0,
-54.0,
-5.0,
54.0,
54.0,
3.0,
],
type='PointsRangeFilter'),
dict(
point_cloud_range=[
-54.0,
-54.0,
-5.0,
54.0,
54.0,
3.0,
],
type='ObjectRangeFilter'),
dict(
classes=[
'car',
'truck',
'construction_vehicle',
'bus',
'trailer',
'barrier',
'motorcycle',
'bicycle',
'pedestrian',
'traffic_cone',
],
type='ObjectNameFilter'),
dict(
fixed_prob=True,
max_epoch=6,
mode=1,
offset=False,
prob=0.0,
ratio=0.5,
rotate=1,
type='GridMask',
use_h=True,
use_w=True),
dict(type='PointShuffle'),
dict(
keys=[
'points',
'img',
'gt_bboxes_3d',
'gt_labels_3d',
'gt_bboxes',
'gt_labels',
],
meta_keys=[
'cam2img',
'ori_cam2img',
'lidar2cam',
'lidar2img',
'cam2lidar',
'ori_lidar2img',
'img_aug_matrix',
'box_type_3d',
'sample_idx',
'lidar_path',
'img_path',
'transformation_3d_flow',
'pcd_rotation',
'pcd_scale_factor',
'pcd_trans',
'img_aug_matrix',
'lidar_aug_matrix',
'num_pts_feats',
],
type='Pack3DDetInputs'),
],
test_mode=False,
type='NuScenesDataset',
use_valid_flag=True),
type='CBGSDataset'),
num_workers=4,
persistent_workers=True,
sampler=dict(shuffle=True, type='DefaultSampler'))
train_pipeline = [
dict(
backend_args=None,
color_type='color',
to_float32=True,
type='BEVLoadMultiViewImageFromFiles'),
dict(
backend_args=None,
coord_type='LIDAR',
load_dim=5,
type='LoadPointsFromFile',
use_dim=5),
dict(
backend_args=None,
load_dim=5,
pad_empty_sweeps=True,
remove_close=True,
sweeps_num=9,
type='LoadPointsFromMultiSweeps',
use_dim=5),
dict(
type='LoadAnnotations3D',
with_attr_label=False,
with_bbox_3d=True,
with_label_3d=True),
dict(
bot_pct_lim=[
0.0,
0.0,
],
final_dim=[
256,
704,
],
is_train=True,
rand_flip=True,
resize_lim=[
0.38,
0.55,
],
rot_lim=[
-5.4,
5.4,
],
type='ImageAug3D'),
dict(
rot_range=[
-0.78539816,
0.78539816,
],
scale_ratio_range=[
0.9,
1.1,
],
translation_std=0.5,
type='BEVFusionGlobalRotScaleTrans'),
dict(type='BEVFusionRandomFlip3D'),
dict(
point_cloud_range=[
-54.0,
-54.0,
-5.0,
54.0,
54.0,
3.0,
],
type='PointsRangeFilter'),
dict(
point_cloud_range=[
-54.0,
-54.0,
-5.0,
54.0,
54.0,
3.0,
],
type='ObjectRangeFilter'),
dict(
classes=[
'car',
'truck',
'construction_vehicle',
'bus',
'trailer',
'barrier',
'motorcycle',
'bicycle',
'pedestrian',
'traffic_cone',
],
type='ObjectNameFilter'),
dict(
fixed_prob=True,
max_epoch=6,
mode=1,
offset=False,
prob=0.0,
ratio=0.5,
rotate=1,
type='GridMask',
use_h=True,
use_w=True),
dict(type='PointShuffle'),
dict(
keys=[
'points',
'img',
'gt_bboxes_3d',
'gt_labels_3d',
'gt_bboxes',
'gt_labels',
],
meta_keys=[
'cam2img',
'ori_cam2img',
'lidar2cam',
'lidar2img',
'cam2lidar',
'ori_lidar2img',
'img_aug_matrix',
'box_type_3d',
'sample_idx',
'lidar_path',
'img_path',
'transformation_3d_flow',
'pcd_rotation',
'pcd_scale_factor',
'pcd_trans',
'img_aug_matrix',
'lidar_aug_matrix',
'num_pts_feats',
],
type='Pack3DDetInputs'),
]
val_cfg = dict()
val_dataloader = dict(
batch_size=1,
dataset=dict(
ann_file='nuscenes_infos_val.pkl',
backend_args=None,
box_type_3d='LiDAR',
data_prefix=dict(
CAM_BACK='samples/CAM_BACK',
CAM_BACK_LEFT='samples/CAM_BACK_LEFT',
CAM_BACK_RIGHT='samples/CAM_BACK_RIGHT',
CAM_FRONT='samples/CAM_FRONT',
CAM_FRONT_LEFT='samples/CAM_FRONT_LEFT',
CAM_FRONT_RIGHT='samples/CAM_FRONT_RIGHT',
pts='samples/LIDAR_TOP',
sweeps='sweeps/LIDAR_TOP'),
data_root='data/nuscenes/',
metainfo=dict(classes=[
'car',
'truck',
'construction_vehicle',
'bus',
'trailer',
'barrier',
'motorcycle',
'bicycle',
'pedestrian',
'traffic_cone',
]),
modality=dict(use_camera=True, use_lidar=True),
pipeline=[
dict(
backend_args=None,
color_type='color',
to_float32=True,
type='BEVLoadMultiViewImageFromFiles'),
dict(
backend_args=None,
coord_type='LIDAR',
load_dim=5,
type='LoadPointsFromFile',
use_dim=5),
dict(
backend_args=None,
load_dim=5,
pad_empty_sweeps=True,
remove_close=True,
sweeps_num=9,
type='LoadPointsFromMultiSweeps',
use_dim=5),
dict(
bot_pct_lim=[
0.0,
0.0,
],
final_dim=[
256,
704,
],
is_train=False,
rand_flip=False,
resize_lim=[
0.48,
0.48,
],
rot_lim=[
0.0,
0.0,
],
type='ImageAug3D'),
dict(
point_cloud_range=[
-54.0,
-54.0,
-5.0,
54.0,
54.0,
3.0,
],
type='PointsRangeFilter'),
dict(
keys=[
'img',
'points',
'gt_bboxes_3d',
'gt_labels_3d',
],
meta_keys=[
'cam2img',
'ori_cam2img',
'lidar2cam',
'lidar2img',
'cam2lidar',
'ori_lidar2img',
'img_aug_matrix',
'box_type_3d',
'sample_idx',
'lidar_path',
'img_path',
'num_pts_feats',
],
type='Pack3DDetInputs'),
],
test_mode=True,
type='NuScenesDataset'),
drop_last=False,
num_workers=4,
persistent_workers=True,
sampler=dict(shuffle=False, type='DefaultSampler'))
val_evaluator = dict(
ann_file='data/nuscenes/nuscenes_infos_val.pkl',
backend_args=None,
data_root='data/nuscenes/',
metric='bbox',
type='NuScenesMetric')
vis_backends = [
dict(type='LocalVisBackend'),
]
visualizer = dict(
name='visualizer',
type='Det3DLocalVisualizer',
vis_backends=[
dict(type='LocalVisBackend'),
])
voxel_size = [
0.075,
0.075,
0.2,
]
work_dir = './work_dirs/bevfusion_lidar-cam_voxel0075_second_secfpn_8xb4-cyclic-20e_nus-3d'
04/22 12:07:58 - mmengine - INFO - Loads checkpoint by http backend from path: https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth
04/22 12:08:02 - mmengine - INFO - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used.
04/22 12:08:02 - mmengine - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) RuntimeInfoHook
(BELOW_NORMAL) LoggerHook
--------------------
before_train:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(VERY_LOW ) CheckpointHook
--------------------
before_train_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(NORMAL ) DistSamplerSeedHook
--------------------
before_train_iter:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
--------------------
after_train_iter:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook
--------------------
after_train_epoch:
(NORMAL ) IterTimerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook
--------------------
before_val:
(VERY_HIGH ) RuntimeInfoHook
--------------------
before_val_epoch:
(NORMAL ) IterTimerHook
--------------------
before_val_iter:
(NORMAL ) IterTimerHook
--------------------
after_val_iter:
(NORMAL ) IterTimerHook
(NORMAL ) Det3DVisualizationHook
(BELOW_NORMAL) LoggerHook
--------------------
after_val_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook
--------------------
after_val:
(VERY_HIGH ) RuntimeInfoHook
--------------------
after_train:
(VERY_HIGH ) RuntimeInfoHook
(VERY_LOW ) CheckpointHook
--------------------
before_test:
(VERY_HIGH ) RuntimeInfoHook
--------------------
before_test_epoch:
(NORMAL ) IterTimerHook
--------------------
before_test_iter:
(NORMAL ) IterTimerHook
--------------------
after_test_iter:
(NORMAL ) IterTimerHook
(NORMAL ) Det3DVisualizationHook
(BELOW_NORMAL) LoggerHook
--------------------
after_test_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
--------------------
after_test:
(VERY_HIGH ) RuntimeInfoHook
--------------------
after_run:
(BELOW_NORMAL) LoggerHook
--------------------
04/22 12:08:10 - mmengine - INFO - ------------------------------
04/22 12:08:10 - mmengine - INFO - The length of test dataset: 6019
04/22 12:08:10 - mmengine - INFO - The number of instances per category in the dataset:
+----------------------+--------+
| category | number |
+----------------------+--------+
| car | 80004 |
| truck | 15704 |
| construction_vehicle | 2678 |
| bus | 3158 |
| trailer | 4159 |
| barrier | 26992 |
| motorcycle | 2508 |
| bicycle | 2381 |
| pedestrian | 34347 |
| traffic_cone | 15597 |
+----------------------+--------+
/home/apurvabadithela/miniconda3/envs/detection/lib/python3.8/site-packages/mmdet/models/task_modules/builder.py:17: UserWarning: ``build_sampler`` would be deprecated soon, please use ``mmdet.registry.TASK_UTILS.build()``
warnings.warn('``build_sampler`` would be deprecated soon, please use '
/home/apurvabadithela/miniconda3/envs/detection/lib/python3.8/site-packages/mmdet/models/task_modules/builder.py:39: UserWarning: ``build_assigner`` would be deprecated soon, please use ``mmdet.registry.TASK_UTILS.build()``
warnings.warn('``build_assigner`` would be deprecated soon, please use '
/home/apurvabadithela/miniconda3/envs/detection/lib/python3.8/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1670525541702/work/aten/src/ATen/native/TensorShape.cpp:3190.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Traceback (most recent call last):
File "tools/test.py", line 149, in <module>
main()
File "tools/test.py", line 145, in main
runner.test()
File "/home/apurvabadithela/miniconda3/envs/detection/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1816, in test
self._test_loop = self.build_test_loop(self._test_loop) # type: ignore
File "/home/apurvabadithela/miniconda3/envs/detection/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1611, in build_test_loop
loop = TestLoop(
File "/home/apurvabadithela/miniconda3/envs/detection/lib/python3.8/site-packages/mmengine/runner/loops.py", line 413, in __init__
self.evaluator = runner.build_evaluator(evaluator) # type: ignore
File "/home/apurvabadithela/miniconda3/envs/detection/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1318, in build_evaluator
return Evaluator(evaluator) # type: ignore
File "/home/apurvabadithela/miniconda3/envs/detection/lib/python3.8/site-packages/mmengine/evaluator/evaluator.py", line 25, in __init__
self.metrics.append(METRICS.build(metric))
File "/home/apurvabadithela/miniconda3/envs/detection/lib/python3.8/site-packages/mmengine/registry/registry.py", line 570, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "/home/apurvabadithela/miniconda3/envs/detection/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
obj = obj_cls(**args) # type: ignore
TypeError: __init__() got an unexpected keyword argument 'pklfile_prefix'
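The traceback ends in the registry calling `obj_cls(**args)`, so any key in `test_evaluator` that the metric's `__init__` does not declare raises this kind of TypeError. A minimal sketch for checking a config dict against a constructor before launching a run is given here. Note that `unexpected_keys` and `StandInMetric` are hypothetical names introduced for illustration, and the stand-in's parameter list is an assumption, not the real NuScenesMetric signature.

```python
import inspect


def unexpected_keys(cls, cfg):
    """Return the config keys that cls.__init__ would reject as keyword arguments."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter means any extra key is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    # 'type' is consumed by the registry and never reaches the constructor.
    return [key for key in cfg if key != 'type' and key not in params]


class StandInMetric:
    """Hypothetical stand-in; NOT the real NuScenesMetric signature."""

    def __init__(self, data_root, ann_file, metric='bbox', backend_args=None):
        self.data_root = data_root
        self.ann_file = ann_file
        self.metric = metric
        self.backend_args = backend_args


# Evaluator config as it appears in the log above (path shortened).
test_evaluator = dict(
    type='NuScenesMetric',
    ann_file='data/nuscenes/nuscenes_infos_val.pkl',
    backend_args=None,
    data_root='data/nuscenes/',
    metric='bbox',
    pklfile_prefix='results.pkl',
)

rejected = unexpected_keys(StandInMetric, test_evaluator)
print(rejected)  # ['pklfile_prefix']
```

Passing the actual metric class from the installed mmdet3d package instead of the stand-in should list which keys that version's constructor does not accept.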
I am wondering what happens if you run it without the .pkl option, using the command below. Does it save the results in some format?
python tools/test.py projects/BEVFusion/configs/bevfusion_lidar-cam_voxel0075_second_secfpn_8xb4-cyclic-20e_nus-3d.py checkpoints/bevfusion_converted.pth --task 'multi-modality_det'
@VeeranjaneyuluToka Yes, I've tried that, but it does not save individual prediction boxes --- it just creates a .json file with the standard metrics and a data/ folder with visualizations of the bounding boxes. I need the predicted boxes for my analysis.
I have created my own inference runner based on their demo samples (https://github.com/open-mmlab/mmdetection3d/tree/main/demo). It provides a way to visualize and dump the predictions. However, I am working on LiDAR-based 3D detection only. I believe it should also work in the multi-modality case, so I would recommend looking into it.
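A custom runner of this kind needs somewhere to collect its per-sample outputs. As a rough sketch only, assuming the predictions have already been converted to plain Python lists, they could be written to and read back from a pickle file as follows. The helper names and the record fields (`sample_idx`, `boxes_3d`, `scores_3d`, `labels_3d`) are hypothetical and do not reflect the mmdet3d output format.

```python
import pickle
import tempfile
from pathlib import Path


def dump_predictions(predictions, out_path):
    """Write a list of per-sample prediction dicts to a pickle file."""
    out_path = Path(out_path)
    # Create the output folder if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open('wb') as f:
        pickle.dump(predictions, f)
    return out_path


def load_predictions(path):
    """Read the prediction list back from disk."""
    with Path(path).open('rb') as f:
        return pickle.load(f)


# Hypothetical records: one dict per sample, boxes as 7-value lists.
predictions = [
    {'sample_idx': 0,
     'boxes_3d': [[1.0, 2.0, 0.5, 4.2, 1.9, 1.6, 0.1]],
     'scores_3d': [0.87],
     'labels_3d': [0]},
    {'sample_idx': 1, 'boxes_3d': [], 'scores_3d': [], 'labels_3d': []},
]

# A temporary directory keeps the example self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = dump_predictions(predictions, Path(tmp) / 'bevfusion_model' / 'results.pkl')
    restored = load_predictions(path)

print(len(restored), restored[0]['scores_3d'])
```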
Hi @VeeranjaneyuluToka: we did the same for a LiDAR-only 3D detector. But based on the demos, doing this for multi-modality did not work. If you look at the multi-modality demo, it requires each point cloud and all associated images for that sample to be in one folder. I'm not sure how to scale this up and run inference on the entire dataset, especially with BEVFusion.
Hey @abadithela, did you come up with any ideas or solutions for this issue?
Prerequisite
Task
I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.
Branch
main branch https://github.com/open-mmlab/mmdetection3d
Environment
sys.platform: linux
Python: 3.8.16 (default, Jan 17 2023, 23:13:24) [GCC 11.2.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0,1: NVIDIA RTX A6000
CUDA_HOME: /home/apurvabadithela/miniconda3/envs/detection
NVCC: Cuda compilation tools, release 11.7, V11.7.99
GCC: gcc (Ubuntu 10.5.0-1ubuntu1~22.04) 10.5.0
PyTorch: 1.13.1
PyTorch compiling details: PyTorch built with:
TorchVision: 0.14.1
OpenCV: 4.7.0
MMEngine: 0.9.1
MMDetection: 3.2.0
MMDetection3D: 1.4.0+fe25f7a
spconv2.0: True
Reproduces the problem - code sample
I want to save the prediction results from running the project mmdet3d/projects/BEVFusion. The documentation says to add the pklfile_prefix key to test_evaluator, which I do in the config file (config_file) by adding the following line:
test_evaluator.update({'pklfile_prefix':'/home/apurvabadithela/nuscenes_dataset/inference_results/bevfusion_model/results.pkl'})
Reproduces the problem - command or script
Then, I run the following from command line:
Reproduces the problem - error message
And I get the following error message.
Additional information