[Bug] Saving results using a pretrained PETR model

Prerequisite

[X] I have searched Issues and Discussions but cannot get the expected help.
[X] I have read the FAQ documentation but cannot get the expected help.
[X] The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

sys.platform: linux
Python: 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
CUDA available: False
MUSA available: False
numpy_random_seed: 2147483648
GCC: n/a
PyTorch: 2.1.0
PyTorch compiling details: PyTorch built with:

GCC 9.3
C++ Version: 201703
Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v3.1.1 (Git Hash 64f6bcbcbab628e96f33a62c3e975f8535a7bde4)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.1.0, USE_CUDA=0, USE_CUDNN=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.16.0
OpenCV: 4.9.0
MMEngine: 0.10.4
MMDetection: 3.3.0
MMDetection3D: 1.4.0+962f093
spconv2.0: False

Reproduces the problem - code sample

Link to the code that produces the error

Reproduces the problem - command or script

python tools/test.py projects/PETR/configs/petr_vovnet_gridmask_p4_800x320.py checkpoints/petr_vovnet_gridmask_p4_800x320-e2191752.pth --show --show-dir results --task multi-view_det

Reproduces the problem - error message

python tools/test.py projects/PETR/configs/petr_vovnet_gridmask_p4_800x320.py checkpoints/petr_vovnet_gridmask_p4_800x320-e2191752.pth --show --show-dir results --task multi-view_det
/bin/sh: 1: gcc: not found
05/01 01:05:09 - mmengine - INFO -

System environment:
sys.platform: linux
Python: 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
CUDA available: False
MUSA available: False
numpy_random_seed: 1
GCC: n/a
PyTorch: 2.1.0
PyTorch compiling details: PyTorch built with:

GCC 9.3
C++ Version: 201703
Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v3.1.1 (Git Hash 64f6bcbcbab628e96f33a62c3e975f8535a7bde4)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.1.0, USE_CUDA=0, USE_CUDNN=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.16.0
OpenCV: 4.9.0
MMEngine: 0.10.4

Runtime environment:
cudnn_benchmark: False
mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
dist_cfg: {'backend': 'nccl'}
seed: 1
deterministic: False
diff_rank_seed: False
Distributed launcher: none
Distributed training: False
GPU number: 1

05/01 01:05:10 - mmengine - INFO - Config: auto_scale_lr = dict(base_batch_size=32, enable=False) backbone_norm_cfg = dict(requires_grad=True, type='LN') backend_args = None class_names = [ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone', ] custom_imports = dict(imports=[ 'projects.PETR.petr', ]) data_prefix = dict(img='', pts='samples/LIDAR_TOP', sweeps='sweeps/LIDAR_TOP') data_root = 'data/nuscenes/' dataset_type = 'NuScenesDataset' db_sampler = dict( backend_args=None, classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone', ], data_root='data/nuscenes/', info_path='data/nuscenes/nuscenes_dbinfos_train.pkl', points_loader=dict( backend_args=None, coord_type='LIDAR', load_dim=5, type='LoadPointsFromFile', use_dim=[ 0, 1, 2, 3, 4, ]), prepare=dict( filter_by_difficulty=[ -1, ], filter_by_min_points=dict( barrier=5, bicycle=5, bus=5, car=5, construction_vehicle=5, motorcycle=5, pedestrian=5, traffic_cone=5, trailer=5, truck=5)), rate=1.0, sample_groups=dict( barrier=2, bicycle=6, bus=4, car=2, construction_vehicle=7, motorcycle=6, pedestrian=2, traffic_cone=2, trailer=6, truck=3)) default_hooks = dict( checkpoint=dict(interval=-1, type='CheckpointHook'), logger=dict(interval=50, type='LoggerHook'), param_scheduler=dict(type='ParamSchedulerHook'), sampler_seed=dict(type='DistSamplerSeedHook'), timer=dict(type='IterTimerHook'), visualization=dict( draw=True, score_thr=0.1, show=True, test_out_dir='results', type='Det3DVisualizationHook', vis_task='multi-view_det', wait_time=2)) default_scope = 'mmdet3d' env_cfg = dict( cudnn_benchmark=False, dist_cfg=dict(backend='nccl'), mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0)) eval_pipeline = [ dict( backend_args=None, coord_type='LIDAR', load_dim=5, type='LoadPointsFromFile', use_dim=5), dict( backend_args=None, sweeps_num=10, test_mode=True, 
type='LoadPointsFromMultiSweeps'), dict(keys=[ 'points', ], type='Pack3DDetInputs'), ] find_unused_parameters = False ida_aug_conf = dict( H=900, W=1600, bot_pct_lim=( 0.0, 0.0, ), final_dim=( 320, 800, ), rand_flip=True, resize_lim=( 0.47, 0.625, ), rot_lim=( 0.0, 0.0, )) img_norm_cfg = dict( mean=[ 103.53, 116.28, 123.675, ], std=[ 57.375, 57.12, 58.395, ], to_rgb=False) input_modality = dict(use_camera=True, use_lidar=True) launcher = 'none' load_from = 'checkpoints/petr_vovnet_gridmask_p4_800x320-e2191752.pth' log_level = 'INFO' log_processor = dict(by_epoch=True, type='LogProcessor', window_size=50) lr = 0.0001 metainfo = dict(classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone', ]) model = dict( data_preprocessor=dict( bgr_to_rgb=False, mean=[ 103.53, 116.28, 123.675, ], pad_size_divisor=32, std=[ 57.375, 57.12, 58.395, ], type='Det3DDataPreprocessor'), img_backbone=dict( frozen_stages=-1, input_ch=3, norm_eval=True, out_features=( 'stage4', 'stage5', ), spec_name='V-99-eSE', type='VoVNetCP'), img_neck=dict( in_channels=[ 768, 1024, ], num_outs=2, out_channels=256, type='CPFPN'), pts_bbox_head=dict( LID=True, bbox_coder=dict( max_num=300, num_classes=10, pc_range=[ -51.2, -51.2, -5.0, 51.2, 51.2, 3.0, ], post_center_range=[ -61.2, -61.2, -10.0, 61.2, 61.2, 10.0, ], type='NMSFreeCoder', voxel_size=[ 0.2, 0.2, 8, ]), in_channels=256, loss_bbox=dict(loss_weight=0.25, type='mmdet.L1Loss'), loss_cls=dict( alpha=0.25, gamma=2.0, loss_weight=2.0, type='mmdet.FocalLoss', use_sigmoid=True), loss_iou=dict(loss_weight=0.0, type='mmdet.GIoULoss'), normedlinear=False, num_classes=10, num_query=900, position_range=[ -61.2, -61.2, -10.0, 61.2, 61.2, 10.0, ], positional_encoding=dict( normalize=True, num_feats=128, type='SinePositionalEncoding3D'), transformer=dict( decoder=dict( num_layers=6, return_intermediate=True, transformerlayers=dict( attn_cfgs=[ dict( attn_drop=0.1, 
dropout_layer=dict(drop_prob=0.1, type='Dropout'), embed_dims=256, num_heads=8, type='MultiheadAttention'), dict( attn_drop=0.1, dropout_layer=dict(drop_prob=0.1, type='Dropout'), embed_dims=256, num_heads=8, type='PETRMultiheadAttention'), ], feedforward_channels=2048, ffn_dropout=0.1, operation_order=( 'self_attn', 'norm', 'cross_attn', 'norm', 'ffn', 'norm', ), type='PETRTransformerDecoderLayer'), type='PETRTransformerDecoder'), type='PETRTransformer'), type='PETRHead', with_multiview=True, with_position=True), train_cfg=dict( pts=dict( assigner=dict( cls_cost=dict(type='FocalLossCost', weight=2.0), iou_cost=dict(type='IoUCost', weight=0.0), pc_range=[ -51.2, -51.2, -5.0, 51.2, 51.2, 3.0, ], reg_cost=dict(type='BBox3DL1Cost', weight=0.25), type='HungarianAssigner3D'), grid_size=[ 512, 512, 1, ], out_size_factor=4, point_cloud_range=[ -51.2, -51.2, -5.0, 51.2, 51.2, 3.0, ], voxel_size=[ 0.2, 0.2, 8, ])), type='PETR', use_grid_mask=True) num_epochs = 24 optim_wrapper = dict( clip_grad=dict(max_norm=35, norm_type=2), optimizer=dict(lr=0.0002, type='AdamW', weight_decay=0.01), paramwise_cfg=dict(custom_keys=dict(img_backbone=dict(lr_mult=0.1))), type='OptimWrapper') param_scheduler = [ dict( begin=0, by_epoch=False, end=500, start_factor=0.3333333333333333, type='LinearLR'), dict(T_max=24, by_epoch=True, type='CosineAnnealingLR'), ] point_cloud_range = [ -51.2, -51.2, -5.0, 51.2, 51.2, 3.0, ] randomness = dict(deterministic=False, diff_rank_seed=False, seed=1) resume = False test_cfg = dict() test_dataloader = dict( batch_size=1, dataset=dict( ann_file='nuscenes_infos_val.pkl', backend_args=None, box_type_3d='LiDAR', data_prefix=dict( CAM_BACK='samples/CAM_BACK', CAM_BACK_LEFT='samples/CAM_BACK_LEFT', CAM_BACK_RIGHT='samples/CAM_BACK_RIGHT', CAM_FRONT='samples/CAM_FRONT', CAM_FRONT_LEFT='samples/CAM_FRONT_LEFT', CAM_FRONT_RIGHT='samples/CAM_FRONT_RIGHT', img='', pts='samples/LIDAR_TOP', sweeps='sweeps/LIDAR_TOP'), data_root='data/nuscenes/', metainfo=dict(classes=[ 
'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone', ]), modality=dict(use_camera=True, use_lidar=True), pipeline=[ dict( backend_args=None, to_float32=True, type='LoadMultiViewImageFromFiles'), dict( data_aug_conf=dict( H=900, W=1600, bot_pct_lim=( 0.0, 0.0, ), final_dim=( 320, 800, ), rand_flip=True, resize_lim=( 0.47, 0.625, ), rot_lim=( 0.0, 0.0, )), training=False, type='ResizeCropFlipImage'), dict(keys=[ 'img', ], type='Pack3DDetInputs'), ], test_mode=True, type='NuScenesDataset', use_valid_flag=True), drop_last=False, num_workers=1, persistent_workers=True, sampler=dict(shuffle=False, type='DefaultSampler')) test_evaluator = dict( ann_file='data/nuscenes/nuscenes_infos_val.pkl', backend_args=None, data_root='data/nuscenes/', metric='bbox', type='NuScenesMetric') test_pipeline = [ dict( backend_args=None, to_float32=True, type='LoadMultiViewImageFromFiles'), dict( data_aug_conf=dict( H=900, W=1600, bot_pct_lim=( 0.0, 0.0, ), final_dim=( 320, 800, ), rand_flip=True, resize_lim=( 0.47, 0.625, ), rot_lim=( 0.0, 0.0, )), training=False, type='ResizeCropFlipImage'), dict(keys=[ 'img', ], type='Pack3DDetInputs'), ] train_cfg = dict(by_epoch=True, max_epochs=24, val_interval=24) train_dataloader = dict( batch_size=1, dataset=dict( ann_file='nuscenes_infos_train.pkl', backend_args=None, box_type_3d='LiDAR', data_prefix=dict( CAM_BACK='samples/CAM_BACK', CAM_BACK_LEFT='samples/CAM_BACK_LEFT', CAM_BACK_RIGHT='samples/CAM_BACK_RIGHT', CAM_FRONT='samples/CAM_FRONT', CAM_FRONT_LEFT='samples/CAM_FRONT_LEFT', CAM_FRONT_RIGHT='samples/CAM_FRONT_RIGHT', img='', pts='samples/LIDAR_TOP', sweeps='sweeps/LIDAR_TOP'), data_root='data/nuscenes/', metainfo=dict(classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone', ]), modality=dict(use_camera=True, use_lidar=True), pipeline=[ dict( backend_args=None, to_float32=True, 
type='LoadMultiViewImageFromFiles'), dict( type='LoadAnnotations3D', with_attr_label=False, with_bbox_3d=True, with_label_3d=True), dict( point_cloud_range=[ -51.2, -51.2, -5.0, 51.2, 51.2, 3.0, ], type='ObjectRangeFilter'), dict( classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone', ], type='ObjectNameFilter'), dict( data_aug_conf=dict( H=900, W=1600, bot_pct_lim=( 0.0, 0.0, ), final_dim=( 320, 800, ), rand_flip=True, resize_lim=( 0.47, 0.625, ), rot_lim=( 0.0, 0.0, )), training=True, type='ResizeCropFlipImage'), dict( reverse_angle=False, rot_range=[ -0.3925, 0.3925, ], scale_ratio_range=[ 0.95, 1.05, ], training=True, translation_std=[ 0, 0, 0, ], type='GlobalRotScaleTransImage'), dict( keys=[ 'img', 'gt_bboxes', 'gt_bboxes_labels', 'attr_labels', 'gt_bboxes_3d', 'gt_labels_3d', 'centers_2d', 'depths', ], type='Pack3DDetInputs'), ], test_mode=False, type='NuScenesDataset', use_valid_flag=True), num_workers=4, persistent_workers=True, sampler=dict(shuffle=True, type='DefaultSampler')) train_pipeline = [ dict( backend_args=None, to_float32=True, type='LoadMultiViewImageFromFiles'), dict( type='LoadAnnotations3D', with_attr_label=False, with_bbox_3d=True, with_label_3d=True), dict( point_cloud_range=[ -51.2, -51.2, -5.0, 51.2, 51.2, 3.0, ], type='ObjectRangeFilter'), dict( classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone', ], type='ObjectNameFilter'), dict( data_aug_conf=dict( H=900, W=1600, bot_pct_lim=( 0.0, 0.0, ), final_dim=( 320, 800, ), rand_flip=True, resize_lim=( 0.47, 0.625, ), rot_lim=( 0.0, 0.0, )), training=True, type='ResizeCropFlipImage'), dict( reverse_angle=False, rot_range=[ -0.3925, 0.3925, ], scale_ratio_range=[ 0.95, 1.05, ], training=True, translation_std=[ 0, 0, 0, ], type='GlobalRotScaleTransImage'), dict( keys=[ 'img', 'gt_bboxes', 'gt_bboxes_labels', 'attr_labels', 'gt_bboxes_3d', 
'gt_labels_3d', 'centers_2d', 'depths', ], type='Pack3DDetInputs'), ] val_cfg = dict() val_dataloader = dict( batch_size=1, dataset=dict( ann_file='nuscenes_infos_val.pkl', backend_args=None, box_type_3d='LiDAR', data_prefix=dict( CAM_BACK='samples/CAM_BACK', CAM_BACK_LEFT='samples/CAM_BACK_LEFT', CAM_BACK_RIGHT='samples/CAM_BACK_RIGHT', CAM_FRONT='samples/CAM_FRONT', CAM_FRONT_LEFT='samples/CAM_FRONT_LEFT', CAM_FRONT_RIGHT='samples/CAM_FRONT_RIGHT', img='', pts='samples/LIDAR_TOP', sweeps='sweeps/LIDAR_TOP'), data_root='data/nuscenes/', metainfo=dict(classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone', ]), modality=dict(use_camera=True, use_lidar=True), pipeline=[ dict( backend_args=None, to_float32=True, type='LoadMultiViewImageFromFiles'), dict( data_aug_conf=dict( H=900, W=1600, bot_pct_lim=( 0.0, 0.0, ), final_dim=( 320, 800, ), rand_flip=True, resize_lim=( 0.47, 0.625, ), rot_lim=( 0.0, 0.0, )), training=False, type='ResizeCropFlipImage'), dict(keys=[ 'img', ], type='Pack3DDetInputs'), ], test_mode=True, type='NuScenesDataset', use_valid_flag=True), drop_last=False, num_workers=1, persistent_workers=True, sampler=dict(shuffle=False, type='DefaultSampler')) val_evaluator = dict( ann_file='data/nuscenes/nuscenes_infos_val.pkl', backend_args=None, data_root='data/nuscenes/', metric='bbox', type='NuScenesMetric') vis_backends = [ dict(type='LocalVisBackend'), ] visualizer = dict( name='visualizer', type='Det3DLocalVisualizer', vis_backends=[ dict(type='LocalVisBackend'), ]) voxel_size = [ 0.2, 0.2, 8, ] work_dir = './work_dirs/petr_vovnet_gridmask_p4_800x320'

05/01 01:05:13 - mmengine - INFO - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used.
/home/user/mmdetection3d/mmdet3d/engine/hooks/visualization_hook.py:75: UserWarning: The show is True, it means that only the prediction results are visualized without storing data, so vis_backends needs to be excluded.
  warnings.warn('The show is True, it means that only '
05/01 01:05:13 - mmengine - INFO - Autoplay mode, press [SPACE] to pause.
05/01 01:05:13 - mmengine - INFO - Hooks will be executed in the following order:

before_run: (VERY_HIGH ) RuntimeInfoHook
(BELOW_NORMAL) LoggerHook

before_train: (VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(VERY_LOW ) CheckpointHook

before_train_epoch: (VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(NORMAL ) DistSamplerSeedHook

before_train_iter: (VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook

after_train_iter: (VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook

after_train_epoch: (NORMAL ) IterTimerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook

before_val: (VERY_HIGH ) RuntimeInfoHook

before_val_epoch: (NORMAL ) IterTimerHook

before_val_iter: (NORMAL ) IterTimerHook

after_val_iter: (NORMAL ) IterTimerHook
(NORMAL ) Det3DVisualizationHook
(BELOW_NORMAL) LoggerHook

after_val_epoch: (VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook

after_val: (VERY_HIGH ) RuntimeInfoHook

after_train: (VERY_HIGH ) RuntimeInfoHook
(VERY_LOW ) CheckpointHook

before_test: (VERY_HIGH ) RuntimeInfoHook

before_test_epoch: (NORMAL ) IterTimerHook

before_test_iter: (NORMAL ) IterTimerHook

after_test_iter: (NORMAL ) IterTimerHook
(NORMAL ) Det3DVisualizationHook
(BELOW_NORMAL) LoggerHook

after_test_epoch: (VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook

after_test: (VERY_HIGH ) RuntimeInfoHook

after_run: (BELOW_NORMAL) LoggerHook

05/01 01:05:27 - mmengine - INFO - ------------------------------
05/01 01:05:27 - mmengine - INFO - The length of test dataset: 6019
05/01 01:05:27 - mmengine - INFO - The number of instances per category in the dataset:
+----------------------+--------+
| category             | number |
+----------------------+--------+
| car                  | 80004  |
| truck                | 15704  |
| construction_vehicle | 2678   |
| bus                  | 3158   |
| trailer              | 4159   |
| barrier              | 26992  |
| motorcycle           | 2508   |
| bicycle              | 2381   |
| pedestrian           | 34347  |
| traffic_cone         | 15597  |
+----------------------+--------+
/home/user/mmdetection3d/mmdet3d/evaluation/functional/kitti_utils/eval.py:10: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def get_thresholds(scores: np.ndarray, num_gt, num_sample_pts=41):
Loads checkpoint by local backend from path: checkpoints/petr_vovnet_gridmask_p4_800x320-e2191752.pth
05/01 01:05:28 - mmengine - INFO - Load checkpoint from checkpoints/petr_vovnet_gridmask_p4_800x320-e2191752.pth
/home/user/miniconda3/envs/venv/lib/python3.8/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1695391896527/work/aten/src/ATen/native/TensorShape.cpp:3526.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Traceback (most recent call last):
  File "tools/test.py", line 149, in <module>
    main()
  File "tools/test.py", line 145, in main
    runner.test()
  File "/home/user/miniconda3/envs/venv/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1823, in test
    metrics = self.test_loop.run()  # type: ignore
  File "/home/user/miniconda3/envs/venv/lib/python3.8/site-packages/mmengine/runner/loops.py", line 445, in run
    self.run_iter(idx, data_batch)
  File "/home/user/miniconda3/envs/venv/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/miniconda3/envs/venv/lib/python3.8/site-packages/mmengine/runner/loops.py", line 466, in run_iter
    self.runner.call_hook(
  File "/home/user/miniconda3/envs/venv/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1839, in call_hook
    getattr(hook, fn_name)(self, **kwargs)
  File "/home/user/mmdetection3d/mmdet3d/engine/hooks/visualization_hook.py", line 228, in after_test_iter
    self._visualizer.add_datasample(
  File "/home/user/miniconda3/envs/venv/lib/python3.8/site-packages/mmengine/dist/utils.py", line 427, in wrapper
    return func(*args, **kwargs)
  File "/home/user/mmdetection3d/mmdet3d/visualization/local_visualizer.py", line 1034, in add_datasample
    pred_instances_3d = pred_instances_3d[
  File "/home/user/miniconda3/envs/venv/lib/python3.8/site-packages/mmengine/structures/instance_data.py", line 201, in __getitem__
    assert len(item) == len(self), 'The shape of the ' \
AssertionError: The shape of the input(BoolTensor) 300 does not match the shape of the indexed tensor in results_field 0 at first dimension.
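
As far as I can tell from the message, the assertion is a plain length check: the score mask has 300 entries while the container being indexed reports length 0. The sketch below reproduces only that check in pure Python. `TinyInstances` and `try_select` are hypothetical names for illustration and are not mmengine's `InstanceData`; the sketch does not explain why the real container reports length 0.

```python
from typing import Sequence


class TinyInstances:
    """Hypothetical stand-in for a per-sample prediction container."""

    def __init__(self, scores: Sequence[float]) -> None:
        self.scores = list(scores)

    def __len__(self) -> int:
        return len(self.scores)

    def select(self, mask: Sequence[bool]) -> "TinyInstances":
        # Same kind of guard as in the traceback: the boolean mask must
        # have exactly one entry per stored instance.
        assert len(mask) == len(self), (
            f"mask length {len(mask)} does not match "
            f"container length {len(self)}")
        kept = [s for s, keep in zip(self.scores, mask) if keep]
        return TinyInstances(kept)


def try_select(instances: TinyInstances, mask: Sequence[bool]) -> str:
    """Apply the mask and report either the kept count or the failure."""
    try:
        return f"kept {len(instances.select(mask))}"
    except AssertionError as err:
        return f"AssertionError: {err}"


mask = [True] * 300  # one flag per predicted box, as in the error message

# A container that reports length 0 fails the guard ...
empty_result = try_select(TinyInstances([]), mask)
# ... while one with 300 entries is filtered without problems.
full_result = try_select(TinyInstances([0.5] * 300), mask)

print(empty_result)
print(full_result)
```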

Additional information

I am trying to run a pretrained PETR model for testing/inference to produce bounding boxes for the NuScenes dataset. Because I don't have a GUI available, I want to save the results to a file. With the default PETR config and the checkpoint available here, I was unable to save them: the command above fails with the AssertionError shown. The test run itself works and prints the metrics at the end when I do not try to save the results (that is, without --show-dir). I also tried the Inference API to create bounding boxes for a sample image, but it doesn't seem to support multi-view 3D detection.
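
To make the goal concrete, this is roughly the kind of file I would like to end up with. It is only a sketch of the desired output, not mmdetection3d code: `dump_boxes`, the JSON layout, and the example values are all hypothetical, and the 0.1 threshold simply mirrors `score_thr=0.1` from the config dump above.

```python
import json
import tempfile
from pathlib import Path
from typing import Sequence


def dump_boxes(path: Path,
               sample_id: str,
               boxes: Sequence[Sequence[float]],
               scores: Sequence[float],
               labels: Sequence[int],
               score_thr: float = 0.1) -> int:
    """Write boxes scoring above score_thr to a JSON file.

    Returns the number of boxes that were kept.
    """
    records = [
        {
            "box": [float(v) for v in box],
            "score": float(score),
            "label": int(label),
        }
        for box, score, label in zip(boxes, scores, labels)
        if score > score_thr
    ]
    payload = {"sample_id": sample_id, "boxes": records}
    Path(path).write_text(json.dumps(payload, indent=2))
    return len(records)


# Made-up example values, not real model output.
out_path = Path(tempfile.mkdtemp()) / "sample_0.json"
kept = dump_boxes(
    out_path,
    "sample_0",
    boxes=[[0.0] * 7, [1.0] * 7],
    scores=[0.9, 0.05],
    labels=[0, 8],
)

# Read the file back to confirm what was written.
loaded = json.loads(out_path.read_text())
print(kept, loaded["boxes"])
```

One JSON file per sample would be enough for my purposes; any existing format that the project already supports would be just as good.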

Any recommendations on how to properly save bounding boxes/results from a pretrained PETR model?
