hustvl / MapTR

[ICLR'23 Spotlight & IJCV'24] MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
MIT License

In the get_bev_features function, the variable bev_embed is assigned as a string type, while ret_dict is actually a tensor. What should I do? #176

Open yuanryann opened 3 months ago

yuanryann commented 3 months ago

2024-06-14 09:46:00,707 - mmdet - INFO - Hooks will be executed in the following order:
before_run: (VERY_HIGH ) CosineAnnealingLrUpdaterHook
(ABOVE_NORMAL) Fp16OptimizerHook
(NORMAL ) CheckpointHook
(NORMAL ) CustomDistEvalHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook


before_train_epoch: (VERY_HIGH ) CosineAnnealingLrUpdaterHook
(NORMAL ) DistSamplerSeedHook
(NORMAL ) CustomDistEvalHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook


before_train_iter: (VERY_HIGH ) CosineAnnealingLrUpdaterHook
(NORMAL ) CustomDistEvalHook
(LOW ) IterTimerHook


after_train_iter: (ABOVE_NORMAL) Fp16OptimizerHook
(NORMAL ) CheckpointHook
(NORMAL ) CustomDistEvalHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook


after_train_epoch: (NORMAL ) CheckpointHook
(NORMAL ) CustomDistEvalHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook


before_val_epoch: (NORMAL ) DistSamplerSeedHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook


before_val_iter: (LOW ) IterTimerHook


after_val_iter: (LOW ) IterTimerHook


after_val_epoch: (VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook


after_run: (VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook


2024-06-14 09:46:00,708 - mmdet - INFO - workflow: [('train', 1)], max: 24 epochs
2024-06-14 09:46:00,709 - mmdet - INFO - Checkpoints will be saved to /home/com0179/AI/MapTR/work_dirs/maptr_tiny_r50_24e by HardDiskBackend.
/home/com0179/AI/MapTR/projects/mmdet3d_plugin/models/utils/grid_mask.py:114: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:180.)
  mask = torch.from_numpy(mask).to(x.dtype).cuda()
ret_dict: tensor([[[ 0.1889, -0.1615, 2.1812, ..., -1.0996, -1.4039, 0.6895],
     [ 1.4819, -0.6015, 0.8437, ..., -0.6222, -0.7078, 0.7483],
     [ 0.4329, 0.2124, 1.4304, ..., -1.9791, -0.9732, 0.6492],
     ...,
     [-0.3204, -0.4688, 0.5317, ..., -1.9080, -0.5561, 0.6536],
     [ 0.4254, -0.1113, 1.2542, ..., -1.9874, -0.6516, 1.0486],
     [-0.2349, 0.8355, 0.9105, ..., -1.3129, 0.1006, 1.3759]],

    [[-0.2733,  0.0749,  0.9204,  ...,  0.9150, -0.3261,  0.0139],
     [ 1.3868, -0.3957,  0.8588,  ..., -1.4051, -0.0948,  0.3878],
     [ 0.8097,  0.7675,  0.6791,  ..., -0.4050, -0.3664, -0.3884],
     ...,
     [-1.0428, -0.7296,  0.3283,  ..., -2.0839, -0.6283,  1.3728],
     [-0.5850, -0.4228,  0.1651,  ..., -1.4061, -0.2002,  0.2984],
     [-0.8431,  1.0897,  0.4802,  ..., -1.9049, -0.2679,  1.8028]],

    [[ 0.7818, -0.6220,  1.4299,  ..., -1.4584, -2.0435,  0.2221],
     [ 1.0930, -0.2832,  0.5768,  ..., -0.3528, -0.5643,  0.1527],
     [ 0.7040, -0.0652,  1.5784,  ..., -1.1005, -0.4832, -0.1628],
     ...,
     [-0.7733, -1.2431,  0.6865,  ..., -2.4375, -0.8437,  1.2103],
     [-0.0844, -0.8666,  1.0173,  ..., -1.3839, -0.5428,  0.8602],
     [-0.2918,  0.1805,  0.2343,  ..., -0.1657, -0.3963,  1.7632]],

    [[ 0.8106,  0.2636,  1.1491,  ..., -0.6950, -0.6393,  0.6001],
     [ 1.6005, -0.2310,  1.1513,  ..., -0.4952, -0.2108,  0.5619],
     [ 0.4873,  0.1370,  0.7079,  ..., -0.9651, -0.5468,  0.6746],
     ...,
     [-0.8568, -1.1599,  0.2693,  ..., -2.6332, -1.6124,  1.2802],
     [ 0.1471,  0.2384,  0.8299,  ..., -1.7544, -0.6352,  1.3663],
     [ 0.3371,  1.3895,  0.4540,  ..., -1.4025, -0.7343,  1.7416]]],
   device='cuda:0', grad_fn=<NativeLayerNormBackward>)

Traceback (most recent call last):
  File "./tools/train.py", line 259, in <module>
    main()
  File "./tools/train.py", line 248, in main
    custom_train_model(
  File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/bevformer/apis/train.py", line 27, in custom_train_model
    custom_train_detector(
  File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/bevformer/apis/mmdet_train.py", line 199, in custom_train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
    outputs = self.model.train_step(data_batch, self.optimizer,
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/parallel/distributed.py", line 52, in train_step
    output = self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(**data)
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/detectors/maptr.py", line 162, in forward
    return self.forward_train(**kwargs)
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/detectors/maptr.py", line 277, in forward_train
    losses_pts = self.forward_pts_train(img_feats, lidar_feat, gt_bboxes_3d,
  File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/detectors/maptr.py", line 141, in forward_pts_train
    outs = self.pts_bbox_head(
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/dense_heads/maptr_head.py", line 254, in forward
    outputs = self.transformer(
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/modules/transformer.py", line 372, in forward
    ouput_dic = self.get_bev_features(
  File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/modules/transformer.py", line 267, in get_bev_features
    if 'bev' in ret_dict:
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/_tensor.py", line 670, in __contains__
    raise RuntimeError(
RuntimeError: Tensor.__contains__ only supports Tensor or scalar, but you passed in a <class 'str'>.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1712135) of binary: /home/com0179/anaconda3/envs/MapTR/bin/python3
Traceback (most recent call last):
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/run.py", line 689, in run
    elastic_launch(
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 116, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 244, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:


    ./tools/train.py FAILED        

=======================================

ADMIN-RyanLin commented 3 months ago

I have run into a similar problem. Make sure the MapTR versions match: for v2, use the v2 scripts.
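
For context on the traceback: the check `if 'bev' in ret_dict:` only works when `ret_dict` is a dict, whereas the printed `ret_dict` is a bare tensor, and a tensor rejects a string membership test. A rough sketch of a type guard follows. It is not the repository's fix (matching the code and script versions is), the helper name `extract_bev` is hypothetical, and a nested list stands in for the tensor so the snippet runs without PyTorch.

```python
from collections.abc import Mapping


def extract_bev(ret):
    """Return BEV features from either a dict-style or a bare return value.

    Hypothetical helper for illustration only; not part of MapTR.
    """
    if isinstance(ret, Mapping):
        # Dict-style return: the features are stored under the 'bev' key.
        return ret['bev']
    # Bare return (e.g. a tensor): the value is already the features.
    return ret


# A nested list stands in for a torch.Tensor here (assumption for the demo).
features = [[0.1889, -0.1615], [1.4819, -0.6015]]

from_dict = extract_bev({'bev': features})
from_bare = extract_bev(features)
print(from_dict is features, from_bare is features)
```

A real `torch.Tensor` would take the second branch as well, since it is not a `Mapping`. A guard like this only hides the mismatch, so aligning the versions as suggested above is still the proper solution.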