Bin-ze / BEVFormer_segmentation_detection

Implemented BEVFormer support for BEV segmentation

Training problem #26

Closed wuhen777 closed 2 months ago

wuhen777 commented 3 months ago

When I run this command: ./tools/dist_train.sh ./projects/configs/bevformer/bevformer_small.py 1, I hit the problem below.

/root/miniconda3/envs/bev/lib/python3.8/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated and will be removed in future. Use torchrun. Note that --use_env is set by default in torchrun. If your script expects --local_rank argument to be set, please change it to read from os.environ['LOCAL_RANK'] instead. See https://pytorch.org/docs/stable/distributed.html#launch-utility for further instructions

warnings.warn(
projects.mmdet3d_plugin
fatal: not a git repository (or any parent up to mount point /root)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2024-03-27 21:21:23,402 - mmdet - INFO - Environment info:

sys.platform: linux
Python: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.3.r11.3/compiler.29920130_0
GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
PyTorch: 1.10.0+cu113
PyTorch compiling details: PyTorch built with:

TorchVision: 0.11.0+cu113
OpenCV: 4.8.1
MMCV: 1.4.0
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.3
MMDetection: 2.14.0
MMSegmentation: 0.14.1
MMDetection3D: 0.17.1+

2024-03-27 21:21:24,735 - mmdet - INFO - Distributed training: True 2024-03-27 21:21:26,086 - mmdet - INFO - Config: point_cloud_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0] class_names = [ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ] dataset_type = 'CustomNuScenesDataset' data_root = 'data/nuscenes/' input_modality = dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, use_external=True) file_client_args = dict(backend='disk') train_pipeline = [ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict(type='PhotoMetricDistortionMultiViewImage'), dict( type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True, with_attr_label=False), dict( type='ObjectRangeFilter', point_cloud_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]), dict( type='ObjectNameFilter', classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ]), dict( type='NormalizeMultiviewImage', mean=[103.53, 116.28, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False), dict(type='RandomScaleImageMultiViewImage', scales=[0.8]), dict(type='PadMultiViewImage', size_divisor=32), dict( type='DefaultFormatBundle3D', class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ]), dict(type='CustomCollect3D', keys=['gt_bboxes_3d', 'gt_labels_3d', 'img']) ] test_pipeline = [ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='NormalizeMultiviewImage', mean=[103.53, 116.28, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False), dict( type='MultiScaleFlipAug3D', img_scale=(1600, 900), pts_scale_ratio=1, flip=False, transforms=[ dict(type='RandomScaleImageMultiViewImage', scales=[0.8]), dict(type='PadMultiViewImage', size_divisor=32), dict( type='DefaultFormatBundle3D', class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], with_label=False), dict(type='CustomCollect3D', keys=['img']) ]) ] eval_pipeline = [ dict( type='LoadPointsFromFile', coord_type='LIDAR', load_dim=5, use_dim=5, file_client_args=dict(backend='disk')), dict( type='LoadPointsFromMultiSweeps', sweeps_num=10, file_client_args=dict(backend='disk')), dict( type='DefaultFormatBundle3D', class_names=[ 'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle', 'motorcycle', 'pedestrian', 'traffic_cone', 'barrier' ], with_label=False), dict(type='Collect3D', keys=['points']) ] data = dict( samples_per_gpu=1, workers_per_gpu=0, train=dict( type='CustomNuScenesDataset', data_root='data/nuscenes/', ann_file='data/nuscenes/nuscenes_infos_temporal_train.pkl', pipeline=[ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict(type='PhotoMetricDistortionMultiViewImage'), dict( type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True, with_attr_label=False), dict( type='ObjectRangeFilter', point_cloud_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]), dict( type='ObjectNameFilter', classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ]), dict( type='NormalizeMultiviewImage', mean=[103.53, 116.28, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False), dict(type='RandomScaleImageMultiViewImage', scales=[0.8]), dict(type='PadMultiViewImage', size_divisor=32), dict( type='DefaultFormatBundle3D', class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 
'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ]), dict( type='CustomCollect3D', keys=['gt_bboxes_3d', 'gt_labels_3d', 'img']) ], classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], modality=dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, use_external=True), test_mode=False, box_type_3d='LiDAR', use_valid_flag=True, bev_size=(150, 150), queue_length=3), val=dict( type='CustomNuScenesDataset', ann_file='data/nuscenes/nuscenes_infos_temporal_val.pkl', pipeline=[ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='NormalizeMultiviewImage', mean=[103.53, 116.28, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False), dict( type='MultiScaleFlipAug3D', img_scale=(1600, 900), pts_scale_ratio=1, flip=False, transforms=[ dict(type='RandomScaleImageMultiViewImage', scales=[0.8]), dict(type='PadMultiViewImage', size_divisor=32), dict( type='DefaultFormatBundle3D', class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], with_label=False), dict(type='CustomCollect3D', keys=['img']) ]) ], classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], modality=dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, use_external=True), test_mode=True, box_type_3d='LiDAR', data_root='data/nuscenes/', bev_size=(150, 150), samples_per_gpu=1), test=dict( type='CustomNuScenesDataset', data_root='data/nuscenes/', ann_file='data/nuscenes/nuscenes_infos_temporal_val.pkl', pipeline=[ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='NormalizeMultiviewImage', mean=[103.53, 116.28, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False), dict( type='MultiScaleFlipAug3D', img_scale=(1600, 900), pts_scale_ratio=1, flip=False, transforms=[ dict(type='RandomScaleImageMultiViewImage', scales=[0.8]), dict(type='PadMultiViewImage', size_divisor=32), dict( type='DefaultFormatBundle3D', class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], with_label=False), dict(type='CustomCollect3D', keys=['img']) ]) ], classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], modality=dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, use_external=True), test_mode=True, box_type_3d='LiDAR', bev_size=(150, 150)), shuffler_sampler=dict(type='DistributedGroupSampler'), nonshuffler_sampler=dict(type='DistributedSampler')) evaluation = dict( interval=1, pipeline=[ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='NormalizeMultiviewImage', mean=[103.53, 116.28, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False), dict( type='MultiScaleFlipAug3D', img_scale=(1600, 900), pts_scale_ratio=1, flip=False, transforms=[ dict(type='RandomScaleImageMultiViewImage', scales=[0.8]), dict(type='PadMultiViewImage', size_divisor=32), dict( type='DefaultFormatBundle3D', class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], with_label=False), dict(type='CustomCollect3D', keys=['img']) ]) ]) checkpoint_config = dict(interval=1) log_config = dict( interval=50, hooks=[dict(type='TextLoggerHook'), dict(type='TensorboardLoggerHook')]) dist_params = 
dict(backend='nccl') log_level = 'INFO' work_dir = './work_dirs/bevformer_small' load_from = 'ckpts/r101_dcn_fcos3d_pretrain.pth' resume_from = None workflow = [('train', 1)] plugin = True plugin_dir = 'projects/mmdet3d_plugin/' voxel_size = [0.2, 0.2, 8] img_norm_cfg = dict( mean=[103.53, 116.28, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False) dim = 256 _posdim = 128 _ffndim = 512 _numlevels = 1 bevh = 150 bevw = 150 queue_length = 3 model = dict( type='BEVFormer', use_grid_mask=True, video_test_mode=True, img_backbone=dict( type='ResNet', depth=101, num_stages=4, out_indices=(3, ), frozen_stages=1, norm_cfg=dict(type='BN2d', requires_grad=False), norm_eval=True, style='caffe', with_cp=True, dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False), stage_with_dcn=(False, False, True, True)), img_neck=dict( type='FPN', in_channels=[2048], out_channels=256, start_level=0, add_extra_convs='on_output', num_outs=1, relu_before_extra_convs=True), pts_bbox_head=dict( type='BEVFormerHead', bev_h=150, bev_w=150, num_query=900, num_classes=10, in_channels=256, sync_cls_avg_factor=True, with_box_refine=True, as_two_stage=False, transformer=dict( type='PerceptionTransformer', rotate_prev_bev=True, use_shift=True, use_can_bus=True, embed_dims=256, encoder=dict( type='BEVFormerEncoder', num_layers=3, pc_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0], num_points_in_pillar=4, return_intermediate=False, transformerlayers=dict( type='BEVFormerLayer', attn_cfgs=[ dict( type='TemporalSelfAttention', embed_dims=256, num_levels=1), dict( type='SpatialCrossAttention', pc_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0], deformable_attention=dict( type='MSDeformableAttention3D', embed_dims=256, num_points=8, num_levels=1), embed_dims=256) ], feedforward_channels=512, ffn_dropout=0.1, operation_order=('self_attn', 'norm', 'cross_attn', 'norm', 'ffn', 'norm'))), decoder=dict( type='DetectionTransformerDecoder', num_layers=6, return_intermediate=True, transformerlayers=dict( type='DetrTransformerDecoderLayer', attn_cfgs=[ dict( type='MultiheadAttention', embed_dims=256, num_heads=8, dropout=0.1), dict( type='CustomMSDeformableAttention', embed_dims=256, num_levels=1) ], feedforward_channels=512, ffn_dropout=0.1, operation_order=('self_attn', 'norm', 'cross_attn', 'norm', 'ffn', 'norm')))), bbox_coder=dict( type='NMSFreeCoder', post_center_range=[-61.2, -61.2, -10.0, 61.2, 61.2, 10.0], pc_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0], max_num=300, voxel_size=[0.2, 0.2, 8], num_classes=10), positional_encoding=dict( type='LearnedPositionalEncoding', num_feats=128, row_num_embed=150, col_num_embed=150), loss_cls=dict( type='FocalLoss', use_sigmoid=True, gamma=2.0, alpha=0.25, loss_weight=2.0), loss_bbox=dict(type='L1Loss', loss_weight=0.25), loss_iou=dict(type='GIoULoss', loss_weight=0.0)), train_cfg=dict( pts=dict( grid_size=[512, 512, 1], voxel_size=[0.2, 0.2, 8], point_cloud_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0], out_size_factor=4, assigner=dict( type='HungarianAssigner3D', cls_cost=dict(type='FocalLossCost', weight=2.0), reg_cost=dict(type='BBox3DL1Cost', weight=0.25), iou_cost=dict(type='IoUCost', weight=0.0), pc_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0])))) optimizer = dict( type='AdamW', lr=0.0002, paramwise_cfg=dict(custom_keys=dict(img_backbone=dict(lr_mult=0.1))), weight_decay=0.01) optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2)) lr_config = dict( policy='CosineAnnealing', warmup='linear', warmup_iters=500, warmup_ratio=0.3333333333333333, min_lr_ratio=0.001) total_epochs = 4 runner = 
dict(type='EpochBasedRunner', max_epochs=4) gpu_ids = range(0, 1)

2024-03-27 21:21:26,087 - mmdet - INFO - Set random seed to 0, deterministic: True
Traceback (most recent call last):
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmcv/utils/registry.py", line 52, in build_from_cfg
    return obj_cls(**args)
  File "/root/autodl-tmp/BEVFormer/projects/mmdet3d_plugin/bevformer/dense_heads/bevformer_head.py", line 156, in __init__
    super(BEVFormerHead, self).__init__(
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmdet/models/dense_heads/detr_head.py", line 149, in __init__
    self._init_layers()
  File "/root/autodl-tmp/BEVFormer/projects/mmdet3d_plugin/bevformer/dense_heads/bevformer_head.py", line 171, in _init_layers
    if self.task.get('det'):
AttributeError: 'NoneType' object has no attribute 'get'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmcv/utils/registry.py", line 52, in build_from_cfg
    return obj_cls(**args)
  File "/root/autodl-tmp/BEVFormer/projects/mmdet3d_plugin/bevformer/detectors/bevformer.py", line 46, in __init__
    super(BEVFormer,
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/models/detectors/mvx_two_stage.py", line 61, in __init__
    self.pts_bbox_head = builder.build_head(pts_bbox_head)
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/models/builder.py", line 39, in build_head
    return HEADS.build(cfg)
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmcv/utils/registry.py", line 212, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmcv/utils/registry.py", line 55, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
AttributeError: BEVFormerHead: 'NoneType' object has no attribute 'get'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./tools/train.py", line 262, in <module>
    main()
  File "./tools/train.py", line 218, in main
    model = build_model(
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/models/builder.py", line 84, in build_model
    return build_detector(cfg, train_cfg=train_cfg, test_cfg=test_cfg)
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/models/builder.py", line 57, in build_detector
    return DETECTORS.build(
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmcv/utils/registry.py", line 212, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/mmcv/utils/registry.py", line 55, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
AttributeError: BEVFormer: BEVFormerHead: 'NoneType' object has no attribute 'get'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2143) of binary: /root/miniconda3/envs/bev/bin/python
Traceback (most recent call last):
  File "/root/miniconda3/envs/bev/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/envs/bev/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/torch/distributed/run.py", line 710, in run
    elastic_launch(
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/miniconda3/envs/bev/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 259, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

./tools/train.py FAILED

Failures:

------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-03-27_21:21:27
  host      : autodl-container-ad7811803c-f508d49d
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 2143)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

The error says it is related to building the model. Searching with GPT suggests it may be related to some attribute or method of the BEVFormerHead or BEVFormer class. Could you tell me how to solve this?
Bin-ze commented 3 months ago

In ./tools/dist_train.sh ./projects/configs/bevformer/bevformer_small.py 1, choose a different config file here. Pick the one that matches the task you want to run, e.g. segmentation only, or segmentation + detection, and run the corresponding config files pointed to in my README.
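For context on why the detection-only config fails: the traceback shows _init_layers calling self.task.get('det'), so this fork's BEVFormerHead apparently expects a task dict in the head config that the stock bevformer_small.py never defines, leaving self.task as None. A hypothetical fragment of what such a head config might carry is sketched below; the task key is taken from the traceback, but the det/seg value layout is an assumption, so use the actual config files the README points to rather than this sketch.

# Hypothetical illustration only -- the real settings live in the configs referenced in the
# README (e.g. bevformer_small_seg_det.py); the det/seg layout below is an assumption.
pts_bbox_head = dict(
    type='BEVFormerHead',
    bev_h=150,
    bev_w=150,
    num_query=900,
    num_classes=10,
    in_channels=256,
    # Without a field like this, self.task stays None and _init_layers crashes with
    # "AttributeError: 'NoneType' object has no attribute 'get'".
    task=dict(det=True, seg=True),
)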

wuhen777 commented 3 months ago

I have another question for you. I want to reproduce your map segmentation figures, so I first run ./tools/dist_train.sh ./projects/configs/bevformer/bevformer_small_seg_det.py 1 for training, then ./tools/dist_test.sh ./projects/configs/bevformer/bevformer_small_seg_det.py work_dirs/bevformer_small_seg_det/latest.pth 1 for testing/evaluation. When I then run python visual_det_seg.py, I hit a problem with pred_seg_path = '/home/binze/work_code/BEVFormer/visual_small_seg_det': I don't know what I should change this to for my own project directory. When I run the command it says I don't have a visual_small_seg_det folder, and if I create a visual_small_seg_det folder under BEVFormer and run the visualization again, I get:

Traceback (most recent call last):
  File "visual_det_seg.py", line 537, in <module>
    render_sample_data(sample_token_list[id], pred_data=bevformer_results, out_path=f"result_seg_det/{sample_token_list[id]}", seg_list=seg_list)
  File "visual_det_seg.py", line 460, in render_sample_data
    assert sample_toekn + '.png' in seg_list, '分割图必须存在!'
AssertionError: 分割图必须存在! (the segmentation map must exist!)

Could you tell me where these generated segmentation maps are?

Bin-ze commented 3 months ago

Running ./tools/dist_test.sh ./projects/configs/bevformer/bevformer_small_seg_det.py work_dirs/bevformer_small_seg_det/latest.pth 1 for testing/evaluation saves the segmentation results in BEVFormer/visual_small_seg_det; please check whether that folder exists. Then change '/home/binze/work_code/BEVFormer/visual_small_seg_det' to your own path.
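If it is unclear whether the test/debug step actually produced the per-sample PNGs, a small check along these lines can help. It mirrors the assertion in visual_det_seg.py (one '<sample_token>.png' per sample inside pred_seg_path); the path and the empty token list are placeholders to fill in.

# Sanity-check sketch: verify that every sample token you want to visualize has a
# matching '<sample_token>.png' inside pred_seg_path.
import os

pred_seg_path = '/path/to/BEVFormer/visual_small_seg_det'  # change to your own output directory
sample_tokens = []  # fill with the sample tokens you intend to visualize

if not os.path.isdir(pred_seg_path):
    raise FileNotFoundError(f'{pred_seg_path} does not exist; the segmentation maps were not saved there')

seg_list = os.listdir(pred_seg_path)
missing = [t for t in sample_tokens if t + '.png' not in seg_list]
print(f'{len(seg_list)} files found, {len(missing)} tokens without a segmentation map')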

wuhen777 commented 3 months ago

After I run ./tools/dist_test.sh ./projects/configs/bevformer/bevformer_small_seg_det.py work_dirs/bevformer_small_seg_det/latest.pth 1, it only generates a segmentation_result.json file in the current directory and does not create a visual_small_seg_det folder. If I then create a visual_small_seg_det folder in the current directory myself, move the segmentation_result.json file into it, and change the path, running the visualization code still gives the "segmentation map must exist" error. What should I do?

Bin-ze commented 3 months ago

You need to run python debug_test.py ./projects/configs/bevformer/bevformer_small_seg_det.py
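Putting the exchange together, the step that (per this thread) actually writes the per-sample segmentation PNGs is debug_test.py, and visual_det_seg.py only reads them afterwards. A minimal sketch of that ordering, using the exact commands quoted above and assuming both scripts are run from the repository root:

# Order of steps as described in this thread (sketch only).
import subprocess

config = './projects/configs/bevformer/bevformer_small_seg_det.py'
subprocess.run(['python', 'debug_test.py', config], check=True)   # writes the per-sample segmentation PNGs
subprocess.run(['python', 'visual_det_seg.py'], check=True)       # reads them via pred_seg_path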

wuhen777 commented 3 months ago

Is the final visualization result supposed to look like this? The clearer one is the ground-truth map segmentation and the all-blue one is the prediction (could that be because I only trained for 1 epoch, with very little training, on the mini dataset?). Also, where does this ground-truth data come from? The visualizations in the original paper also show ground truth; is that ground truth obtained from the nuScenes lidar data?

Bin-ze commented 3 months ago

Try downloading my weights and testing with them. I suspect your model has not converged, so most of the predictions are wrong.

wuhen777 commented 3 months ago

This is the result after training for 24 epochs without using your pretrained weights. I will try downloading your pretrained weights next.

Bin-ze commented 3 months ago

It should be a matter of data scale; I think if you train on the full dataset it will converge quickly. In my experiments, training the segmentation branch alone reached over 40% mAP by the fourth epoch, exceeding the accuracy reported in the HDMapNet paper.