fundamentalvision / BEVFormer

[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
https://arxiv.org/abs/2203.17270
Apache License 2.0

pre-trained model and some questions #263

Open wuhen777 opened 4 months ago

wuhen777 commented 4 months ago
The same as BEVFormerV1, see DD3D: https://github.com/TRI-ML/dd3d

Originally posted by @htian01 in https://github.com/fundamentalvision/BEVFormer/issues/245#issuecomment-2092776234

After changing the backbone to VoVNet99, I used the weights you mentioned. The mAP values for the first 8 epochs on the mini dataset were all 0, whereas with ResNet50 the first 8 epochs still produced values; they were small, but they were non-zero. I also printed some output shapes in the code. Apart from changing the two parameters samples_per_gpu=2 and workers_per_gpu=4 in bevformer_tiny.py, nothing else was changed.
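For reference, the two changed settings would sit in the `data` dict of an MMDetection-style config. This is only a sketch of that fragment: it assumes the key is spelled `samples_per_gpu` and omits every other entry of `bevformer_tiny.py` (the `train`, `val`, and `test` dataset definitions, among others).

```python
# Fragment of an MMDetection-style config; layout assumed, other keys omitted.
data = dict(
    samples_per_gpu=2,  # batch size per GPU, as reported in this issue
    workers_per_gpu=4,  # dataloader worker processes per GPU, as reported
)
```

With one GPU this gives an effective batch size of 2, which is consistent with the leading dimension of 2 in the VoVNet99 shape printout.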

The weights I selected are the second V2-99 entry in DD3D:

[image]

This is the result of my printout.

ResNet50:
Backbone output shapes: torch.Size([18, 2048, 15, 25])
FPN output shapes: torch.Size([18, 256, 15, 25])
pts_features: [torch.Size([3, 6, 256, 15, 25])]
prev_bev: torch.Size([3, 2500, 256])
bev_embed: torch.Size([2500, 3, 256])
all_cls_scores: torch.Size([6, 3, 900, 10])
all_bbox_preds: torch.Size([6, 3, 900, 10])
enc_cls_scores is None
enc_bbox_preds is None

VoVNet99:
Backbone output shapes: Stage 5: torch.Size([24, 1024, 15, 25])
FPN output shapes: torch.Size([24, 256, 15, 25])
Backbone output shapes: Stage 5: torch.Size([12, 1024, 15, 25])
FPN output shapes: torch.Size([12, 256, 15, 25])
pts_features: [torch.Size([2, 6, 256, 15, 25])]
prev_bev: torch.Size([2, 2500, 256])
bev_embed: torch.Size([2500, 2, 256])
all_cls_scores: torch.Size([6, 2, 900, 10])
all_bbox_preds: torch.Size([6, 2, 900, 10])
enc_cls_scores is None
enc_bbox_preds is None

This is the result of the 10th epoch of training with bevformer_tiny on the mini dataset:

mAP: 0.0001
mATE: 1.0694
mASE: 0.8686
mAOE: 1.0980
mAVE: 0.8908
mAAE: 0.8545
NDS: 0.0387
Eval time: 11.3s

Per-class results:
Object Class            AP      ATE     ASE     AOE     AVE     AAE
car                     0.001   1.303   0.334   1.369   0.297   0.112
truck                   0.000   1.000   1.000   1.000   1.000   1.000
bus                     0.000   1.000   1.000   1.000   1.000   1.000
trailer                 0.000   1.000   1.000   1.000   1.000   1.000
construction_vehicle    0.000   1.000   1.000   1.000   1.000   1.000
pedestrian              0.000   1.201   0.421   1.513   0.830   0.724
motorcycle              0.000   1.000   1.000   1.000   1.000   1.000
bicycle                 0.000   1.000   1.000   1.000   1.000   1.000
traffic_cone            0.000   1.190   0.930   nan     nan     nan
barrier                 0.000   1.000   1.000   1.000   nan     nan

2024-05-15 13:00:07,513 - mmdet - INFO - Exp name: bevformer_tiny.py
2024-05-15 13:00:07,513 - mmdet - INFO - Epoch(val) [10][81]
pts_bbox_NuScenes/car_AP_dist_0.5: 0.0000, pts_bbox_NuScenes/car_AP_dist_1.0: 0.0000, pts_bbox_NuScenes/car_AP_dist_2.0: 0.0000, pts_bbox_NuScenes/car_AP_dist_4.0: 0.0052, pts_bbox_NuScenes/car_trans_err: 1.3027, pts_bbox_NuScenes/car_scale_err: 0.3343, pts_bbox_NuScenes/car_orient_err: 1.3688, pts_bbox_NuScenes/car_vel_err: 0.2967, pts_bbox_NuScenes/car_attr_err: 0.1115,
pts_bbox_NuScenes/mATE: 1.0694, pts_bbox_NuScenes/mASE: 0.8686, pts_bbox_NuScenes/mAOE: 1.0980, pts_bbox_NuScenes/mAVE: 0.8908, pts_bbox_NuScenes/mAAE: 0.8545,
pts_bbox_NuScenes/truck_AP_dist_0.5: 0.0000, pts_bbox_NuScenes/truck_AP_dist_1.0: 0.0000, pts_bbox_NuScenes/truck_AP_dist_2.0: 0.0000, pts_bbox_NuScenes/truck_AP_dist_4.0: 0.0000, pts_bbox_NuScenes/truck_trans_err: 1.0000, pts_bbox_NuScenes/truck_scale_err: 1.0000, pts_bbox_NuScenes/truck_orient_err: 1.0000, pts_bbox_NuScenes/truck_vel_err: 1.0000, pts_bbox_NuScenes/truck_attr_err: 1.0000,
pts_bbox_NuScenes/construction_vehicle_AP_dist_0.5: 0.0000, pts_bbox_NuScenes/construction_vehicle_AP_dist_1.0: 0.0000, pts_bbox_NuScenes/construction_vehicle_AP_dist_2.0: 0.0000, pts_bbox_NuScenes/construction_vehicle_AP_dist_4.0: 0.0000, pts_bbox_NuScenes/construction_vehicle_trans_err: 1.0000, pts_bbox_NuScenes/construction_vehicle_scale_err: 1.0000, pts_bbox_NuScenes/construction_vehicle_orient_err: 1.0000, pts_bbox_NuScenes/construction_vehicle_vel_err: 1.0000, pts_bbox_NuScenes/construction_vehicle_attr_err: 1.0000,
pts_bbox_NuScenes/bus_AP_dist_0.5: 0.0000, pts_bbox_NuScenes/bus_AP_dist_1.0: 0.0000, pts_bbox_NuScenes/bus_AP_dist_2.0: 0.0000, pts_bbox_NuScenes/bus_AP_dist_4.0: 0.0000, pts_bbox_NuScenes/bus_trans_err: 1.0000, pts_bbox_NuScenes/bus_scale_err: 1.0000, pts_bbox_NuScenes/bus_orient_err: 1.0000, pts_bbox_NuScenes/bus_vel_err: 1.0000, pts_bbox_NuScenes/bus_attr_err: 1.0000,
pts_bbox_NuScenes/trailer_AP_dist_0.5: 0.0000, pts_bbox_NuScenes/trailer_AP_dist_1.0: 0.0000, pts_bbox_NuScenes/trailer_AP_dist_2.0: 0.0000, pts_bbox_NuScenes/trailer_AP_dist_4.0: 0.0000, pts_bbox_NuScenes/trailer_trans_err: 1.0000, pts_bbox_NuScenes/trailer_scale_err: 1.0000, pts_bbox_NuScenes/trailer_orient_err: 1.0000, pts_bbox_NuScenes/trailer_vel_err: 1.0000, pts_bbox_NuScenes/trailer_attr_err: 1.0000,
pts_bbox_NuScenes/barrier_AP_dist_0.5: 0.0000, pts_bbox_NuScenes/barrier_AP_dist_1.0: 0.0000, pts_bbox_NuScenes/barrier_AP_dist_2.0: 0.0000, pts_bbox_NuScenes/barrier_AP_dist_4.0: 0.0000, pts_bbox_NuScenes/barrier_trans_err: 1.0000, pts_bbox_NuScenes/barrier_scale_err: 1.0000, pts_bbox_NuScenes/barrier_orient_err: 1.0000, pts_bbox_NuScenes/barrier_vel_err: nan, [...]
pts_bbox_NuScenes/motorcycle_AP_dist_0.5: 0.0000, pts_bbox_NuScenes/motorcycle_AP_dist_1.0: 0.0000, pts_bbox_NuScenes/motorcycle_AP_dist_2.0: 0.0000, pts_bbox_NuScenes/motorcycle_AP_dist_4.0: 0.0000, pts_bbox_NuScenes/motorcycle_trans_err: 1.0000, pts_bbox_NuScenes/motorcycle_scale_err: 1.0000, pts_bbox_NuScenes/motorcycle_orient_err: 1.0000, pts_bbox_NuScenes/motorcycle_vel_err: 1.0000, pts_bbox_NuScenes/motorcycle_attr_err [...]
[...] pts_bbox_NuScenes/bicycle_AP_dist_1.0: 0.0000, pts_bbox_NuScenes/bicycle_AP_dist_2.0: 0.0000, pts_bbox_NuScenes/bicycle_AP_dist_4.0: 0.0000, pts_bbox_NuScenes/bicycle_trans_err: 1.0000, pts_bbox_NuScenes/bicycle_scale_err: 1.0000, [...] pts_bbox_NuScenes/bicycle_attr_err: 1.0000,
pts_bbox_NuScenes/pedestrian_AP_dist_0.5: 0.0000, pts_bbox_NuScenes/pedestrian_AP_dist_1.0: 0.0000, pts_bbox_NuScenes/pedestrian_AP_dist_2.0: 0.0000, pts_bbox_NuScenes/pedestrian_AP_dist_4.0: 0.0000, pts_bbox_NuScenes/pedestrian_trans_err: 1.2012, pts_bbox_NuScenes/pedestrian_scale_err: 0.4211, pts_bbox_NuScenes/pedestrian_orient_err: 1.5132, pts_bbox_NuScenes/pedestrian_vel_err: 0.8300, pts_bbox_NuScenes/pedestrian_attr_err: 0.7244,
pts_bbox_NuScenes/traffic_cone_AP_dist_0.5: 0.0000, pts_bbox_NuScenes/traffic_cone_AP_dist_1.0: 0.0000, pts_bbox_NuScenes/traffic_cone_AP_dist_2.0: 0.0000, pts_bbox_NuScenes/traffic_cone_AP_dist_4.0: 0.0000, pts_bbox_NuScenes/traffic_cone_trans_err: 1.1904, pts_bbox_NuScenes/traffic_cone_scale_err: 0.9302, pts_bbox_NuScenes/traffic_cone_orient_err: nan, pts_bbox_NuScenes/traffic_cone_vel_err: nan, pts_bbox_NuScenes/traffic_cone_attr_err: nan,
pts_bbox_NuScenes/NDS: 0.0387, pts_bbox_NuScenes/mAP: 0.0001
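As a sanity check on the numbers above, the reported NDS can be recomputed from the mean metrics with the standard nuScenes detection score, NDS = (5 * mAP + sum over the five TP errors of (1 - min(1, error))) / 10. The sketch below uses only the epoch-10 values from this issue; the function and variable names are illustrative and are not taken from the BEVFormer code.

```python
# Mean metrics copied from the epoch-10 evaluation reported in this issue.
mean_ap = 0.0001
mean_tp_errors = {
    "mATE": 1.0694,
    "mASE": 0.8686,
    "mAOE": 1.0980,
    "mAVE": 0.8908,
    "mAAE": 0.8545,
}


def nuscenes_detection_score(mean_ap, tp_errors):
    """NDS = (5 * mAP + sum(1 - min(1, err))) / 10 over the five TP errors."""
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors.values()]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


nds = nuscenes_detection_score(mean_ap, mean_tp_errors)
print(f"NDS = {nds:.4f}")  # prints "NDS = 0.0387"
```

The mAP term contributes only 0.0005 of the 0.3866 numerator, so the NDS of 0.0387 comes almost entirely from the scale, velocity, and attribute error terms rather than from correct detections.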

The code screenshot looks like this:

[image]

I would like to know where exactly the problem lies. I look forward to your reply. Thank you very much.

wuhen777 commented 4 months ago

[image] [image]