Owen-Liuyuxuan / visualDet3D

Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving / YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection
https://owen-liuyuxuan.github.io/papers_reading_sharing.github.io/3dDetection/GroundAwareConvultion/
Apache License 2.0

When I run "./launchers/eval.sh config/yolo_stereo.py 0 checkpoint/Stereo3D_latest.pth validation" or run on test data #61

Closed monsters-s closed 2 years ago

monsters-s commented 2 years ago

Whether I run on the test set or the validation set, the following problem appears. I do not understand why the scores and the bounding boxes fed into NMS can have different lengths.

CUDA available: True
pickle/Stereo3D/output/test/imdb.pkl
Found evaluate function
clean up the recorder directory of pickle/Stereo3D/output/test/data
rebuild pickle/Stereo3D/output/test/data
  0%| | 0/7518 [00:00<?, ?it/s]
/home/lishengwen/code/visualDet3D/visualDet3D/networks/lib/PSM_cost_volume.py:82: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  cost = Variable(
PSM Cos Volume takes 0.002985239028930664 seconds at call time 1
/home/lishengwen/code/visualDet3D/visualDet3D/networks/lib/PSM_cost_volume.py:49: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  cost = Variable(
  0%| | 1/7518 [00:00<43:24, 2.89it/s]
PSM Cos Volume takes 0.0035042762756347656 seconds at call time 2
PSM Cos Volume takes 0.0029206275939941406 seconds at call time 3
Cost Volume takes 0.0025892257690429688 seconds at call time 1
  0%| | 2/7518 [00:00<29:20, 4.27it/s]
PSM Cos Volume takes 0.0034818649291992188 seconds at call time 4
PSM Cos Volume takes 0.002855062484741211 seconds at call time 5
Cost Volume takes 0.0025739669799804688 seconds at call time 2
  0%| | 3/7518 [00:00<23:13, 5.39it/s]
PSM Cos Volume takes 0.0034427642822265625 seconds at call time 6
PSM Cos Volume takes 0.0028700828552246094 seconds at call time 7
Cost Volume takes 0.002550363540649414 seconds at call time 3
  0%| | 4/7518 [00:00<21:47, 5.75it/s]
PSM Cos Volume takes 0.0047397613525390625 seconds at call time 8
PSM Cos Volume takes 0.005141258239746094 seconds at call time 9
Cost Volume takes 0.00797271728515625 seconds at call time 4
  0%| | 5/7518 [00:00<22:43, 5.51it/s]
PSM Cos Volume takes 0.003518819808959961 seconds at call time 10
PSM Cos Volume takes 0.002848386764526367 seconds at call time 11
Cost Volume takes 0.0024688243865966797 seconds at call time 5
  0%| | 6/7518 [00:01<22:04, 5.67it/s]
PSM Cos Volume takes 0.004822254180908203 seconds at call time 12
PSM Cos Volume takes 0.003919839859008789 seconds at call time 13
Cost Volume takes 0.007089376449584961 seconds at call time 6
  0%| | 7/7518 [00:01<22:03, 5.67it/s]
PSM Cos Volume takes 0.0034933090209960938 seconds at call time 14
PSM Cos Volume takes 0.0028336048126220703 seconds at call time 15
Cost Volume takes 0.0024688243865966797 seconds at call time 7
  0%|▏ | 8/7518 [00:01<20:27, 6.12it/s]
PSM Cos Volume takes 0.0042951107025146484 seconds at call time 16
PSM Cos Volume takes 0.005505800247192383 seconds at call time 17
Cost Volume takes 0.010698080062866211 seconds at call time 8
  0%|▏ | 9/7518 [00:01<21:04, 5.94it/s]
PSM Cos Volume takes 0.0034551620483398438 seconds at call time 18
PSM Cos Volume takes 0.002899646759033203 seconds at call time 19
Cost Volume takes 0.0024847984313964844 seconds at call time 9
  2%|██▉ | 178/7518 [00:26<17:56, 6.82it/s]
Traceback (most recent call last):
  File "scripts/eval.py", line 55, in <module>
    fire.Fire(main)
  File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "scripts/eval.py", line 52, in main
    evaluate_detection(cfg, detector, dataset, None, 0, result_path_split=split_to_test)
  File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
    return func(*args, **kwargs)
  File "/home/lishengwen/code/visualDet3D/visualDet3D/networks/pipelines/evaluators.py", line 85, in evaluate_kitti_obj
    test_one(cfg, index, dataset_val, model, test_func, backprojector, projector, result_path)
  File "/home/lishengwen/code/visualDet3D/visualDet3D/networks/pipelines/evaluators.py", line 111, in test_one
    scores, bbox, obj_names = test_func(collated_data, model, None, cfg=cfg)
  File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
    return func(*args, **kwargs)
  File "/home/lishengwen/code/visualDet3D/visualDet3D/networks/pipelines/testers.py", line 39, in test_stereo_detection
    scores, bbox, obj_index = module([left_images.cuda().float().contiguous(), right_images.cuda().float().contiguous(), torch.tensor(P2).cuda().float(), torch.tensor(P3).cuda().float()])
  File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lishengwen/code/visualDet3D/visualDet3D/networks/detectors/yolostereo3d_detector.py", line 103, in forward
    return self.test_forward(inputs)
  File "/home/lishengwen/code/visualDet3D/visualDet3D/networks/detectors/yolostereo3d_detector.py", line 93, in test_forward
    scores, bboxes, cls_indexes = self.bbox_head.get_bboxes(cls_preds, reg_preds, anchors, P2, left_images)
  File "/home/lishengwen/code/visualDet3D/visualDet3D/networks/heads/detection_3d_head.py", line 385, in get_bboxes
    keep_inds = nms(bboxes[:, :4], max_score, nms_iou_thr)
  File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/torchvision/ops/boxes.py", line 42, in nms
    return torch.ops.torchvision.nms(boxes, scores, iou_threshold)
RuntimeError: boxes and scores should have same number of elements in dimension 0, got 495 and 494

monsters-s commented 2 years ago

I added the following line at line 379 of detection_3d_head.py and it now runs normally: bboxes = bboxes[mask]. I do not understand what mask is for. After the fix, the results on the validation set are as follows, and they do not differ much from the results in the paper:

Car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:99.96, 99.93, 74.94
bev  AP:71.80, 54.20, 41.43
3d   AP:66.47, 50.18, 38.01
aos  AP:97.60, 97.59, 73.19
Car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:99.96, 99.93, 74.94
bev  AP:93.91, 83.89, 62.34
3d   AP:93.68, 81.27, 61.95
aos  AP:97.60, 97.59, 73.19

Pedestrian AP(Average Precision)@0.50, 0.50, 0.50:
bbox AP:86.69, 85.33, 71.21
bev  AP:25.19, 25.81, 21.82
3d   AP:24.35, 25.09, 19.95
aos  AP:85.96, 84.31, 70.28
Pedestrian AP(Average Precision)@0.50, 0.25, 0.25:
bbox AP:86.69, 85.33, 71.21
bev  AP:78.79, 72.69, 60.73
3d   AP:78.12, 72.16, 60.27
aos  AP:85.96, 84.31, 70.28
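
For later readers, here is a sketch of the pattern around the NMS call in get_bboxes() once the extra line is in place. Only bboxes, max_score, mask and nms_iou_thr appear in the traceback or in the fix; the remaining names and structure are assumptions, not the repo's exact code:

import torch
from torchvision.ops import nms

def masked_nms(bboxes, max_score, mask, nms_iou_thr=0.5):
    # Anchors without usable precomputed statistics are dropped via `mask`.
    # Before the fix only the scores were filtered, so len(max_score) could
    # fall below len(bboxes); the added line keeps both tensors aligned.
    max_score = max_score[mask]
    bboxes = bboxes[mask]                      # <-- the added line
    keep_inds = nms(bboxes[:, :4], max_score, nms_iou_thr)
    return bboxes[keep_inds], max_score[keep_inds]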

Owen-Liuyuxuan commented 2 years ago

This is actually a bug that is known internally.

During pre-computation we gather statistics for each anchor shape. Anchors of some shapes do not match any target object in the training set (for example pedestrians and the wide anchors) and therefore fail to gather reasonable statistics. We do not use these anchors during inference, and that is where the mask comes from.

In most cases / with most random seeds, these anchors are naturally suppressed and hardly produce any predictions after NMS / thresholding, so this bug "usually" does not appear.
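
A loose illustration of that masking (hypothetical numbers and names, not the repo's actual code): during pre-computation each anchor shape collects statistics from the ground-truth boxes it matches, and a shape that never matches anything is flagged as unusable, so its anchors are skipped at inference time.

import torch

# Hypothetical per-anchor-shape match counts gathered during pre-computation;
# two of the shapes never matched any ground-truth object.
matches_per_shape = torch.tensor([1520, 430, 0, 97, 0])
usable_shape = matches_per_shape > 0           # shapes with valid statistics

# At inference, expand to one flag per anchor (assuming each shape is tiled
# over the same number of locations) and drop predictions from unusable anchors.
anchors_per_shape = 3                          # hypothetical
mask = usable_shape.repeat_interleave(anchors_per_shape)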

monsters-s commented 2 years ago

Then why does this problem appear when I test directly on the KITTI dataset? The validation and test splits I use should be the same as everyone else's. If I add the line "bboxes = bboxes[mask]", can it fix the problem without affecting the results?

Owen-Liuyuxuan commented 2 years ago

Adding that line is in fact the correct way to fix this bug. What I meant in my earlier reply is that when the network is trained sufficiently well, this bug most likely will not be triggered (it varies with the random initialization and how training happens to start), but the line you added makes the bug logically impossible.

monsters-s commented 2 years ago

Thanks for the explanation, I understand now 😊