open-mmlab / mmdetection3d

OpenMMLab's next-generation platform for general 3D object detection.
https://mmdetection3d.readthedocs.io/en/latest/
Apache License 2.0

[Bug] Motion compensation error during multi-frame loading (only happens in some scenes) #2164

Open fengjiang5 opened 1 year ago

fengjiang5 commented 1 year ago

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmdetection3d

Environment

sys.platform: linux
Python: 3.7.15 (default, Nov 24 2022, 21:12:53) [GCC 11.2.0]
CUDA available: True
GPU 0,1,2,3,4,5,6,7: NVIDIA Tesla V100-SXM2-32GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.1, V11.1.105
GCC: gcc (GCC) 7.5.0
PyTorch: 1.11.0
PyTorch compiling details: PyTorch built with:

TorchVision: 0.12.0
OpenCV: 4.6.0
MMCV: 1.6.2
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.3
MMDetection: 2.26.0
MMSegmentation: 0.29.1
MMDetection3D: 1.0.0rc6+
spconv2.0: False

Reproduces the problem - code sample

I use the official code without any changes.

Reproduces the problem - command or script

You can simply run this script:

python tools/misc/browse_dataset.py configs/_base_/datasets/nus-3d.py --task det --aug --output-dir ${OUTPUT_DIR} --online

Wrong scenes

'./data/nuscenes/samples/LIDAR_TOP/n008-2018-05-21-11-06-59-0400__LIDAR_TOP__1526915300898622.pcd.bin' is a key sample that shows the bug. In the images below, the car looks correct, but the points around the ego car look very strange, and this causes errors when projecting to range images.

[Images: bugs2, bugs3]

Correct scenes

'./data/nuscenes/samples/LIDAR_TOP/n008-2018-05-21-11-06-59-0400__LIDAR_TOP__1526915243547836.pcd.bin' is correct. Here the key sample matches the fused sweeps well.

[Image: bug1]

Reproduces the problem - error message

None

Additional information

  1. I think anyone who uses LoadPointsFromMultiSweeps in the training pipeline with the nuScenes dataset should apply `PointsRangeFilter` before fusing the sweeps.
  2. I want to project the points into range images, so errors in the points around the ego car cause serious problems.
  3. Thanks for mmdet3d. I hope range-based methods will be supported.
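
To make items 1 and 2 concrete, here is a minimal NumPy sketch of dropping near-ego points and then projecting the rest into a range image. It is not mmdet3d code: `remove_close`, `to_range_image`, the 32 x 1024 image size, the 1 m radius, and the +10/-30 degree vertical field of view are all assumptions chosen for illustration. When several points fall into the same pixel, the nearest one is kept, which is where points from fused sweeps can occlude key-frame points.

```python
import numpy as np


def remove_close(points, radius=1.0):
    """Drop points whose |x| and |y| are both below `radius` (hypothetical helper)."""
    close = (np.abs(points[:, 0]) < radius) & (np.abs(points[:, 1]) < radius)
    return points[~close]


def to_range_image(points, height=32, width=1024,
                   fov_up_deg=10.0, fov_down_deg=-30.0):
    """Spherical projection of (N, >=3) points; empty pixels are set to -1."""
    xyz = points[:, :3].astype(np.float64)
    depth = np.linalg.norm(xyz, axis=1)
    xyz, depth = xyz[depth > 0], depth[depth > 0]

    yaw = np.arctan2(xyz[:, 1], xyz[:, 0])    # azimuth in [-pi, pi]
    pitch = np.arcsin(xyz[:, 2] / depth)      # elevation angle
    fov_up, fov_down = np.deg2rad(fov_up_deg), np.deg2rad(fov_down_deg)

    # Column from azimuth (wraps around), row from elevation (top row = fov_up).
    u = np.floor(0.5 * (1.0 - yaw / np.pi) * width).astype(np.int64) % width
    v = np.floor((fov_up - pitch) / (fov_up - fov_down) * height).astype(np.int64)

    # Discard points outside the assumed vertical field of view.
    inside = (v >= 0) & (v < height)
    u, v, depth = u[inside], v[inside], depth[inside].astype(np.float32)

    image = np.full((height, width), np.inf, dtype=np.float32)
    np.minimum.at(image, (v, u), depth)       # keep the nearest return per pixel
    image[np.isinf(image)] = -1.0
    return image


# Toy input: one near-ego point and two points along the same ray.
toy_points = np.array([[0.5, 0.2, 0.0],
                       [10.0, 0.0, -1.0],
                       [5.0, 0.0, -0.5]])
range_image = to_range_image(remove_close(toy_points))
print((range_image >= 0).sum(), "occupied pixel(s)")
```

In the toy input, the two remaining points lie on the same ray, so they land in one pixel and only the nearer one survives. The same collision happens between key-frame points and sweep points after fusion.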

Tai-Wang commented 1 year ago

Could you please provide more information about the problematic case, such as the ego and global transformation matrices? And how about the frequency or ratio of these problematic scenes in the entire dataset? I think it's likely caused by some numerical problems or some corner cases.

fengjiang5 commented 1 year ago

> Could you please provide more information about the problematic case, such as the ego and global transformation matrices? And how about the frequency or ratio of these problematic scenes in the entire dataset? I think it's likely caused by some numerical problems or some corner cases.

Hello, thanks for your answer. However, in my opinion, this is not caused by numerical problems or corner cases.

  1. I tried different versions of mmdet3d, and I asked friends to visualize the results after multi-sweep fusion. We all got the same results.

  2. I think this is caused by moving objects. The illustration below shows how moving objects lead to this situation. [Image: figure]

  3. If we look closely at static objects in the distance (trees or traffic signs), we can see that they are aligned correctly.

So I think the multi-frame fusion itself is correct. However, I want to implement 3D object detection based on range images, and the problem above becomes serious during projection because fusing multiple frames introduces occlusions. Do you have any advice, and will mmdet3d support range-based detection?

Tai-Wang commented 1 year ago

Points on moving objects will always leave a trail, because the alignment across frames can only compensate for ego motion. This trailing may actually help the model predict object motion. I think the problem, including both the trailing caused by object motion and the resulting occlusion, is similar for BEV and range-view-based methods, and it is also why aggregating multiple frames works in LiDAR-based methods. Range-based methods are in the future plans for MMDet3D, but we have not found many promising open-source works so far. We may support this line of methods once we have enough resources and references (such as FCOS-LiDAR).
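
The point about ego-motion compensation can be checked with a small NumPy sketch. It is not the mmdet3d loader: `pose`, `sweep_to_key`, and the toy poses and coordinates are assumptions made up for illustration. A sweep is mapped into the key frame through the global frame using only the two ego poses, so a static point lands where the key frame observes it, while a point on a moving object lands where the object used to be.

```python
import numpy as np


def pose(yaw, translation):
    """4x4 ego-to-global transform from a yaw angle and a translation (hypothetical helper)."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T


def sweep_to_key(points_xyz, ego_to_global_sweep, ego_to_global_key):
    """Map sweep-frame points into the key frame: sweep ego -> global -> key ego."""
    transform = np.linalg.inv(ego_to_global_key) @ ego_to_global_sweep
    homogeneous = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])
    return (homogeneous @ transform.T)[:, :3]


# Toy scene: the ego car drives 1 m forward between the sweep and the key frame.
T_sweep = pose(0.0, [0.0, 0.0, 0.0])
T_key = pose(0.0, [1.0, 0.0, 0.0])

# Sweep-frame observations: a static sign at 10 m and a moving car at 20 m.
sweep_points = np.array([[10.0, 0.0, 0.0],
                         [20.0, 0.0, 0.0]])
# Key-frame observations: the sign did not move; the car drove 2 m forward.
key_points = np.array([[9.0, 0.0, 0.0],
                       [21.0, 0.0, 0.0]])

aligned = sweep_to_key(sweep_points, T_sweep, T_key)
residual = np.linalg.norm(aligned - key_points, axis=1)
print("static residual (m):", round(float(residual[0]), 3))
print("moving residual (m):", round(float(residual[1]), 3))
```

In this toy setup the static point aligns exactly, while the moving car's sweep points end up 2 m behind its key-frame position. That gap is the trail described above.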