[Bug] Motion compensation error during multi-frame loading (only happens in some scenes)

fengjiang5 commented 1 year ago

Prerequisite

[X] I have searched Issues and Discussions but cannot get the expected help.
[X] I have read the FAQ documentation but cannot get the expected help.
[X] The bug has not been fixed in the latest version (dev) or latest version (1.x).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmdetection3d

Environment

sys.platform: linux
Python: 3.7.15 (default, Nov 24 2022, 21:12:53) [GCC 11.2.0]
CUDA available: True
GPU 0,1,2,3,4,5,6,7: NVIDIA Tesla V100-SXM2-32GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.1, V11.1.105
GCC: gcc (GCC) 7.5.0
PyTorch: 1.11.0
PyTorch compiling details: PyTorch built with:

GCC 7.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.3
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
CuDNN 8.2
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.12.0
OpenCV: 4.6.0
MMCV: 1.6.2
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.3
MMDetection: 2.26.0
MMSegmentation: 0.29.1
MMDetection3D: 1.0.0rc6+
spconv2.0: False

Reproduces the problem - code sample

I use the official code without any changes.

Reproduces the problem - command or script

You can simply run this script:

python tools/misc/browse_dataset.py configs/_base_/datasets/nus-3d.py --task det --aug --output-dir ${OUTPUT_DIR} --online

Wrong scenes

./data/nuscenes/samples/LIDAR_TOP/n008-2018-05-21-11-06-59-0400__LIDAR_TOP__1526915300898622.pcd.bin is a key sample that shows the bug. In the images below, the other car looks correct, but the points around the ego car look very strange, and this will cause errors when projecting the points to range images.

[Images: bugs2, bugs3]

Correct scenes

./data/nuscenes/samples/LIDAR_TOP/n008-2018-05-21-11-06-59-0400__LIDAR_TOP__1526915243547836.pcd.bin is correct. We can see that this key sample matches its sweeps well.

[Image: bug1]

Reproduces the problem - error message

None

Additional information

I think anyone who uses `LoadPointsFromMultiSweeps` in the train pipeline with the nuScenes dataset should apply `PointsRangeFilter` before fusing the sweeps.
I want to project the points into range images, so any problem with the points around the ego car causes serious errors.
Thanks for mmdet3d. I hope range-based methods will be supported.
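To make the suggestion above concrete, here is a minimal NumPy sketch of dropping returns close to the sensor in each sweep before the sweeps are concatenated. The function name `remove_close_points` and the 1 m radius are assumptions for illustration, not mmdet3d's implementation; the loader may already expose a similar option, so check the `LoadPointsFromMultiSweeps` docstring for your version.

```python
import numpy as np


def remove_close_points(points, radius=1.0):
    """Drop points whose |x| and |y| are both below `radius`.

    `points` is an (N, C) array in the sensor frame with x, y in the
    first two columns. The radius is an assumed value, not a tuned one.
    """
    points = np.asarray(points, dtype=np.float32)
    near = (np.abs(points[:, 0]) < radius) & (np.abs(points[:, 1]) < radius)
    return points[~near]


# Toy sweep: the first two points sit on or near the ego vehicle.
sweep = np.array(
    [
        [0.2, -0.3, -1.5],
        [0.9, 0.5, -1.0],
        [5.0, 0.1, 0.2],
        [0.1, 12.0, 0.4],
    ],
    dtype=np.float32,
)
filtered = remove_close_points(sweep, radius=1.0)
```

Applying the filter per sweep, in that sweep's own sensor frame, matters here: after ego-motion compensation the ego-vehicle returns from older sweeps no longer sit at the origin of the key frame, so a filter applied after fusion would miss them.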

Tai-Wang commented 1 year ago

Could you please provide more information about the problematic case, such as the ego and global transformation matrices? And how about the frequency or ratio of these problematic scenes in the entire dataset? I think it's likely caused by some numerical problems or some corner cases.

fengjiang5 commented 1 year ago

> Could you please provide more information about the problematic case, such as the ego and global transformation matrices? And how about the frequency or ratio of these problematic scenes in the entire dataset? I think it's likely caused by some numerical problems or some corner cases.

Hello, thanks for your answer. In my opinion, however, this is not caused by numerical problems or corner cases.

I tried different versions of mmdet3d, and I asked friends to visualize the results after multi-sweep fusion. We all got the same results.
I think this is caused by moving objects. The following illustration shows that moving objects lead to this situation.
If we look closely at the static objects in the distance (trees or traffic signs), we can see that they are aligned correctly.

So I think the multi-frame fusion itself is correct. However, I want to implement 3D object detection based on range images. The problem described above becomes serious during projection because fusing multiple frames together causes occlusions. Do you have any advice, or will mmdet3d support range-based detection?
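The occlusion problem can be seen in a minimal spherical-projection sketch. Each range-image pixel can hold only one return, so when fused sweeps put several points on the same ray, only one survives (here the nearest). The function name `to_range_image`, the 32 x 1024 resolution, and the +10 / -30 degree vertical field of view are assumptions for illustration and do not come from mmdet3d.

```python
import numpy as np


def to_range_image(points, height=32, width=1024,
                   fov_up_deg=10.0, fov_down_deg=-30.0):
    """Project (N, >=3) points to a range image, keeping the nearest
    return per pixel. Empty pixels are set to +inf."""
    xyz = np.asarray(points, dtype=np.float64)[:, :3]
    rng = np.linalg.norm(xyz, axis=1)
    valid = rng > 1e-6  # skip points at the sensor origin
    xyz, rng = xyz[valid], rng[valid]

    yaw = np.arctan2(xyz[:, 1], xyz[:, 0])
    pitch = np.arcsin(xyz[:, 2] / rng)
    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)

    # Column from azimuth, row from elevation. Points outside the
    # vertical field of view are clipped to the border rows.
    col = (0.5 * (1.0 - yaw / np.pi) * width).astype(np.int64) % width
    row = (fov_up - pitch) / (fov_up - fov_down) * height
    row = np.clip(np.floor(row), 0, height - 1).astype(np.int64)

    image = np.full((height, width), np.inf)
    np.minimum.at(image, (row, col), rng)  # nearest point wins
    return image


# Two points on the same ray: only the closer one is kept.
same_ray = np.array([[5.0, 0.0, 0.0], [20.0, 0.0, 0.0]])
image = to_range_image(same_ray)
```

With fused sweeps, a trailing point from a moving object or a stale ego-vehicle return can win the pixel and hide the true key-frame return behind it, which is the effect described above.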

Tai-Wang commented 1 year ago

Points of moving objects will always show trailing, because the alignment across frames can only compensate for the ego-motion. This trailing may actually help the model predict object motion. I think the problem, including both the trailing caused by object motion and the occlusion, should be similar for BEV-based and range-view-based methods, and it is also why aggregating multiple frames works in LiDAR-based methods. Range-based methods are in the future plan of MMDet3D, but we have not found many promising open-source works so far, and we may support this stream of methods when we have enough resources and references (such as FCOS-LiDAR).
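A small numeric sketch of this point: transforming a sweep into the key frame with the ego poses aligns static points exactly, while a moving object keeps a residual equal to its own displacement. The helpers `pose`, `to_frame`, and `sweep_to_key` are hypothetical and use planar poses only; the real nuScenes pipeline also chains the lidar-to-ego calibration, which is omitted here.

```python
import numpy as np


def pose(x, y, yaw):
    """4x4 ego-to-global transform for a planar pose (hypothetical helper)."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    T[:2, 3] = [x, y]
    return T


def _homogeneous(points):
    return np.hstack([points, np.ones((len(points), 1))])


def to_frame(points_global, ego_to_global):
    """Express global points in the given ego frame."""
    T = np.linalg.inv(ego_to_global)
    return (T @ _homogeneous(points_global).T).T[:, :3]


def sweep_to_key(points_sweep, sweep_pose, key_pose):
    """Ego-motion compensation: sweep frame -> global -> key frame."""
    T = np.linalg.inv(key_pose) @ sweep_pose
    return (T @ _homogeneous(points_sweep).T).T[:, :3]


# The ego vehicle moves 1 m forward between the sweep and the key frame.
sweep_pose, key_pose = pose(0.0, 0.0, 0.0), pose(1.0, 0.0, 0.0)

tree_global = np.array([[10.0, 2.0, 0.0]])   # static object
car_at_sweep = np.array([[10.0, -2.0, 0.0]])  # moving object, sweep time
car_at_key = np.array([[12.0, -2.0, 0.0]])    # same object, 2 m further

tree_residual = (
    sweep_to_key(to_frame(tree_global, sweep_pose), sweep_pose, key_pose)
    - to_frame(tree_global, key_pose)
)
car_residual = (
    sweep_to_key(to_frame(car_at_sweep, sweep_pose), sweep_pose, key_pose)
    - to_frame(car_at_key, key_pose)
)
```

In this toy setup `tree_residual` is zero and `car_residual` is 2 m along x, which is the trail: compensation removes the ego displacement but knows nothing about the car's own motion.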
