open-mmlab / mmdetection3d

OpenMMLab's next-generation platform for general 3D object detection.
https://mmdetection3d.readthedocs.io/en/latest/
Apache License 2.0

KITTI doesn't account for tilt angle? Strange limitation? #1510

Open Steven-m2ai opened 2 years ago

Steven-m2ai commented 2 years ago

Hello, I am working with a case in which the camera is mounted above a box-like object at a tilted angle (see image below). The blue box represents the object in camera coordinates (the camera is angled slightly downwards). The red box is what mmdetection3d builds when I supply the center point and the height/width/length information in KITTI format (the center point is denoted by the blue point). I notice that KITTI does not take the extrinsics into account, since we supply the points in camera coordinates directly. Is there a way to leverage the tilt information in the provided code? It seems like a strange limitation to have. EDIT: I have confirmed this is a limitation of the KITTI format. Could I be directed towards, or given suggestions for, a different labeling format that takes roll, pitch, and yaw into account? Or, perhaps even more interesting: how could the current code (FCOS3D training) be edited to account for this?

For reference, I am working on monocular detection with FCOS3D using MMDetection3D.
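To make the limitation concrete: a KITTI label stores only (x, y, z, h, w, l, ry), i.e. a single rotation about the camera's Y axis, so the tilt introduced by a downward-pitched camera has nowhere to go. A minimal numpy sketch of why a yaw-only rotation cannot absorb that tilt (illustrative only, not mmdetection3d code):

```python
import numpy as np

def rot_y(ry):
    """Rotation about the camera Y axis -- the only rotation a KITTI label stores."""
    c, s = np.cos(ry), np.sin(ry)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def rot_x(pitch):
    """Rotation about the camera X axis -- the tilt of a downward-angled camera."""
    c, s = np.cos(pitch), np.sin(pitch)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])

# True orientation of a box seen by a camera pitched 15 degrees down.
true_rotation = rot_x(np.radians(15)) @ rot_y(0.5)

# The best a KITTI label can do is the yaw term alone; whatever rotation
# remains after removing that yaw is exactly the tilt the label discards.
residual = true_rotation @ rot_y(0.5).T
tilt_lost = not np.allclose(residual, np.eye(3))
```

Here `residual` works out to the pure pitch rotation, which is why the rendered red box ends up axis-aligned with the camera instead of tilted with the object.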

VVsssssk commented 2 years ago

@ZCMax Hi. Please reply to this issue.

Steven-m2ai commented 2 years ago

I am currently looking at modifying nuScenes to fit this problem. From my understanding, nuScenes uses quaternion rotations and so, I would imagine, can account for this pitch and roll limitation. Please correct me if I am wrong. If there is any way to edit the KITTI code, that would still be very interesting to see.
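That understanding matches how the formats differ: a unit quaternion carries the full 3-DoF orientation (roll, pitch, yaw), while a KITTI-style label keeps a single yaw scalar. A sketch of the difference using SciPy (the axis convention here is a stand-in, not the actual nuScenes convention):

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Illustrative full orientation: yaw, pitch, roll in radians.
full = Rotation.from_euler("ZYX", [0.8, 0.15, 0.05])

# A quaternion preserves the complete orientation...
quat = full.as_quat()  # (x, y, z, w)

# ...whereas a KITTI-style 7-DoF label keeps only the yaw scalar.
yaw_only = Rotation.from_euler("Z", 0.8)

# The rotation the 7-DoF label throws away -- nonzero whenever
# pitch or roll are nonzero.
residual = full * yaw_only.inv()
lost_angle = residual.magnitude()
```

So the information survives in the raw nuScenes annotations; the question is whether it survives the conversion into training labels.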

ZCMax commented 2 years ago

I think you can try nuScenes first for a quick check. If you make any progress or run into problems, feel free to leave comments here.

Steven-m2ai commented 2 years ago

Hello, I am currently trying to make a custom dataset in the NuScenes format. However, when I run create_data.py, it looks like the coord_3d values contain the correct 3D coordinates in camera space; but when the json file is built, this information is reduced to 7 degrees of freedom, accounting only for a yaw rotation. I wonder if the limitation is therefore in the label creation of mmdetection3d itself.
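If the info-creation step really does collapse the annotation to a single yaw scalar, the downstream effect on a visualized box looks like this (a sketch with hypothetical angle values, not the actual create_data.py logic):

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Illustrative full annotation orientation: yaw, pitch, roll in radians.
full = Rotation.from_euler("ZYX", [1.0, 0.12, 0.05])

# The 7-DoF label keeps only the yaw angle, so the box is rebuilt
# downstream from a yaw-only rotation.
yaw = full.as_euler("ZYX")[0]
reduced = Rotation.from_euler("Z", yaw)

# Effect on a box corner one metre from the center: the drawn corner
# drifts away from the true corner by the dropped tilt (~12 cm here).
corner = np.array([1.0, 0.0, 0.0])
drift = np.linalg.norm(full.apply(corner) - reduced.apply(corner))
```

That per-corner drift is exactly the kind of misalignment a visualizer would show on a tilted object.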

Steven-m2ai commented 2 years ago

As a concrete example, here is a dummy case of what I am talking about. My question is: does anyone know whether MMDetection3D monocular detection (in either KITTI or NuScenes format) can support this kind of case? Currently, with only yaw angle information, browse_dataset.py returns this bounding box, which is clearly not correct. Maybe @ZCMax could help me out here? Do you think there might be some way to pass in the plane information so that create_data.py can use it via the --with-plane argument? Plane information should account for this odd pitch rotation I need, correct?

salvaba94 commented 1 year ago

Hi all, I'm experiencing a problem similar to @Steven-m2ai's. I need all 9 degrees of freedom flowing into the model. In the end, did you reach a conclusion on this topic?

Steven-m2ai commented 1 year ago

KITTI and nuScenes, from my understanding, do not support this. This is because in autonomous driving applications objects carry only a yaw angle, as the camera is mounted parallel to the ground plane.

salvaba94 commented 1 year ago

Thanks for the answer, @Steven-m2ai. Does this mean that the only way to achieve this is to implement a new dataset?
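For anyone going the custom-dataset route, the core change is the box parameterization: 3 center + 3 size + 3 rotation values instead of the 7-DoF (center, size, yaw) used by the KITTI/nuScenes pipelines. A minimal sketch of such a 9-DoF box (all names hypothetical, not part of the mmdetection3d API):

```python
import numpy as np
from dataclasses import dataclass
from scipy.spatial.transform import Rotation

@dataclass
class Box9DoF:
    """Hypothetical 9-DoF box: 3 center + 3 size + 3 rotation angles."""
    center: np.ndarray  # (x, y, z)
    size: np.ndarray    # (l, w, h)
    euler: np.ndarray   # (yaw, pitch, roll), radians

    def corners(self) -> np.ndarray:
        """Return the 8 box corners, rotated by the full orientation."""
        l, w, h = self.size
        # Axis-aligned corner offsets around the origin.
        x = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * l / 2
        y = np.array([1, -1, -1, 1, 1, -1, -1, 1]) * w / 2
        z = np.array([1, 1, 1, 1, -1, -1, -1, -1]) * h / 2
        pts = np.stack([x, y, z])
        R = Rotation.from_euler("ZYX", self.euler).as_matrix()
        return (R @ pts).T + self.center

# A box with a pitch of 0.2 rad -- something a 7-DoF label cannot express.
box = Box9DoF(center=np.zeros(3),
              size=np.array([4.0, 2.0, 1.5]),
              euler=np.array([0.0, 0.2, 0.0]))
pts = box.corners()
```

The model head and loss would then need to regress (or classify) the two extra angles as well, which is the substantial part of the work beyond the dataset class itself.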