OpenDriveLab / OpenLane-V2

[NeurIPS 2023 Track Datasets and Benchmarks] OpenLane-V2: The First Perception and Reasoning Benchmark for Road Driving
https://proceedings.neurips.cc/paper_files/paper/2023/hash/3c0a4c8c236144f1b99b7e1531debe9c-Abstract-Datasets_and_Benchmarks.html
Apache License 2.0

Misleading definition of camera extrinsic parameters #103

Closed Wolfybox closed 6 months ago

Wolfybox commented 6 months ago

Hi, OpenLane-V2 is a great dataset, and I have recently been running some experiments on it. However, during data processing and visualization, I noticed that the definition of the camera extrinsic parameters could be quite misleading.

In OpenLane-V2, the extrinsic parameters, which consist of a rotation matrix R and a translation vector t, actually represent the transformation from the camera frame to the ego frame, i.e., cam2ego. However, according to the general definition of the extrinsic matrix, and also the convention of the nuScenes dataset, R and t should describe the transformation from the ego frame to the camera frame, i.e., ego2cam.
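To make the two conventions concrete, here is a minimal NumPy sketch of how one is obtained from the other. It assumes the extrinsic is a rigid transform of the form x_ego = R @ x_cam + t; the function name `invert_rigid` and the numeric values are made up for illustration and are not part of the OpenLane-V2 devkit.

```python
import numpy as np

def invert_rigid(R, t):
    """Invert the rigid transform x_dst = R @ x_src + t.

    Returns (R_inv, t_inv) such that x_src = R_inv @ x_dst + t_inv.
    """
    R = np.asarray(R, dtype=float)
    t = np.asarray(t, dtype=float).reshape(3)
    R_inv = R.T            # the inverse of a rotation matrix is its transpose
    t_inv = -R_inv @ t
    return R_inv, t_inv

# Hypothetical cam2ego extrinsic (illustrative values only).
R_cam2ego = np.array([[0.0, -1.0, 0.0],
                      [1.0,  0.0, 0.0],
                      [0.0,  0.0, 1.0]])
t_cam2ego = np.array([1.5, 0.0, 1.6])

# The ego2cam form is the inverse of the cam2ego form.
R_ego2cam, t_ego2cam = invert_rigid(R_cam2ego, t_cam2ego)

# Round trip: camera frame -> ego frame -> camera frame.
p_cam = np.array([0.2, -0.1, 5.0])
p_ego = R_cam2ego @ p_cam + t_cam2ego
p_back = R_ego2cam @ p_ego + t_ego2cam
```

If a pipeline expects ego2cam but is handed cam2ego (or the other way round), this inversion is the only conversion needed.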

Since OpenLane-V2 is built upon Argoverse2, I wonder whether similar issues appear in av2 as well.

I found this issue when I was trying to project centerlines onto the front-view image using my own code and just could not get it right. After examining the visualization script you provide (openlanev2.centerline.visualization.pv.py), I noticed the difference in the definition of the extrinsic parameters.
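For anyone hitting the same problem, the projection can be sketched as follows when the extrinsic is given as cam2ego. This is a simplified pinhole model that ignores lens distortion; `project_ego_points`, the intrinsic matrix K, and the mounting values are hypothetical and are not taken from the dataset or from the devkit's visualization script.

```python
import numpy as np

def project_ego_points(points_ego, R_cam2ego, t_cam2ego, K):
    """Project 3D points in the ego frame to pixels, given a cam2ego extrinsic.

    Returns (pixels, in_front): pixels has shape (M, 2) for the M points in
    front of the camera; in_front is a boolean mask over all input points.
    """
    points_ego = np.asarray(points_ego, dtype=float).reshape(-1, 3)
    R = np.asarray(R_cam2ego, dtype=float)
    t = np.asarray(t_cam2ego, dtype=float).reshape(3)
    K = np.asarray(K, dtype=float)

    # ego -> camera: x_cam = R^T (x_ego - t), written here for row vectors.
    points_cam = (points_ego - t) @ R

    # Keep only points with positive depth along the camera's z axis.
    in_front = points_cam[:, 2] > 1e-6
    uvw = points_cam[in_front] @ K.T
    pixels = uvw[:, :2] / uvw[:, 2:3]
    return pixels, in_front

# Hypothetical front camera: camera z points along ego x (forward),
# camera x along ego -y (right), camera y along ego -z (down).
R_cam2ego = np.array([[ 0.0,  0.0, 1.0],
                      [-1.0,  0.0, 0.0],
                      [ 0.0, -1.0, 0.0]])
t_cam2ego = np.array([1.5, 0.0, 1.5])
K = np.array([[1000.0,    0.0, 800.0],
              [   0.0, 1000.0, 450.0],
              [   0.0,    0.0,   1.0]])

points_ego = np.array([[11.5, 0.0, 1.5],   # 10 m straight ahead of the camera
                       [11.5, 1.0, 1.5],   # same depth, 1 m to the left
                       [-5.0, 0.0, 1.5]])  # behind the camera
pixels, in_front = project_ego_points(points_ego, R_cam2ego, t_cam2ego, K)
```

Applying R and t directly to ego-frame points, as one would with an ego2cam extrinsic, is the mistake that puts the centerlines in the wrong place.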

For reference:
[1] Definition of extrinsic params: https://en.wikipedia.org/wiki/Camera_resectioning. [2] nuScenes dataset.

sephyli commented 6 months ago

Thank you for the valuable feedback. We use sensor2ego as our extrinsic parameters because a large part of the autonomous driving (AD) community is familiar with and uses the sensor2ego transformation.

  1. We inherited the definition directly from Argoverse 2: "the sensor’s pose in the egovehicle coordinate system", which means the transformation is sensor2ego.

  2. Based on the code in mmdet3d and nuScenes, "All extrinsic parameters are given with respect to the ego vehicle body frame", which also means they use the sensor2ego transformation. Given that, I would not say ego2cam is the nuScenes convention.

  3. Also, in the Waymo Open Dataset: "extrinsic: Transformation from camera frame to vehicle frame".

Thanks for pointing out the definition of camera extrinsics on Wikipedia. However, we would not say our definition is misleading 😄, and we plan to keep it as it is, following common practice in the AD community. We will add a specific description to our documentation for clarity.

Wolfybox commented 6 months ago

Thanks for the explanation. Agreed. A specific description of the extrinsics would be very helpful. 😄