argoverse / argoverse-api

Official GitHub repository for Argoverse dataset
https://www.argoverse.org

Tracking sample for ring_front_center #279

Open zeina-abuaisheh opened 2 years ago

zeina-abuaisheh commented 2 years ago

Hello,

I am very interested in Argoverse. I started with ring_front_center in the tracking sample zip file, but I don't know where its annotations are. For instance, I opened "ring_front_center_315978406032859416.jpg" and searched for an annotation file with the same name, but I only found "city_SE3_egovehicle_315978406032859416", which contains the translation and rotation of the vehicle. So my naive question is: where can I find the corresponding annotations for this frame (i.e., the x, y, and z of the objects)?

Thanks

johnwlambert commented 2 years ago

Hi, thanks for your interest in our work. The labels are found inside a folder for each log named per_sweep_annotations_amodal. The dataloaders load from this folder (see the SimpleArgoverseTrackingDataloader). The code that parses the JSON files containing annotations is here.

Please take a look at the tutorials where we show how to use the API to load the labels: the Argoverse 3D Tracking Tutorial and the Cuboid Visualization Script.

The coordinate system convention for labels is described here in the README.
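To make the layout concrete, here is a minimal sketch of parsing one record of the kind found in a per_sweep_annotations_amodal JSON file with only the standard library. The field names and values below are assumptions based on the Argoverse annotation schema, not verbatim dataset contents; for real work, use the API's dataloaders as described above.

```python
import json

# Hypothetical sample mirroring one record from a
# per_sweep_annotations_amodal/tracked_object_labels_<timestamp>.json file.
sample = """
[
  {
    "center": {"x": 12.3, "y": -4.5, "z": 0.8},
    "rotation": {"x": 0.0, "y": 0.0, "z": 0.0, "w": 1.0},
    "length": 4.5, "width": 1.8, "height": 1.6,
    "label_class": "VEHICLE",
    "track_label_uuid": "abc-123"
  }
]
"""

def load_annotations(text: str):
    """Parse annotation records into (class, center, size) tuples."""
    records = []
    for obj in json.loads(text):
        c = obj["center"]
        records.append((
            obj["label_class"],
            (c["x"], c["y"], c["z"]),          # box center, egovehicle frame
            (obj["length"], obj["width"], obj["height"]),
        ))
    return records

for cls, center, size in load_annotations(sample):
    print(cls, center, size)
```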

zeina-abuaisheh commented 2 years ago

Thanks so much John, this is very helpful for me.

I have one more question: where can I find the distance between other vehicles and our ego-vehicle?

Thanks so much

johnwlambert commented 2 years ago

Please take a look at the following section of our README: https://github.com/argoai/argoverse-api#a-note-regarding-coordinate-transforms

If an object annotation includes the pose of the annotated object in the egovehicle frame, e.g. an egovehicle_SE3_object transformation, then this SE(3) object represents (R, t), where l2norm(t) is the distance to that object from the center of the rear axle of the AV.

https://github.com/argoai/argoverse-api/issues/277 discusses how to find calibration info.
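The l2norm(t) computation above can be sketched in a few lines; the coordinate values here are made up for illustration, and the function name is hypothetical:

```python
import math

# Distance from the egovehicle origin (center of the rear axle) to an
# annotated object: the L2 norm of the object's translation t = (x, y, z)
# taken from the egovehicle_SE3_object transformation.
def distance_to_object(x: float, y: float, z: float) -> float:
    return math.sqrt(x * x + y * y + z * z)

print(distance_to_object(3.0, 4.0, 0.0))  # 5.0
```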

zeina-abuaisheh commented 2 years ago

Thanks so much for your detailed and quick answers. I checked the ObjectLabelRecord class; its translation is the center of the box, given as x, y, z. Is the vertical distance of the object from the center of the rear axle of the AV l2norm(z), and the horizontal distance l2norm(x)?

For instance, in the KITTI annotations, we have both the vertical distance (z) and the horizontal distance (x).

johnwlambert commented 2 years ago

Hi @zeina-abuaisheh, please take a look at Figure 3 of our arXiv paper, which describes the coordinate system for the egovehicle (lower left-hand corner of the figure below):

[Screenshot: Figure 3, egovehicle coordinate system]

If we use a flat-world assumption, then the horizontal distance would lie in the xy plane, i.e. sqrt(x^2 + y^2), and the vertical distance w.r.t. the center of the rear axle would just be abs(z). That flat-world assumption is often violated, but I think those are the numbers you're looking for.
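As a quick sketch of the flat-world split described above (the function name and sample coordinates are made up for illustration):

```python
import math

# Flat-world sketch: split an object's egovehicle-frame center (x, y, z)
# into horizontal range (distance in the xy plane) and vertical offset.
def horizontal_and_vertical(x: float, y: float, z: float):
    return math.hypot(x, y), abs(z)

h, v = horizontal_and_vertical(6.0, 8.0, -0.5)
print(h, v)  # 10.0 0.5
```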

zeina-abuaisheh commented 2 years ago

Thanks so much for everything; now it's very clear to me.

My last question is about the camera parameters (which I found in "vehicle_calibration_info"): focal_length_xpx, focal_length_ypx, focal_center_xpx, focal_center_ypx, distortion_coefficients, and vehicle_SE3_camera. Where can I find the camera height, roll, pitch, and yaw?

johnwlambert commented 2 years ago

So the SE(3) transformation vehicle_SE3_camera describes the pose of any particular camera in the egovehicle coordinate frame. In other words, it parameterizes (R,t) such that t represents the coordinates of the camera in egovehicle coordinate system. Thus, it directly encodes the height of the camera in z, since z points upwards.

As for roll, pitch, and yaw: we do not use an Euler angle parameterization for R, to avoid gimbal lock. Instead, the rotation of the (R, t) pair mentioned above for vehicle_SE3_camera is parameterized by a quaternion in our dataset. You can convert it to roll, pitch, and yaw with your favorite library, e.g.:

from scipy.spatial.transform import Rotation

# vehicle_SE3_camera.rotation is a 3x3 rotation matrix; 'zyx' order yields
# yaw (rz), pitch (ry), roll (rx) in degrees.
rz, ry, rx = Rotation.from_matrix(vehicle_SE3_camera.rotation).as_euler('zyx', degrees=True)

I highly recommend you use the API directly to work with the calibration, to avoid making any mistakes. Please use the Calibration class here to work with the camera calibrations.

Here's an example to try out: https://github.com/argoai/argoverse-api/blob/master/demo_usage/cuboids_to_bboxes.py#L192
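If scipy isn't available, a unit quaternion (w, x, y, z) such as the one stored for vehicle_SE3_camera can be converted by hand. This is a generic ZYX (yaw-pitch-roll) conversion sketch, not Argoverse API code, and the sample quaternion below is made up:

```python
import math

def quat_to_ypr(w: float, x: float, y: float, z: float):
    """Convert a unit quaternion to (yaw, pitch, roll) in degrees, ZYX order."""
    yaw = math.atan2(2 * (w * z + x * y), 1 - 2 * (y * y + z * z))
    pitch = math.asin(max(-1.0, min(1.0, 2 * (w * y - z * x))))
    roll = math.atan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y))
    return tuple(math.degrees(a) for a in (yaw, pitch, roll))

# A camera rotated 90 degrees about +z (pure yaw):
s = math.sin(math.pi / 4)
print(quat_to_ypr(math.cos(math.pi / 4), 0.0, 0.0, s))  # ~ (90.0, 0.0, 0.0)
```

The camera height itself needs no conversion: as noted above, it is simply the z component of the translation t in vehicle_SE3_camera.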

zeina-abuaisheh commented 2 years ago

That makes sense, thanks a lot John, you helped me a lot with all of my questions. I am very grateful for your help.