google-research-datasets / Objectron

Objectron is a dataset of short, object-centric video clips. The videos also contain AR session metadata, including camera poses, sparse point clouds, and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box that describes its position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes.

Faulty annotations in 2D #65

Open weders opened 2 years ago

weders commented 2 years ago

Thanks a lot for releasing this great dataset!

While parsing the 2D projections of the 3D bounding boxes using the provided notebook, I realized that the annotations are not correct for some objects and poses.

E.g., for scene bottle/batch-6/7 and frame id 273, the projection is quite far off (see the attached image), while for other frames in the same scene the bounding box is perfectly fine.

Do you have any idea whether this is caused by the projection or by the annotation of the 3D bounding box? Thanks!

I obtained these results by running the notebook linked above for scene bottle/batch-6/7 and frame_id = 273.

[attached image: the projected 3D bounding box for bottle/batch-6/7, frame 273, visibly offset from the object]
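For anyone who wants to reproduce this check outside the notebook, here is a minimal sketch of the projection step. It assumes the per-frame view and projection matrices from the AR session metadata follow the usual OpenGL clip-space convention; the helper name and the way the matrices are pulled out of the annotations are my own and may need adjusting.

```python
import numpy as np

def project_box_to_image(points_3d_world, view_matrix, projection_matrix, width, height):
    """Project Nx3 world-space box keypoints to pixel coordinates.

    view_matrix / projection_matrix are the 4x4 per-frame matrices from the
    AR session metadata (OpenGL-style clip space is assumed here).
    """
    points = np.concatenate([points_3d_world, np.ones((len(points_3d_world), 1))], axis=1)
    # World -> camera -> clip space.
    clip = (projection_matrix @ view_matrix @ points.T).T
    # Perspective divide into normalized device coordinates.
    ndc = clip[:, :3] / clip[:, 3:4]
    # NDC [-1, 1] -> pixel coordinates; y is flipped in image space.
    u = (ndc[:, 0] * 0.5 + 0.5) * width
    v = (1.0 - (ndc[:, 1] * 0.5 + 0.5)) * height
    return np.stack([u, v], axis=1)
```

Comparing these projected corners against the stored 2D keypoints frame by frame should make it clearer whether the 3D box or the camera pose of a specific frame is the outlier.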

ahmadyan commented 2 years ago

If I had to guess, a loop closure event in the SLAM system likely caused the camera poses to jump.

We annotate in 3D and rely on the camera poses to project the boxes onto the 2D images. The camera poses come from first-party (1P) online SLAM systems (e.g. ARKit on iPhone). These SLAM systems typically perform loop closure, re-optimizing past camera poses to reduce drift. The side effect is a sudden jump in the camera-pose trajectory, which introduces artifacts like this into the dataset.

When filtering the output data, we tried to remove videos with this kind of drift, but a few of them made it past QA.
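One crude way to catch affected frames downstream is to flag large frame-to-frame jumps in the camera trajectory before trusting the 2D projections. This is only a heuristic sketch, not part of the official tooling; the threshold and the way camera centers are extracted from the per-frame 4x4 transforms are assumptions.

```python
import numpy as np

def flag_pose_jumps(camera_positions, max_step_m=0.05):
    """Return indices of frames whose camera center moves more than max_step_m
    from the previous frame.

    camera_positions: (N, 3) array of camera centers, e.g. the translation
    part of each frame's 4x4 camera transform. The 5 cm default is a guess
    and should be tuned to the capture's frame rate and motion.
    """
    deltas = np.linalg.norm(np.diff(camera_positions, axis=0), axis=1)
    return np.where(deltas > max_step_m)[0] + 1
```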