google-research-datasets / Objectron

Objectron is a dataset of short, object-centric video clips. The videos also contain AR session metadata, including camera poses, sparse point clouds, and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box that describes the object's position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes.

2D bounding boxes #46

Open finnweiler opened 3 years ago

finnweiler commented 3 years ago

Is there a way to extract accurate 2D bounding box data from the 3D bounding boxes?

jianingwei commented 3 years ago

You can fit a 2D bounding box to the projections of the 3D bounding box vertices.
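A minimal sketch of that suggestion, assuming the eight box vertices are already expressed in the camera frame and that a simple pinhole model with +z pointing forward applies. The intrinsics `K`, the cube in the example, and the helper names `project_points` and `fit_2d_box` are hypothetical and not part of the Objectron code; the dataset's own camera convention may differ.

```python
import numpy as np


def project_points(points_cam, K):
    """Project Nx3 camera-frame points to Nx2 pixel coordinates (pinhole model)."""
    points_cam = np.asarray(points_cam, dtype=float)
    # Each row becomes [fx*x + cx*z, fy*y + cy*z, z].
    uvw = points_cam @ np.asarray(K, dtype=float).T
    # Perspective divide by depth.
    return uvw[:, :2] / uvw[:, 2:3]


def fit_2d_box(points_2d):
    """Axis-aligned box (x_min, y_min, x_max, y_max) enclosing Nx2 points."""
    points_2d = np.asarray(points_2d, dtype=float)
    x_min, y_min = points_2d.min(axis=0)
    x_max, y_max = points_2d.max(axis=0)
    return float(x_min), float(y_min), float(x_max), float(y_max)


# Hypothetical intrinsics and a unit cube centred 4 units in front of the camera.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
corners = np.array([[x, y, z]
                    for x in (-0.5, 0.5)
                    for y in (-0.5, 0.5)
                    for z in (3.5, 4.5)])
box = fit_2d_box(project_points(corners, K))
print(box)
```

Because the box encloses the projected cuboid rather than the object itself, it is generally looser than a box drawn around the object's visible outline.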

finnweiler commented 3 years ago

You are correct: I can extract a rough 2D bounding box by drawing a box around the 3D box projection (marked in red). My actual goal is an accurate 2D bounding box (marked in green).

screenshot

jianingwei commented 2 years ago

I'm unaware of an easy way to find the tight green bounding box from the 3D bounding box. Alternatively, you can find a rotated bounding box that fits the object tightly.
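One way to fit a rotated box, sketched here under assumptions: given a set of 2D points, the minimum-area enclosing rectangle has one side parallel to an edge of their convex hull, so for a handful of points it is enough to try the direction of every point pair. The thread does not say which points to use; projected box vertices give a rotated box around the projection, while a box that is tight to the object would need object pixels such as a segmentation mask, which is assumed to be available. The helper name `min_area_rotated_box` is hypothetical.

```python
import numpy as np


def min_area_rotated_box(points_2d):
    """Return (corners, area) of the smallest rotated rectangle around Nx2 points.

    Brute force over all point pairs, so intended for small point sets.
    Needs at least two distinct points; otherwise corners is None.
    """
    pts = np.asarray(points_2d, dtype=float)
    best_area, best_corners = np.inf, None
    n = len(pts)
    for i in range(n):
        for j in range(i + 1, n):
            edge = pts[j] - pts[i]
            length = np.linalg.norm(edge)
            if length < 1e-12:
                continue  # duplicate points define no direction
            u = edge / length              # candidate side direction
            v = np.array([-u[1], u[0]])    # perpendicular direction
            a, b = pts @ u, pts @ v        # coordinates in the rotated frame
            area = (a.max() - a.min()) * (b.max() - b.min())
            if area < best_area:
                best_area = area
                best_corners = np.array([
                    a.min() * u + b.min() * v,
                    a.max() * u + b.min() * v,
                    a.max() * u + b.max() * v,
                    a.min() * u + b.max() * v,
                ])
    return best_corners, float(best_area)


# Hypothetical example: a square rotated by 45 degrees.
diamond = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
rot_corners, rot_area = min_area_rotated_box(diamond)
print(rot_area)
```

For this diamond, the axis-aligned box has area 4, while the rotated rectangle matches the square itself, which is the kind of gain a rotated box can give for objects that sit diagonally in the image.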