google-research-datasets / Objectron

Objectron is a dataset of short, object-centric video clips. The videos also contain AR session metadata, including camera poses, sparse point clouds, and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box that describes its position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes.

Error in annotation for cereal box #13

Closed DeriZSY closed 3 years ago

DeriZSY commented 3 years ago

Hi, I used the code provided in the repo to generate images with point-cloud and annotation overlays.

The results look quite convincing for the bike category, but the overlay seems to fail on a cereal_box sequence: batch-11-17 (in fact, it fails for at least the first two 'cereal_box' sequences).


DeriZSY commented 3 years ago

I think I've found the reason for this. The ARCore setup assumes the phone is held upright, that is, in portrait orientation, so the height of the image must be greater than its width (as in the bike sequence).

However, this is not guaranteed when the video is read with OpenCV. In that case, you need to manually rotate the image by 90 degrees so that its height is the longer side.
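
For anyone hitting the same problem, here is a minimal sketch of the rotation workaround. The helper name `ensure_portrait` and the default clockwise direction are my own assumptions, not something from the repo; which direction is correct for a given clip may need to be checked visually against the overlay. Frames returned by OpenCV are NumPy arrays, so the sketch uses `np.rot90`; `cv2.rotate(frame, cv2.ROTATE_90_CLOCKWISE)` should do the same job.

```python
import numpy as np


def ensure_portrait(frame: np.ndarray, clockwise: bool = True) -> np.ndarray:
    """Return the frame with its height as the longer side.

    Hypothetical helper: the rotation direction is an assumption and
    may need to be flipped depending on how the clip was recorded.
    """
    height, width = frame.shape[:2]
    if height >= width:
        # Already portrait (or square): leave the frame untouched.
        return frame
    # np.rot90 rotates counter-clockwise for positive k, so k=-1 is clockwise.
    k = -1 if clockwise else 1
    # Copy into contiguous memory so OpenCV drawing calls accept the result.
    return np.ascontiguousarray(np.rot90(frame, k=k))
```

Usage would look roughly like `ok, frame = cap.read()` followed by `frame = ensure_portrait(frame)` before projecting the box and point cloud onto the image.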