google-research-datasets / Objectron

Objectron is a dataset of short, object-centric video clips. The videos also contain AR session metadata, including camera poses, sparse point clouds, and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box, which describes the object’s position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes.

Why are the annotated keypoints sometimes very small or huge? #78

Open futakw opened 1 year ago

futakw commented 1 year ago

https://github.com/google-research-datasets/Objectron/blob/c06a65165a18396e1e00091981fd1652875c97b5/objectron/dataset/graphics.py#L34-L40

As described here, I think "keypoints.point_2d.x" and "keypoints.point_2d.y" should be in the range 0 to 1 when the point is inside the image. However, I observe that these values are sometimes extremely small or huge.

For example, "~/bike/batch-11/5/annotation.pbdata" has

keypoints {
  id: 2
  point_3d {
    x: 1.2235139608383179
    y: 1.1318135261535645
    z: 0.07509636878967285
  }
  point_2d {
    x: -15.717081069946289
    y: -12.666918754577637
    depth: -0.07509636878967285
  }
}

with extremely small (large negative) x and y.

ahmadyan commented 1 year ago

You can discard the point when the depth is negative (i.e. the point is behind the camera).
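A minimal sketch of that filter, assuming the keypoints have already been read out of the annotation into plain Python objects. `Keypoint2D` and `visible_keypoints` are hypothetical names used for illustration, not part of the Objectron code; only the `x`, `y`, and `depth` fields mirror `keypoints.point_2d` in the example above. Treating a depth of exactly zero as invalid is also an assumption. A perspective projection divides by depth, so a point very close to the camera plane, like the one quoted above with depth of about -0.075, can plausibly end up with coordinates far outside the 0 to 1 range.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Keypoint2D:
    """Hypothetical stand-in for the x, y, depth fields of keypoints.point_2d."""
    x: float
    y: float
    depth: float


def visible_keypoints(points: List[Keypoint2D],
                      require_in_image: bool = False) -> List[Keypoint2D]:
    """Keep only keypoints in front of the camera (positive depth).

    If require_in_image is True, also drop points whose normalized
    x or y falls outside the [0, 1] image range.
    """
    kept = []
    for p in points:
        # Negative depth means the point is behind the camera; zero depth is
        # treated as degenerate too (an assumption, not stated in the thread).
        if p.depth <= 0:
            continue
        if require_in_image and not (0.0 <= p.x <= 1.0 and 0.0 <= p.y <= 1.0):
            continue
        kept.append(p)
    return kept


points = [
    # Values copied from the keypoint quoted in the question.
    Keypoint2D(x=-15.717081069946289, y=-12.666918754577637,
               depth=-0.07509636878967285),
    # Made-up point in front of the camera, for illustration only.
    Keypoint2D(x=0.5, y=0.5, depth=1.0),
]
print(visible_keypoints(points))
```

With these two points, only the made-up one is kept, because the keypoint from the question has negative depth.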