waymo-research / waymo-open-dataset

Waymo Open Dataset
https://www.waymo.com/open

The projected_lidar_labels may contain redundant 3d object labels that are occluded by other objects #72

Closed: xmyqsh closed this issue 4 years ago

xmyqsh commented 4 years ago
```proto
// Lidar labels (laser_labels) projected to camera images. A projected
// label is the smallest image axis aligned rectangle that can cover all
// projected points from the 3d lidar label. The projected label is ignored if
// the projection is fully outside a camera image. The projected label is
// clamped to the camera image if it is partially outside.
repeated CameraLabels projected_lidar_labels = 9;
```

I think the projected lidar label generation described above is not enough. Imagine this situation: a 3d lidar object label that is constrained and clipped to the camera frustum may still be occluded by other objects in the image, because of the translation between the lidar and the camera.
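For reference, here is a rough sketch of how those projected labels can be read with the Python API (assuming the waymo_open_dataset package and TensorFlow are installed; the tfrecord path is a placeholder):

```python
# Rough sketch, not an official snippet; '/path/to/segment.tfrecord' is a placeholder.
import tensorflow as tf
from waymo_open_dataset import dataset_pb2 as open_dataset

dataset = tf.data.TFRecordDataset('/path/to/segment.tfrecord')
for data in dataset:
    frame = open_dataset.Frame()
    frame.ParseFromString(bytearray(data.numpy()))
    for camera_labels in frame.projected_lidar_labels:
        camera = open_dataset.CameraName.Name.Name(camera_labels.name)
        for label in camera_labels.labels:
            # Each projected label is an axis-aligned 2d box in image pixels.
            box = label.box
            print(camera, label.id, box.center_x, box.center_y, box.length, box.width)
    break  # only inspect the first frame
```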

peisun1115 commented 4 years ago

I don't understand the question. The projection does not take care of image occlusion. If you want to use tightly fitting image boxes, try the image label.

xmyqsh commented 4 years ago

@peisun1115 What I mean is that, in the camera view, the original lidar 3d objects may stack on top of each other (a rare situation). It would be best to illustrate this with an image, but I have not found a sufficiently dense example. We could use the 2d image labels to filter out the occluded objects ourselves, or the Waymo team could apply this filtering step when updating projected_lidar_labels.
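Something along these lines could work as a starting point (a rough sketch of the filtering idea, not official tooling; the IoU threshold is an arbitrary assumption):

```python
# Drop a projected lidar label when it has no sufficiently overlapping
# hand-annotated 2d camera label in the same image. Threshold is arbitrary.
def iou(box_a, box_b):
    """Axis-aligned IoU for boxes given as (center_x, center_y, length, width)."""
    ax0, ay0 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax1, ay1 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx0, by0 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx1, by1 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def filter_occluded(projected_labels, camera_labels, iou_threshold=0.3):
    """Keep only projected labels that overlap some 2d camera label."""
    kept = []
    for p in projected_labels:
        pb = (p.box.center_x, p.box.center_y, p.box.length, p.box.width)
        for c in camera_labels:
            cb = (c.box.center_x, c.box.center_y, c.box.length, c.box.width)
            if iou(pb, cb) >= iou_threshold:
                kept.append(p)
                break
    return kept
```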

yeyewen commented 4 years ago

> I don't understand the question. The projection does not take care of image occlusion. If you want to use tightly fitting image boxes, try the image label.

@peisun1115 Hi, the projected_lidar_labels are always bigger than the image labels. In my understanding, the projected_lidar_labels should be tightly fitting image boxes in the ideal case. If not, what causes the difference?

peisun1115 commented 4 years ago

@yeyewen

projected_lidar_labels is generated by projecting the 8 corners of the 3d labels, which is why it is not tight in 2d. A tighter box could be generated by projecting all surface points of the given object to the 2d image, but we don't have those surface points, since part of the object surface is occluded.
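Conceptually the projection looks something like the sketch below (a simplified illustration that assumes a plain pinhole model with a 3x3 intrinsic matrix and a 3x4 vehicle-to-camera extrinsic with z pointing forward, and ignores rolling shutter; the actual code that produced the released labels may differ):

```python
import numpy as np

def box_corners_3d(cx, cy, cz, length, width, height, heading):
    """Return the 8 corners (3 x 8) of an upright 3d box in vehicle coordinates."""
    x = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * length / 2.0
    y = np.array([1, -1, -1, 1, 1, -1, -1, 1]) * width / 2.0
    z = np.array([-1, -1, -1, -1, 1, 1, 1, 1]) * height / 2.0
    c, s = np.cos(heading), np.sin(heading)
    rotation = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return rotation @ np.vstack([x, y, z]) + np.array([[cx], [cy], [cz]])

def project_to_axis_aligned_box(corners, vehicle_to_camera, K):
    """Project the 8 corners and take the smallest axis-aligned 2d rectangle."""
    homogeneous = np.vstack([corners, np.ones((1, corners.shape[1]))])  # 4 x 8
    points_cam = vehicle_to_camera @ homogeneous                        # 3 x 8
    uv = K @ points_cam
    uv = uv[:2] / uv[2]  # perspective divide (assumes all corners are in front)
    x_min, y_min = uv.min(axis=1)
    x_max, y_max = uv.max(axis=1)
    # The result should then be clamped to the image bounds, as the proto says.
    return x_min, y_min, x_max, y_max
```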

xmyqsh commented 4 years ago

@peisun1115

That's it! There are things the lidar can see that the camera cannot, and vice versa. The 2d annotations are necessary.

But if we ignore objects that are too close, as Disentangling Monocular 3D Object Detection does, this problem would be alleviated a lot.

Many occlusions are caused by very nearby objects.
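For example, a hedged sketch of such a near-range filter (the 5 m threshold is an illustrative assumption, not a value from the paper or the dataset tooling):

```python
import math

def drop_near_labels(laser_labels, min_range_m=5.0):
    """Keep only 3d labels whose box center is at least min_range_m from the sensor origin."""
    kept = []
    for label in laser_labels:
        b = label.box
        dist = math.sqrt(b.center_x ** 2 + b.center_y ** 2 + b.center_z ** 2)
        if dist >= min_range_m:
            kept.append(label)
    return kept
```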