facebookresearch / votenet

Deep Hough Voting for 3D Object Detection in Point Clouds

What does basis, coeffs, orientation mean in SUNRGBDMeta3DBB_v2.mat? #133

Open · yhk1515 opened this issue 2 years ago

yhk1515 commented 2 years ago

I wonder what basis, coeffs, and orientation mean in SUNRGBDMeta3DBB_v2.mat. I currently have the centroid, length, width, height, and rotation of each 3D bounding box from a 3D labeling tool. How can I get the coeffs, orientation, and basis values? Also, do I need the basis, coeffs, and orientation information for training? Please help me.


rXYZkit commented 2 years ago

These are used in class SUNObject3d(object). If you have the centroid, length, width, height, and rotation information, you can just pass them to SUNObject3d; the orientation is used to calculate heading_angle.
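For what it's worth, here is a minimal sketch (my own, not code from this repo) of how you could build basis/coeffs/orientation from the box dimensions plus a single yaw rotation about the up axis. box_params_from_lwh_yaw is a hypothetical helper, and the axis convention (which dimension pairs with which basis vector, and the sign of the yaw) should be double-checked against the .mat annotations:

```python
import numpy as np

def box_params_from_lwh_yaw(length, width, height, yaw):
    """Hypothetical helper: SUN RGB-D style box parameters from box
    dimensions and a yaw rotation about the up (z) axis; the centroid
    you already have is used as-is."""
    # coeffs are the half-sizes of the box along its own axes
    coeffs = np.array([length, width, height]) / 2.0
    c, s = np.cos(yaw), np.sin(yaw)
    # basis: the box's three local axes expressed in world coordinates
    basis = np.array([[c, -s, 0.0],
                      [s,  c, 0.0],
                      [0.0, 0.0, 1.0]])
    # orientation: unit vector of the heading axis in the xy-plane
    orientation = np.array([c, s, 0.0])
    return basis, coeffs, orientation

# If I read sunrgbd/sunrgbd_utils.py correctly, SUNObject3d recovers the
# heading from the orientation vector as:
#   heading_angle = -1 * np.arctan2(orientation[1], orientation[0])
```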

yhk1515 commented 2 years ago

Thanks so much for letting me know. I also know that the camera intrinsic parameters are in the calib file used for training, but I don't know the meaning of the rest of the values. If you know, can you tell me? I would also like to know whether these values have a large impact on training.

rXYZkit commented 2 years ago


With pleasure. There are two lines in the calib file. As shown in class SUNRGBD_Calibration(object), the first line is Rtilt = np.array([float(x) for x in lines[0].split(' ')]), and I think this is used to make sure the objects are in the upright depth coordinate system: pts_3d_upright_depth = np.transpose(np.dot(self.Rtilt, np.transpose(pts_3d_depth)))
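To make that concrete, here is a small sketch of reading the calib file the way SUNRGBD_Calibration does (two whitespace-separated, column-major-flattened 3x3 matrices: Rtilt on the first line, the intrinsics K on the second, hence order='F' if I read the code correctly) and applying Rtilt to a point cloud. The file path and the random points are placeholders:

```python
import numpy as np

# Placeholder path; the real files live under the extracted SUN RGB-D calib dir.
lines = [line.rstrip() for line in open('calib/000001.txt')]

# Line 0: Rtilt (tilt rotation), line 1: K (camera intrinsics),
# both flattened column-major, hence order='F'.
Rtilt = np.reshape(np.array([float(x) for x in lines[0].split(' ')]), (3, 3), order='F')
K = np.reshape(np.array([float(x) for x in lines[1].split(' ')]), (3, 3), order='F')

pts_3d_depth = np.random.rand(100, 3)  # placeholder (N, 3) point cloud
# Rotate depth-camera coordinates into "upright depth" coordinates,
# i.e. align the z-axis with gravity:
pts_3d_upright_depth = np.transpose(np.dot(Rtilt, np.transpose(pts_3d_depth)))
```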

I think these values come from the SUN RGB-D paper, Section 2.5 (Ground truth annotation): "For each RGB-D image, we obtain LabelMe-style 2D polygon annotations, 3D bounding box annotations for objects, and 3D polygon annotations for room layouts.

For 3D annotation, the point clouds are first rotated to align with the gravity direction using an automatic algorithm. We estimate the normal direction for each 3D point with the 25 closest 3D points. Then we accumulate a histogram on a 3D half-sphere and pick the maximal count from it to obtain the first axis. For the second axis, we pick the maximal count from the directions orthogonal to the first axis. In this way, we obtain the rotation matrix to rotate the point cloud to align with the gravity direction. We manually adjust the rotation when the algorithm fails." But I'm not sure.