Issue: Interpreting the attributes of 3D Bounding box

Hi @yinyunie, thanks for your research and your contribution to the spatial understanding domain. I have a question about how the output of the source code corresponds to the output described in the paper. The details are summarized below, and I would appreciate a response.

As far as I can see, the output of the 3D object detection network is saved as bdb_3d.mat, which appears to be a dictionary with the following keys for each object instance detected by the 2D object detection network:

1. 'basis'
2. 'coeffs'
3. 'centroid'
4. 'classid'

The basis seems to be the rotation matrix of the bounding box (R, 3×3), from which we can get the Euler angles in the closed interval [-pi, pi]. What do coeffs and centroid in the .mat file signify?
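To make my reading of basis concrete, here is a minimal sketch of how I would extract angles from a 3×3 rotation matrix. It assumes a ZYX (yaw, pitch, roll) convention with R = Rz(yaw) · Ry(pitch) · Rx(roll); that convention, the helper names, and whether the box axes are stored as rows or columns of basis are my assumptions, not something I could confirm from the code or the paper.

```python
import math

def rotation_z(yaw):
    """Rotation by `yaw` radians about the z-axis (used only to build a test input)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def euler_zyx_from_matrix(R):
    """Return (yaw, pitch, roll) in radians, ASSUMING R = Rz(yaw) @ Ry(pitch) @ Rx(roll).

    yaw and roll come from atan2 and lie in [-pi, pi]; pitch comes from asin
    and lies in [-pi/2, pi/2]. Gimbal lock (|pitch| = pi/2) is not handled.
    """
    # Clamp to [-1, 1] to guard against small numerical drift before asin.
    pitch = -math.asin(max(-1.0, min(1.0, R[2][0])))
    yaw = math.atan2(R[1][0], R[0][0])
    roll = math.atan2(R[2][1], R[2][2])
    return yaw, pitch, roll

# Hypothetical stand-in for one detection's 'basis' entry.
basis = rotation_z(0.5)
yaw, pitch, roll = euler_zyx_from_matrix(basis)
```

If basis actually stores the axes the other way round, the matrix would need to be transposed before calling the helper, which is part of what I would like to confirm.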

Please refer to the cropped Section 3.1 of the paper attached below, which says that any 3D bounding box in the world coordinate system is defined by C, s, and theta.

Which of the aforementioned keys in bdb_3d.mat correspond to C (the 3D center) and s (the spatial size)?

Thanks, looking forward to your response.

3DObjectDetection

GAP-LAB-CUHK-SZ / Total3DUnderstanding

Issue: Interpreting the attributes of 3D Bounding box #20