Open jamesheatonrdm opened 1 year ago
I have the same confusion. The alpha of the KITTI dataset is not the Theta L in the paper, and in the code (torch_lib/Dataset.py) alpha = Alpha = line[3]. I found another implementation, https://github.com/shashwat14/Multibin/blob/master/data_prep.py, which calculates Theta L in a different way.
This code calculates Theta differently from the paper; it uses the KITTI relation "r_y = alpha + theta".
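To make the relation "r_y = alpha + theta" concrete, here is a minimal sketch that reads the relevant fields from one KITTI label line and recovers r_y from alpha. It assumes the standard KITTI object label layout (alpha at index 3, location at indices 11 to 13, rotation_y at index 14) and assumes theta is the ray angle atan2(x, z) in camera coordinates. The function names and the sample line are made up for illustration and are not taken from either repository.

```python
import math


def parse_kitti_object(line):
    """Split one KITTI object label line into named fields (hypothetical helper)."""
    f = line.split()
    return {
        "type": f[0],
        "alpha": float(f[3]),                        # observation angle
        "bbox": [float(v) for v in f[4:8]],          # left, top, right, bottom
        "dimensions": [float(v) for v in f[8:11]],   # height, width, length
        "location": [float(v) for v in f[11:14]],    # x, y, z in camera frame
        "rotation_y": float(f[14]),                  # yaw around camera Y axis
    }


def rotation_y_from_alpha(alpha, x, z):
    """r_y = alpha + theta, with theta assumed to be atan2(x, z); wrapped to [-pi, pi)."""
    ry = alpha + math.atan2(x, z)
    return (ry + math.pi) % (2.0 * math.pi) - math.pi


# Synthetic label line (not real KITTI data), values rounded to two decimals.
sample = "Car 0.00 0 -1.40 100.0 120.0 200.0 220.0 1.50 1.60 3.90 2.00 1.50 10.00 -1.20"
obj = parse_kitti_object(sample)
x, _, z = obj["location"]
ry_est = rotation_y_from_alpha(obj["alpha"], x, z)
```

With these synthetic values, `ry_est` agrees with the stored `rotation_y` up to the rounding of the label fields.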
Hey, do you know that the loss can go below 0? Is that normal and acceptable?
Hi,
I am attempting to use this method to train on my own dataset, which I have generated in Unity using the Unity Perception Package, so this requires quite a few modifications to the Dataset class. Unity will generate the ground truth and provide me with the following:
- X, Y, Z position of the 3D bounding box center wrt. the camera
- Object dimensions
- Object rotation wrt. the global coordinate frame
- 2D bounding box coordinates within the image
- Camera intrinsic matrix
In the corresponding paper, the three angles of interest are Theta Ray, Theta L, and Theta. I believe I understand what these are and the correspondence between them:
- Theta Ray is the ray angle of the object center (calculated as the angle between the camera principal point and the 3D bounding box center).
- Theta L is the local orientation, i.e. the orientation of the object wrt. the camera.
- Theta is the global orientation of the object.

Theta = Theta Ray + Theta L
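As a rough illustration of Theta = Theta Ray + Theta L, the sketch below converts between global and local orientation given the 3D box center in camera coordinates. It assumes the ray angle is measured from the optical axis as atan2(x, z), with x to the right and z forward, and that all angles are in radians. The function names are hypothetical, and since Unity uses a left-handed, Y-up coordinate system, the sign of the exported rotation may need flipping before applying this.

```python
import math


def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def ray_angle(x, z):
    """Theta Ray: angle of the ray to the box center (assumed convention: atan2(x, z))."""
    return math.atan2(x, z)


def local_from_global(theta, x, z):
    """Theta L = Theta - Theta Ray."""
    return wrap_angle(theta - ray_angle(x, z))


def global_from_local(theta_l, x, z):
    """Theta = Theta Ray + Theta L."""
    return wrap_angle(theta_l + ray_angle(x, z))


# Example: an object 3 m to the right and 4 m ahead, with global yaw 2.0 rad.
theta_l = local_from_global(2.0, 3.0, 4.0)
theta_back = global_from_local(theta_l, 3.0, 4.0)
```

For an object on the optical axis (x = 0) the ray angle is zero, so local and global orientation coincide; the round trip above recovers the original global yaw.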
However, looking in the Dataset class, there are references to three different angles: Alpha, Ry and theta_ray. As far as I understand it, Alpha is equivalent to Theta L (as this is what you are regressing), Ry is equivalent to Theta (global orientation), and theta_ray is self-explanatory.
As far as I am aware, theta_ray is calculated from the position of the 2D bounding box within the image, the model predicts Alpha, and the correspondence between the two gives the global orientation of the object.
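One way to obtain a ray angle from the 2D box alone is sketched below. It assumes an undistorted pinhole camera with focal length fx and principal point cx (both in pixels, read from the intrinsic matrix), and it approximates the projected object center by the horizontal center of the 2D box. This is an illustrative sketch, not necessarily how the repository's Dataset class computes theta_ray; the function name is hypothetical.

```python
import math


def ray_angle_from_box(x_min, x_max, fx, cx):
    """Angle between the optical axis and the ray through the 2D box center.

    Positive values mean the box center lies to the right of the principal point.
    """
    u = 0.5 * (x_min + x_max)      # horizontal box center in pixels
    return math.atan2(u - cx, fx)  # pinhole model: tan(angle) = (u - cx) / fx


# Example intrinsics (made-up values for illustration).
fx, cx = 700.0, 640.0

# A point at (x, z) = (2, 10) in camera coordinates projects to u = fx * x / z + cx.
u_proj = fx * 2.0 / 10.0 + cx
angle = ray_angle_from_box(u_proj - 20.0, u_proj + 20.0, fx, cx)
```

Under these assumptions the result matches atan2(x, z) computed from the 3D center, so the angle from the 2D box can stand in for Theta Ray at inference time, when the 3D position is not known.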
I would just like to confirm that all this is correct, as I have been having a hard time understanding this. Your feedback is greatly appreciated :)